Unable to reproduce llava evaluation #19

Closed
jacob-hansen opened this issue Mar 20, 2024 · 7 comments
Labels: bug (Something isn't working), documentation (Improvements or additions to documentation)

Comments

jacob-hansen commented Mar 20, 2024

To Reproduce (on a linux machine with 1xA100 80GB):

  1. First, I create a new Python 3.10 environment with Anaconda:

conda create --name lmms python=3.10
conda activate lmms

  2. Install lmms-eval from source (from inside the lmms-eval repo):

pip install --no-deps -U -e .

  3. Install the llava requirements:

pip install -r llava_repr_requirements.txt

This fails due to a conflict in the transformers requirement:

  ERROR: Cannot install -r llava_repr_requirements.txt (line 1) and transformers>=4.36.2 because these package versions have conflicting dependencies.
  
  The conflict is caused by:
      The user requested transformers>=4.36.2
      llava 1.1.3 depends on transformers==4.31.0

  To fix this you could try to:
  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

  ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

  4. I then tried manually installing transformers (>=4.36.2) and retried the install after removing it from llava_repr_requirements.txt:

pip install transformers
pip install -r llava_repr_requirements.txt

  5. Run the main command:

accelerate launch --num_processes=1 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" --tasks mme --batch_size 1 --log_samples --log_samples_sufix reproduce --output_path ./logs/

This produced the error:
__main__.py: error: unrecognized arguments: --log_samples_sufix reproduce

I removed the unrecognized argument (--log_samples_sufix) and reran:

accelerate launch --num_processes=1 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" --tasks mme --batch_size 1 --log_samples --output_path ./logs/

This raised a ValueError:
ValueError: Attempted to load model 'llava', but no model for this name found! Supported model names: qwen_vl, gpt4V, instructblip, minicpm_v

I've tried this from many fresh installations, and I'm up to date with the main branch of lmms-eval (I cloned it a few hours ago). What should I do?
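
When a model name is missing from the supported list, one thing worth ruling out is that the backing package is simply not importable in the active environment. The sketch below only checks importability with the standard library. Whether lmms-eval actually drops models whose imports fail is an assumption here, not something confirmed in this thread, and the list of package names is illustrative.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the top-level module (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


# Package names are assumptions for illustration; adjust to your setup.
for name in ("llava", "transformers", "lmms_eval"):
    status = "found" if is_importable(name) else "MISSING"
    print(f"{name:14s}{status}")
```

If `llava` is reported as missing, installing the LLaVA repo into the same environment would be the first thing to try.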

Author

jacob-hansen commented Mar 20, 2024

Interesting: I observe different behavior for --log_samples_suffix reproduce (does not work) and --log_samples_suffix llava_v1.5_mme_mmbenchen (which I got to work).

To get it working, I also had to make three changes to lmms-eval > lmms_eval > models > llava.py:

  1. Comment out "image_sizes=gen_kwargs["image_sizes"]," from generation args
  2. Skip padding tokens (-200)
    cont = cont.masked_fill(cont == -200, 1)
    result = self.tokenizer.batch_decode(cont, skip_special_tokens=True)
    text_outputs = [i.replace("", "").replace("", "").strip() for i in result]
  3. Hard-code device and device_map
    self._device = torch.device("cuda:0")
    self.device_map = "cuda:0"
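
For readers without the code at hand, here is a pure-Python analogue of the `cont.masked_fill(cont == -200, 1)` step from item 2. It is only an illustration on plain lists, not the tensor code from llava.py. The function name and the sample batch are made up, and the likely motivation (a negative id has no vocabulary entry, so decoding it fails) is an assumption.

```python
SENTINEL_ID = -200   # id replaced by the workaround above
REPLACEMENT_ID = 1   # value passed to masked_fill in the workaround


def replace_sentinel(batch, sentinel=SENTINEL_ID, replacement=REPLACEMENT_ID):
    """Return a copy of a batch of token-id lists with the sentinel swapped out."""
    return [[replacement if tok == sentinel else tok for tok in seq] for seq in batch]


batch = [[5, -200, 9], [-200, -200, 3]]  # hypothetical generated ids
cleaned = replace_sentinel(batch)
print(cleaned)  # [[5, 1, 9], [1, 1, 3]]
```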

Would anyone be able to clarify the difference between reproduce and llava_v1.5_mme_mmbenchen? What benchmarks are supported by llava_v1.5_mme_mmbenchen?

Additionally, has anyone ever run into the errors I mentioned? Wondering why I was unable to run the code as is.

@justlovebarbecue

I ran into almost the same series of errors as @jacob-hansen. I hope there can be a clean version for llava soon.

Collaborator

kcz358 commented Mar 21, 2024

Hi @jacob-hansen, @justlovebarbecue, thank you for pointing out the issue.

It seems the current llava_requirement.txt does have some issues. Here is my way to get a correct environment under the current version.


Solution

First cd into lmms-eval

pip install --no-deps -U -e .

Then cd into the LLaVA repo from https://github.com/haotian-liu/LLaVA?tab=readme-ov-file and do the same thing

pip install --no-deps -U -e .

This will build llava and lmms_eval without installing any dependency.

Then, instead of using the current llava_requirement.txt, please use this:

accelerate==0.21.0
datasets==2.16.1
evaluate==0.4.1
hf_transfer==0.1.6
Jinja2==3.1.3
numpy==1.26.4
openai==1.13.3
packaging==23.2
pandas==2.2.1
Pillow==10.2.0
protobuf==4.25.3
pycocoevalcap==1.2
pycocotools==2.0.7
pytablewriter==1.2.0
pytest==8.0.2
python_Levenshtein==0.25.0
pytz==2024.1
PyYAML==6.0.1
Requests==2.31.0
sacrebleu==2.4.0
scikit_learn==1.2.2
sentencepiece==0.1.99
setuptools==68.2.2
sglang==0.1.12
shortuuid==1.0.12
sqlitedict==2.1.0
tenacity==8.2.3
torch==2.0.1
tokenizers==0.15.2
tqdm==4.66.2
transformers==4.37.2

Save this as llava_requirement.txt and run:

pip install -r llava_requirement.txt

This requirement file installs the correct environment. Note that the file was generated by pipreqs, so it may contain some redundant packages, but it does work.
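
Since a generated requirements file may carry redundant entries, a quick check for packages pinned more than once can be useful before installing. The following is a minimal standard-library sketch. The function name and the sample text are illustrative, and it only understands simple `name==version` lines.

```python
from collections import defaultdict


def find_repeated_pins(requirements_text: str) -> dict:
    """Map each package name pinned more than once to its list of versions."""
    pins = defaultdict(list)
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()].append(version.strip())
    return {name: versions for name, versions in pins.items() if len(versions) > 1}


sample = """\
pytz==2024.1
PyYAML==6.0.1
PyYAML==6.0.1
Requests==2.31.0
"""
print(find_repeated_pins(sample))  # {'pyyaml': ['6.0.1', '6.0.1']}
```

To check a real file, pass in its contents, for example `find_repeated_pins(open("llava_requirement.txt").read())`.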

Then you can run

accelerate launch --num_processes=8 --main_process_port 12345 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b,use_flash_attention_2=False" --tasks mme --batch_size 1 --log_samples --log_samples_suffix llava_v1.5_mme --output_path ./logs/;

Make sure use_flash_attention_2 is set to False in model_args.
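
To make the shape of the `--model_args` value concrete, here is a rough sketch of how a comma-separated `key=value` string could be turned into keyword arguments. This is not lmms-eval's own parser, and the function name is hypothetical. It mainly shows why `use_flash_attention_2=False` has to be converted to a real boolean: the plain string "False" would otherwise be truthy.

```python
def parse_model_args(arg_string: str) -> dict:
    """Split 'k1=v1,k2=v2' into a dict, converting 'True'/'False' to booleans."""
    parsed = {}
    for pair in arg_string.split(","):
        if not pair.strip():
            continue
        key, _, value = pair.partition("=")
        value = value.strip()
        if value in ("True", "False"):
            parsed[key.strip()] = value == "True"
        else:
            parsed[key.strip()] = value
    return parsed


args = parse_model_args("pretrained=liuhaotian/llava-v1.5-7b,use_flash_attention_2=False")
print(args)
```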

Results

image

Additional Note

If you want to use flash attention, you can install it with:

pip install flash-attn --no-build-isolation --no-cache-dir

Note, however, that with flash-attn 2 the results differ slightly. I got:
"mme_cognition_score": 355.7142857142857,
"mme_percetion_score": 1509.9907963185276,

kcz358 added the bug and documentation labels Mar 21, 2024
Collaborator

kcz358 commented Mar 21, 2024

Also, @jacob-hansen, there is a typo in the log_samples_suffix flag in our example script: it should be log_samples_suffix instead of log_samples_sufix.
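
The effect of such a one-letter typo can be reproduced with a toy `argparse` parser. The parser below is a stand-in that defines only two flags; it is not lmms-eval's real argument parser. It shows that the misspelled flag and its value are reported as unrecognized instead of being silently accepted.

```python
import argparse

# Toy parser with just the two flags discussed in this thread.
parser = argparse.ArgumentParser(prog="toy_eval")
parser.add_argument("--log_samples", action="store_true")
parser.add_argument("--log_samples_suffix", default="")

# parse_known_args collects unknown arguments instead of exiting with an error.
known, extras = parser.parse_known_args(
    ["--log_samples", "--log_samples_sufix", "reproduce"]
)
print(extras)                    # ['--log_samples_sufix', 'reproduce']
print(known.log_samples_suffix)  # still the default: ''

# With the correct spelling, the value is picked up.
fixed = parser.parse_args(["--log_samples", "--log_samples_suffix", "reproduce"])
print(fixed.log_samples_suffix)  # reproduce
```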

Author

jacob-hansen commented Mar 21, 2024

Following your new procedure, I see no difference from my workarounds for the installation.

By specifying use_flash_attention_2=False, I no longer had to comment that part out from my code.

But I still observe a GPU-related error and needed the following change:

self.device_map = torch.device(device) # (line 74)

This might be specific to my system though

Collaborator

kcz358 commented Mar 22, 2024

@jacob-hansen, when using a single device you might also need to add device_map=auto or device_map=cuda to the model args.
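
As a sketch of what that looks like on the command line, the helper below assembles the launch command from this thread with `device_map` added to the model args. The helper name is hypothetical. All flags are the ones already shown above, and the script only prints the command, it does not run it.

```python
import shlex


def build_eval_command(model_args: dict, tasks: str, suffix: str, num_processes: int = 1) -> list:
    """Assemble the accelerate launch command as an argument list."""
    joined = ",".join(f"{key}={value}" for key, value in model_args.items())
    return [
        "accelerate", "launch", f"--num_processes={num_processes}",
        "-m", "lmms_eval",
        "--model", "llava",
        "--model_args", joined,
        "--tasks", tasks,
        "--batch_size", "1",
        "--log_samples",
        "--log_samples_suffix", suffix,
        "--output_path", "./logs/",
    ]


cmd = build_eval_command(
    {
        "pretrained": "liuhaotian/llava-v1.5-7b",
        "use_flash_attention_2": False,
        "device_map": "auto",  # single-device setting suggested above
    },
    tasks="mme",
    suffix="llava_v1.5_mme",
)
print(shlex.join(cmd))
```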

Contributor

Luodian commented Mar 22, 2024

This change has been merged into:

#24

@Luodian Luodian closed this as completed Mar 25, 2024
Luodian pushed a commit that referenced this issue Apr 4, 2024
* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to supprot multi-model eval

* print table at the end

* refactor multi model code
Luodian added a commit that referenced this issue Apr 4, 2024
* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984c
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d7
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984c
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d7
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d7
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>
Luodian added a commit that referenced this issue Apr 4, 2024
* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix CLI failing to run with a config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>
Luodian added a commit that referenced this issue Apr 4, 2024
* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d7
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984c
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984c
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 18e984c
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Luodian added a commit that referenced this issue Apr 4, 2024
* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit b13a805
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Luodian added a commit that referenced this issue Apr 16, 2024
* Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* Add coco, refcoco support

* Fix doc_to_visual error

* Fix segmentation mask error

* Add refcoco+, refcocog

* Remove debug code

* black lint

* Remove unused code and scripts

* Fix group stderr N/A error between str and int

* Fix letter case issue

* Update lmms_eval tasks and utils

* Fix coco test_split name

* Add llava-bench-in-the-wild support

* Black codestyle, lint

* Add COCO evaluation metric

* Add refcoco, refcocog, refcoco+ evaluation kit

* Add llava bench coco support

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* VQAv2 eval (#4)

* vqav2

* Add vqav2_process_results function and update vqav2_doc_to_text function

* Implement vqav2_process_results function to return exact match score

* Refactor fewshot_docs() to use config.fewshot_config

* Refactor Task class to handle fewshot_docs when training and validation docs are not available

* Add answer processing logic in vqav2_process_results function

* Refactor vqav2_process_results function and add submission aggregation

* Add vqav2_aggreate_submissions function to utils.py

* textvqa

* Refactor answer processing in textvqa_process_results() function

* textvqa eval

* Update dataset path and modify textvqa_doc_to_text function

* Capitalize the question in textvqa_doc_to_text function

* Update textvqa.yaml and utils.py

* Fix formatting issues in lmms_eval/api/task.py, lmms_eval/tasks/gqa/utils.py, lmms_eval/tasks/textvqa/utils.py, and lmms_eval/tasks/vqav2/utils.py

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* [Big Changes] add LLaVA-1.6, MMVet, LLaVA-W, POPE, and many other changes on logs, model args. (#7)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* black lint

* Remove unused code and scripts

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f9e48cec5493010a363b446b81a335ef1484e42f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* add mmvet and try to modify llava arch

* black lint

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f9e48cec5493010a363b446b81a335ef1484e42f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* Refactor pope utils functions

* Update transformers dependency to version 4.36.2

* Revise llava-in-the-wild prompt for alignment

* Add default values for gen_kwargs in Llava class

* Fix formatting issues and import pdb for debugging

* Remove pdb.set_trace() and update default value for max_new_tokens

* Add llava loglikelihood

* Fix formatting and indentation issues in lmms_eval/api/metrics.py and lmms_eval/models/llava.py

* Update function to handle edge cases

This commit updates the function to handle edge cases, improving the overall reliability and robustness of the code.

* Update black version in pre-commit config

* Remove duplicate lines in gqa

* Another way to solve memory issue

* Handle exception in model generation

* Refactor pope_aggregate_results to use "score" key instead of "pope_accuracy"

* Update pope metrics aggregation functions

* Add model_to_prompt in pope.yaml

* Update pope.yaml configuration

* Refactor code to simplify construct_requests call

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Add datetime to output name in cli_evaluate function

Add get_datetime_str function to utils.py

* Refactor pope_aggregate_f1_score function

* Fix datetime format in get_datetime_str function

* Update JSON dump indentation in cli_evaluate function

* Add datetime to output name in cli_evaluate function (#10)

* Revert "Add datetime to output name in cli_evaluate function"

This reverts commit ef26f78c46b50d8769a4fb6990b909162c2881c3.

* Add datetime to output name in cli_evaluate function

* [Datasets] Added POPE and Aligned. (#11)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

* Change coco from print to logger

* Add llava loglikelihood

* Add Nocaps support

* Fix pass through function

* Add textcaps support

* Fix textcaps eval image_id

* Add seedbench support

* Add seedbench ppl evaluation

* black lint

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* add mmmu (#15)

* add mmme

* black

* add mmmu (#15)

* add mmme

* black

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit c3cc24a89415aeccad31ccbb10642af677cd6fe5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 0dbc5d16c4f45ebea8def5f0bc1a36fcd93f9a05
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit 290126e6a269db4cca9b3544bd017d6c17012793
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 8b0227cd7b2602d096d773a01b2199d1f4110f22
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI being unable to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 8b600f55b6cf5627504c407871539db59f6085a3
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '817eb057bcb61226b33d3ac3c8def01c36c90f96'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit f253968ad703f682a29317bdd51ec6c1fd7c5465
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984cfe173390843c73048a931baa17800f918
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504fabc3b62f356c467b2fd1119083d27313
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vizwiz dataset (#24)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts i…
Luodian added a commit that referenced this issue Jun 12, 2024
* Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* Add coco, refcoco support

* Fix doc_to_visual error

* Fix segmentation mask error

* Add refcoco+, refcocog

* Remove debug code

* black lint

* Remove unused code and scripts

* Fix group stderr N/A error between str and int

* Fix letter case issue

* Update lmms_eval tasks and utils

* Fix coco test_split name

* Add llava-bench-in-the-wild support

* Black codestyle, lint

* Add COCO evaluation metric

* Add refcoco, refcocog, refcoco+ evaluation kit

* Add llava bench coco support

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* VQAv2 eval (#4)

* vqav2

* Add vqav2_process_results function and update vqav2_doc_to_text function

* Implement vqav2_process_results function to return exact match score

* Refactor fewshot_docs() to use config.fewshot_config

* Refactor Task class to handle fewshot_docs when training and validation docs are not available

* Add answer processing logic in vqav2_process_results function

* Refactor vqav2_process_results function and add submission aggregation

* Add vqav2_aggreate_submissions function to utils.py

* textvqa

* Refactor answer processing in textvqa_process_results() function

* textvqa eval

* Update dataset path and modify textvqa_doc_to_text function

* Capitalize the question in textvqa_doc_to_text function

* Update textvqa.yaml and utils.py

* Fix formatting issues in lmms_eval/api/task.py, lmms_eval/tasks/gqa/utils.py, lmms_eval/tasks/textvqa/utils.py, and lmms_eval/tasks/vqav2/utils.py

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* [Big Changes] add LLaVA-1.6, MMVet, LLaVA-W, POPE, and many other changes on logs, model args. (#7)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* black lint

* Remove unused code and scripts

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f9e48cec5493010a363b446b81a335ef1484e42f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* add mmvet and try to modify llava arch

* black lint

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f9e48cec5493010a363b446b81a335ef1484e42f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* Refactor pope utils functions

* Update transformers dependency to version 4.36.2

* Revise llava-in-the-wild prompt for align

* Add default values for gen_kwargs in Llava class

* Fix formatting issues and import pdb for debugging

* Remove pdb.set_trace() and update default value for max_new_tokens

* Add llava loglikelihood

* Fix formatting and indentation issues in lmms_eval/api/metrics.py and lmms_eval/models/llava.py

* Update function to handle edge cases

This commit updates the function to handle edge cases, improving the overall reliability and robustness of the code.

* Update black version in pre-commit config

* Remove duplicate lines in gqa

* Another way to solve memory issue

* Handle exception in model generation

* Refactor pope_aggregate_results to use "score" key instead of "pope_accuracy"

* Update pope metrics aggregation functions

* Add model_to_prompt in pope.yaml

* Update pope.yaml configuration

* Refactor code to simplify construct_requests call

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Add datetime to output name in cli_evaluate function

Add get_datetime_str function to utils.py

* Refactor pope_aggregate_f1_score function

* Fix datetime format in get_datetime_str function

* Update JSON dump indentation in cli_evaluate function

* Add datetime to output name in cli_evaluate function (#10)

* Revert "Add datetime to output name in cli_evaluate function"

This reverts commit ef26f78c46b50d8769a4fb6990b909162c2881c3.

* Add datetime to output name in cli_evaluate function

* [Datasets] Added POPE and Aligned. (#11)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

* Change coco from print to logger

* Add llava loglikelihood

* Add Nocaps support

* Fix pass through function

* Add textcaps support

* Fix textcaps eval image_id

* Add seedbench support

* Add seedbench ppl evaluation

* black lint

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* add mmmu (#15)

* add mmme

* black

* add mmmu (#15)

* add mmme

* black

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit 290126e6a269db4cca9b3544bd017d6c17012793
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 8b0227cd7b2602d096d773a01b2199d1f4110f22
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit c3cc24a89415aeccad31ccbb10642af677cd6fe5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 0dbc5d16c4f45ebea8def5f0bc1a36fcd93f9a05
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '817eb057bcb61226b33d3ac3c8def01c36c90f96'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit f253968ad703f682a29317bdd51ec6c1fd7c5465
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 8b600f55b6cf5627504c407871539db59f6085a3
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504fabc3b62f356c467b2fd1119083d27313
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984cfe173390843c73048a931baa17800f918
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vizwiz dataset (#24)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d407…

Luodian added a commit that referenced this issue Jun 12, 2024
* Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* Add coco, refcoco support

* Fix doc_to_visual error

* Fix segmentation mask error

* Add refcoco+, refcocog

* Remove debug code

* black lint

* Remove unused code and scripts

* Fix group stderr N/A error between str and int

* Fix letter case issue

* Update lmms_eval tasks and utils

* Fix coco test_split name

* Add llava-bench-in-the-wild support

* Black codestyle, lint

* Add COCO evaluation metric

* Add refcoco, refcocog, refcoco+ evaluation kit

* Add llava bench coco support

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* VQAv2 eval (#4)

* vqav2

* Add vqav2_process_results function and update vqav2_doc_to_text function

* Implement vqav2_process_results function to return exact match score

* Refactor fewshot_docs() to use config.fewshot_config

* Refactor Task class to handle fewshot_docs when training and validation docs are not available

* Add answer processing logic in vqav2_process_results function

* Refactor vqav2_process_results function and add submission aggregation

* Add vqav2_aggreate_submissions function to utils.py

* textvqa

* Refactor answer processing in textvqa_process_results() function

* textvqa eval

* Update dataset path and modify textvqa_doc_to_text function

* Capitalize the question in textvqa_doc_to_text function

* Update textvqa.yaml and utils.py

* Fix formatting issues in lmms_eval/api/task.py, lmms_eval/tasks/gqa/utils.py, lmms_eval/tasks/textvqa/utils.py, and lmms_eval/tasks/vqav2/utils.py

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* [Big Changes] add LLaVA-1.6, MMVet, LLaVA-W, POPE, and many other changes on logs, model args. (#7)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* black lint

* Remove unused code and scripts

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f9e48cec5493010a363b446b81a335ef1484e42f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* add mmvet and try to modify llava arch

* black lint

* Update lmms_eval tasks and utils

* Refactor pope utils functions

* Update transformers dependency to version 4.36.2

* Revise llava-in-the-wild prompt for align

* Add default values for gen_kwargs in Llava class

* Fix formatting issues and import pdb for debugging

* Remove pdb.set_trace() and update default value for max_new_tokens

* Add llava loglikelihood

* Fix formatting and indentation issues in lmms_eval/api/metrics.py and lmms_eval/models/llava.py

* Update function to handle edge cases

This commit updates the function to handle edge cases, improving the overall reliability and robustness of the code.

* Update black version in pre-commit config

* Remove duplicate lines in gqa

* Another way to solve memory issue

* Handle exception in model generation

* Refactor pope_aggregate_results to use "score" key instead of "pope_accuracy"

* Update pope metrics aggregation functions

* Add model_to_prompt in pope.yaml

* Update pope.yaml configuration

* Refactor code to simplify construct_requests call

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Add datetime to output name in cli_evaluate function

Add get_datetime_str function to utils.py

* Refactor pope_aggregate_f1_score function

* Fix datetime format in get_datetime_str function

* Update JSON dump indentation in cli_evaluate function

* Add datetime to output name in cli_evaluate function (#10)

* Revert "Add datetime to output name in cli_evaluate function"

This reverts commit ef26f78c46b50d8769a4fb6990b909162c2881c3.

* Add datetime to output name in cli_evaluate function

* [Datasets] Added POPE and Aligned. (#11)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

* Change coco from print to logger

* Add llava loglikelihood

* Add Nocaps support

* Fix pass through function

* Add textcaps support

* Fix textcaps eval image_id

* Add seedbench support

* Add seedbench ppl evaluation

* black lint

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* add mmmu (#15)

* add mmme

* black

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit 290126e6a269db4cca9b3544bd017d6c17012793
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 8b0227cd7b2602d096d773a01b2199d1f4110f22
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit c3cc24a89415aeccad31ccbb10642af677cd6fe5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 0dbc5d16c4f45ebea8def5f0bc1a36fcd93f9a05
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '817eb057bcb61226b33d3ac3c8def01c36c90f96'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit f253968ad703f682a29317bdd51ec6c1fd7c5465
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 8b600f55b6cf5627504c407871539db59f6085a3
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504fabc3b62f356c467b2fd1119083d27313
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984cfe173390843c73048a931baa17800f918
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vizwiz dataset (#24)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752…
Luodian added a commit that referenced this issue Jun 12, 2024
* Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* Add coco, refcoco support

* Fix doc_to_visual error

* Fix segmentation mask error

* Add refcoco+, refcocog

* Remove debug code

* black lint

* Remove unused code and scripts

* Fix group stderr N/A error between str and int

* Fix letter case issue

* Update lmms_eval tasks and utils

* Fix coco test_split name

* Add llava-bench-in-the-wild support

* Black codestyle, lint

* Add COCO evaluation metric

* Add refcoco, refcocog, refcoco+ evaluation kit

* Add llava bench coco support

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* VQAv2 eval (#4)

* vqav2

* Add vqav2_process_results function and update vqav2_doc_to_text function

* Implement vqav2_process_results function to return exact match score

* Refactor fewshot_docs() to use config.fewshot_config

* Refactor Task class to handle fewshot_docs when training and validation docs are not available

* Add answer processing logic in vqav2_process_results function

* Refactor vqav2_process_results function and add submission aggregation

* Add vqav2_aggreate_submissions function to utils.py

* textvqa

* Refactor answer processing in textvqa_process_results() function

* textvqa eval

* Update dataset path and modify textvqa_doc_to_text function

* Capitalize the question in textvqa_doc_to_text function

* Update textvqa.yaml and utils.py

* Fix formatting issues in lmms_eval/api/task.py, lmms_eval/tasks/gqa/utils.py, lmms_eval/tasks/textvqa/utils.py, and lmms_eval/tasks/vqav2/utils.py

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* [Big Changes] add LLaVA-1.6, MMVet, LLaVA-W, POPE, and many other changes on logs, model args. (#7)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* black lint

* Remove unused code and scripts

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f9e48cec5493010a363b446b81a335ef1484e42f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* add mmvet and try to modify llava arch

* black lint

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f9e48cec5493010a363b446b81a335ef1484e42f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* Refactor pope utils functions

* Update transformers dependency to version 4.36.2

* Revise llava-in-the-wild prompt for align

* Add default values for gen_kwargs in Llava class

* Fix formatting issues and import pdb for debugging

* Remove pdb.set_trace() and update default value for max_new_tokens

* Add llava loglikelihood

* Fix formatting and indentation issues in lmms_eval/api/metrics.py and lmms_eval/models/llava.py

* Update function to handle edge cases

This commit updates the function to handle edge cases, improving the overall reliability and robustness of the code.

* Update black version in pre-commit config

* Remove duplicate lines in gqa

* Another way to solve memory issue

* Handle exception in model generation

* Refactor pope_aggregate_results to use "score" key instead of "pope_accuracy"

* Update pope metrics aggregation functions

* Add model_to_prompt in pope.yaml

* Update pope.yaml configuration

* Refactor code to simplify construct_requests call

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Add datetime to output name in cli_evaluate function

Add get_datetime_str function to utils.py

* Refactor pope_aggregate_f1_score function

* Fix datetime format in get_datetime_str function

* Update JSON dump indentation in cli_evaluate function

* Add datetime to output name in cli_evaluate function (#10)

* Revert "Add datetime to output name in cli_evaluate function"

This reverts commit ef26f78c46b50d8769a4fb6990b909162c2881c3.

* Add datetime to output name in cli_evaluate function

* [Datasets] Added POPE and Aligned. (#11)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

* Change coco from print to logger

* Add llava loglikelihood

* Add Nocaps support

* Fix pass through function

* Add textcaps support

* Fix textcaps eval image_id

* Add seedbench support

* Add seedbench ppl evaluation

* black lint

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* add mmmu (#15)

* add mmme

* black

* add mmmu (#15)

* add mmme

* black

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit 290126e6a269db4cca9b3544bd017d6c17012793
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 8b0227cd7b2602d096d773a01b2199d1f4110f22
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit c3cc24a89415aeccad31ccbb10642af677cd6fe5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 0dbc5d16c4f45ebea8def5f0bc1a36fcd93f9a05
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '817eb057bcb61226b33d3ac3c8def01c36c90f96'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit f253968ad703f682a29317bdd51ec6c1fd7c5465
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval rask

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 8b600f55b6cf5627504c407871539db59f6085a3
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504fabc3b62f356c467b2fd1119083d27313
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984cfe173390843c73048a931baa17800f918
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vizwiz dataset (#24)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to supprot multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa
…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to supprot multi-model eval

* print table at the end

* refactor multi model code
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
…del Specific Prompt. (EvolvingLMMs-Lab#20)

* Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '4d11dcea8db1a7e4b7347f3c9880788e8cde5d9f'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 4d11dce
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 7c68ea1
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

    * Update dataset paths and improve user prompts

commit 2c63fe7f7b6313ce772edeb41974ba0b08b8c469
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit c524ca948439157c24faad9b2fc41c7c139e0ed1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 2c63fe7f7b6313ce772edeb41974ba0b08b8c469
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit c524ca948439157c24faad9b2fc41c7c139e0ed1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 95460de
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit aaf199c777fe7b81e1ad39bd72cf2cd1daf30d69
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 07b5317f2d9f85465b35dcb2e11cf0d3a51aeb2a
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 6126fe6d8bdf09825855236377cb78b5e4b242ed
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 95460de
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit c9f49774bfa0f505fb266871f3e56ae5a397a97b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2a852842282e211ca885180db1aba4b1d1f8c2b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit 8ef634ccbe2bd5f1159674f1ce70349d7adf935f
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit f49f4961d921b7c8196c1484418ec1673e5e4b74
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 368690aad385c5e1972fe5394b94a8eb1a47efca
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 47463754525984a17f790c5dace6ff05b1ce72f7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 95460de
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 60c1d7c
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (EvolvingLMMs-Lab#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
…del Specific Prompt. (EvolvingLMMs-Lab#20)

* Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '2037acaebc414280bd85e31b30ef9d2e671b3a19'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit bf49735f01e8a523d01acadba47a410b1fa46434
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 8a7901e371f8f1e1c47442609cf5d007a5aee3df
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit fcd53e6e5a1a7b17e7a69c08eb306dd8ad3435c6
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit cbf0704d7b754b0d233f1643f3c3181fea8d02db
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 77cc77fe7c49d65b3275c333bb1ce93798d46994
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 100acee4869445bfa0a00aebdc1d36272f2af7ed
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 2037aca
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 5df364f
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

    * Update dataset paths and improve user prompts

commit fc6d5dd1b7e142e0336c2099845cd2b89558a77b
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 4d35cfef00c7bbe2d51d7e72b4df60fc30e0cea1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit bf49735f01e8a523d01acadba47a410b1fa46434
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 8a7901e371f8f1e1c47442609cf5d007a5aee3df
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit fcd53e6e5a1a7b17e7a69c08eb306dd8ad3435c6
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit cbf0704d7b754b0d233f1643f3c3181fea8d02db
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 77cc77fe7c49d65b3275c333bb1ce93798d46994
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 100acee4869445bfa0a00aebdc1d36272f2af7ed
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit fc6d5dd1b7e142e0336c2099845cd2b89558a77b
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 4d35cfef00c7bbe2d51d7e72b4df60fc30e0cea1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 15a5c86
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 15a5c86
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 15a5c86
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '6675f7f78dab8240ff74e2b35530ad5d500dcead'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 07298ce
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (EvolvingLMMs-Lab#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
…del Specific Prompt. (EvolvingLMMs-Lab#20)

* Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '52ee4a18dad22b2399a4248d2aa9204dbfe88624'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 52ee4a1
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 04303b0
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

    * Update dataset paths and improve user prompts

commit 7d5058337d3de3cd4f0e85368e3dd463f34e703c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 73918654650daa0dad965d1b786d53e7c3585010
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit f7a7db5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 35c3c7098e489ddc552778ea801a6acb6a25a9d9
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 25d9de0b0ea4418e4b1b6f74bdb0dd4c835f66a9
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit aad562494c54d6ddd8cc9b9558a2a300e65f2ea2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit f7a7db5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 40d1888f2e83dadac572c08b7e1f0ae6e2b4d504
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit 02b00db5c3c2dce5ab4c2db6a3eacc7d0b735942
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit f35878778fc0179381b8f3d61d222000b1773774
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 64fb8196c4d9a943fa11a1d0b0fd2a065ed37847
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit f79ece372f140427c9461aa652fe1a9e8a312b3d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 028007a0352365dd42a968df6000eb66c9d30e2b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit f7a7db5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 3a3373b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (EvolvingLMMs-Lab#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
…del Specific Prompt. (EvolvingLMMs-Lab#20)

* Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'bfdf75d7b67680cdc98fdf3f58458633bb492de6'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 11795cb69caaaceddf6b284f18a386c7787d476d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit fb19895ca28ecf64d2ea5322e5391f7742e540f4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e02df3b556a9d34d32d8bfa1f99ea992b763bc6f
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 388a23ac4bb47644826869562c70c10b470a1817
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit bcb7df038402c5ef73db230126fcd76795ee69df
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 30056b56be382107f520d5c85b84c3d541d970e9
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit bfdf75d
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit f69268b
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

    * Update dataset paths and improve user prompts

commit 53ddf3fb2716fd99b2fa454656312d6fc92227b7
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit d7bbd3b2cbd78fdc3df2137ac0d625b5f5505acc
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 19db53b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 19db53b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 19db53b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '899aa01c40d964fdabf024964c7e96fe3663c7d6'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 94b86aa
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (EvolvingLMMs-Lab#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Update tqdm progress bar position

* Merge commit '4d11dcea8db1a7e4b7347f3c9880788e8cde5d9f'

* Squashed commit of the following:

commit 95460de
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 4d11dce
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 7c68ea1
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

    * Update dataset paths and improve user prompts

commit a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 95460de
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 2c63fe7f7b6313ce772edeb41974ba0b08b8c469
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit c524ca948439157c24faad9b2fc41c7c139e0ed1
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 4d11dce
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '4d11dcea8db1a7e4b7347f3c9880788e8cde5d9f' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 4d11dce
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 7c68ea1
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

        * Update dataset paths and improve user prompts

    commit 2c63fe7f7b6313ce772edeb41974ba0b08b8c469
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit c524ca948439157c24faad9b2fc41c7c139e0ed1
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '4d11dcea8db1a7e4b7347f3c9880788e8cde5d9f'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 4d11dce
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 7c68ea1
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

commit a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Update tqdm progress bar position

* Merge commit 'f9c9014ba3566cb1bf1f19bf0d85c6e54ce7c8b4'

* Squashed commit of the following:

commit 0e74884
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit f9c9014
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4a97197
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'e546b08ca8286fe2e4d0943ad9b41667d275f65a'

    * Update dataset paths and improve user prompts

commit e546b08
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 122f420e8450d70eeee97d0e33d30772f781358d
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 363c608444b6df57d51e53f2adb8d8cbfeda0852
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 87d7ee9a776438ad39d4f275f6dc589433f30931
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit d5571773a0d095a288be47883ffdf53f07a077ee
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit e8710b2ab15bf0bfdb30a60fbd18dcec404bd2ae
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 3f3f7481a933a368b5f1b8f267f8003ea0ef82f4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit c0dcef296c7076d5ad9992c1afef229e447b9851
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 59e99665b53fb0d1b95f59ae7c2bfffdb1f6d93b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 68364a32bd346514262a957e89899a4c1c057bf9
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 18f51856fef5fd1b62b1189068c9837fb67195e3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 927d9e80856e6b14fd81ddac273ba2468dddc076
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 589b3a2a124c241441b3180a67bab57412bbe5ef
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 0e74884
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 4c9c7db3ba26973d130a111506b4d5d77ab00c95
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit a9bdc9b952df662cd7156ccc63af31ae0a83d2ff
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 7313f07606ec94f555d50d4523adcb2c1714922e
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 69f7f0be0eaa855c6c46e7c748a7ac69a04606e8
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 9173602a072c669f3348a58b715c77cfef4f0fbf
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 9386d0011c4d6ed7190373d0951d903c7548ccb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 77079bc826943e187247863d5473237de05b3cf2
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 1a284e6a412da3cc503297f33417dad19dd59aee
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit d1c04e8c8e509a375c117020b3c241cc736f9365
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4f7ad7f56de08cddd2b3af64635d0d3a2c37ddb7
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'f9c9014ba3566cb1bf1f19bf0d85c6e54ce7c8b4' into dev/bli_add_datasets

commit e7cd3c23c345d1ed54e9085ac0cf28006489c434
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit a9bdc9b952df662cd7156ccc63af31ae0a83d2ff
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 7313f07606ec94f555d50d4523adcb2c1714922e
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 69f7f0be0eaa855c6c46e7c748a7ac69a04606e8
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 9173602a072c669f3348a58b715c77cfef4f0fbf
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 9386d0011c4d6ed7190373d0951d903c7548ccb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 77079bc826943e187247863d5473237de05b3cf2
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit f9c9014
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4a97197
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit 'e546b08ca8286fe2e4d0943ad9b41667d275f65a'

        * Update dataset paths and improve user prompts

    commit 1a284e6a412da3cc503297f33417dad19dd59aee
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit d1c04e8c8e509a375c117020b3c241cc736f9365
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 33b1143ba4f86461cc37c5b4f86c3a20523768e5
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b163d1076b6df227914352cb7a23e5cbc282c683
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 44df7c8fa06090b48a47c5c87f988d2e14c663f9
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'f9c9014ba3566cb1bf1f19bf0d85c6e54ce7c8b4'

commit 96626ddda0df111ef5498f294e58ed01b51bdbbd
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit f9c9014
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4a97197
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'e546b08ca8286fe2e4d0943ad9b41667d275f65a'

    * Update dataset paths and improve user prompts

commit 739dc3f823ab434707c23160c1bf51712ccbdc43
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 7ae42b4e7f1895429ffaa4ffb2d57f6aab2a470c
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 55d8c58b5446e432de7d397eb251028608d08edd
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit 'e546b08ca8286fe2e4d0943ad9b41667d275f65a'

commit e546b08
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Update tqdm progress bar position

* Merge commit '8709dc0660676131a2d84126b6cf5ea2ee873c7f'

* Squashed commit of the following:

commit 7021e8e
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 8709dc0
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4e27457
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '84cec070862dc1806761d9f0ee5f1df3b4c8ac0c'

    * Update dataset paths and improve user prompts

commit 84cec07
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 7021e8e
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 7d5058337d3de3cd4f0e85368e3dd463f34e703c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 73918654650daa0dad965d1b786d53e7c3585010
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '8709dc0660676131a2d84126b6cf5ea2ee873c7f' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 8709dc0
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4e27457
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit '84cec070862dc1806761d9f0ee5f1df3b4c8ac0c'

        * Update dataset paths and improve user prompts

    commit 7d5058337d3de3cd4f0e85368e3dd463f34e703c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 73918654650daa0dad965d1b786d53e7c3585010
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '8709dc0660676131a2d84126b6cf5ea2ee873c7f'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 8709dc0
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4e27457
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '84cec070862dc1806761d9f0ee5f1df3b4c8ac0c'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '84cec070862dc1806761d9f0ee5f1df3b4c8ac0c'

commit 84cec07
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Update tqdm progress bar position

* Merge commit '2037acaebc414280bd85e31b30ef9d2e671b3a19'

* Squashed commit of the following:

commit 15a5c86
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 2037aca
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 5df364f
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

    * Update dataset paths and improve user prompts

commit 1e0514f
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 69aeae7eb7dbf916c81e86820e4d56a8503c4538
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6872c8515f3ae9044137a582a90487cd2795da72
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e7c30645476c5eafcf623adc63cc765ca32b24b3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit f593b6f1673dd66b593db2fe8a87bafec22b228b
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 4a717c6390d2b1af8ff8b60b73185a9ddadb670b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 7336e827777d5293bd31137a771a40c23d52c104
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit a06b559b508b71443254bf1ffb9be07460f63e77
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 359bc89bd5c703bcf89c620bd8e1f7ae803efef6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5759d3d5bcf9c030a59f268a7606f5166d896771
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 8ad0423836c858594d359038bcbf95018e41ce07
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 70894b138d6aa4e654444ee49de0471137987ebc
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 955f34bbf69e5e7056eb6c4258940d254054be24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 15a5c86
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 003617d7cb7ee573953ef01fe99da260893ddf24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit bf49735f01e8a523d01acadba47a410b1fa46434
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 8a7901e371f8f1e1c47442609cf5d007a5aee3df
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit fcd53e6e5a1a7b17e7a69c08eb306dd8ad3435c6
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit cbf0704d7b754b0d233f1643f3c3181fea8d02db
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 77cc77fe7c49d65b3275c333bb1ce93798d46994
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 100acee4869445bfa0a00aebdc1d36272f2af7ed
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit fc6d5dd1b7e142e0336c2099845cd2b89558a77b
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 4d35cfef00c7bbe2d51d7e72b4df60fc30e0cea1
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b57696a1735aa68637a9eb31dcf270dbd10febd4
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '2037acaebc414280bd85e31b30ef9d2e671b3a19' into dev/bli_add_datasets

commit 7b0184493f0b5c06bf32cb711a877b4ef2360a82
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit bf49735f01e8a523d01acadba47a410b1fa46434
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 8a7901e371f8f1e1c47442609cf5d007a5aee3df
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit fcd53e6e5a1a7b17e7a69c08eb306dd8ad3435c6
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit cbf0704d7b754b0d233f1643f3c3181fea8d02db
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 77cc77fe7c49d65b3275c333bb1ce93798d46994
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 100acee4869445bfa0a00aebdc1d36272f2af7ed
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 2037aca
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 5df364f
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

        * Update dataset paths and improve user prompts

    commit fc6d5dd1b7e142e0336c2099845cd2b89558a77b
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 4d35cfef00c7bbe2d51d7e72b4df60fc30e0cea1
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 27dd8d1264f84c46af49e0a94d32297c566379e9
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit a64326c69ced88a0037ba379447ec0c4db74ada6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit c2e4dbb83d76e02e711845de7df6c6e27f417a3b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '2037acaebc414280bd85e31b30ef9d2e671b3a19'

commit b6677e4f0ade1a4b86dff73d67140f35aa8a77ad
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 2037aca
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 5df364f
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

    * Update dataset paths and improve user prompts

commit 6b952d50a2e305eb8382b14e4a6ce9a3e7b6e080
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit bcb0d81b71bb3209f814fb6e4889fb6ef54bb524
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 5fc18f2f0a8786e447a8feb312fcdc4538622f1f
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

commit 1e0514f
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Update tqdm progress bar position

* Merge commit 'b2c71248314fc8f8461222e594c7ab046f5383f5'

* Squashed commit of the following:

commit 8dce2b0
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit b2c7124
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit d9c5827
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '340c4501058e13bc64aad611c8bbb4d0059fc545'

    * Update dataset paths and improve user prompts

commit 340c450
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 8dce2b0
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit facd3d87fef5f4eb82dbe3b236a6b199dc87863e
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3380863c2ca0f3b98d74f94c9e72460d28d34acd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'b2c71248314fc8f8461222e594c7ab046f5383f5' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit b2c7124
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit d9c5827
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit '340c4501058e13bc64aad611c8bbb4d0059fc545'

        * Update dataset paths and improve user prompts

    commit facd3d87fef5f4eb82dbe3b236a6b199dc87863e
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3380863c2ca0f3b98d74f94c9e72460d28d34acd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'b2c71248314fc8f8461222e594c7ab046f5383f5'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit b2c7124
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit d9c5827
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '340c4501058e13bc64aad611c8bbb4d0059fc545'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '340c4501058e13bc64aad611c8bbb4d0059fc545'

commit 340c450
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024

* Update tqdm progress bar position

* Merge commit '82108537ee4e3d54d6378fb7faa78199e00a3e8b'

* Squashed commit of the following:

commit 03edad8
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 8210853
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 158c42d
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '76c213db0f1495c1ececf0b58678f87cc6144e3c'

    * Update dataset paths and improve user prompts

commit 76c213d
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 122f420e8450d70eeee97d0e33d30772f781358d
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 363c608444b6df57d51e53f2adb8d8cbfeda0852
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 87d7ee9a776438ad39d4f275f6dc589433f30931
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit d5571773a0d095a288be47883ffdf53f07a077ee
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit e8710b2ab15bf0bfdb30a60fbd18dcec404bd2ae
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 3f3f7481a933a368b5f1b8f267f8003ea0ef82f4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit c0dcef296c7076d5ad9992c1afef229e447b9851
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 59e99665b53fb0d1b95f59ae7c2bfffdb1f6d93b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 68364a32bd346514262a957e89899a4c1c057bf9
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 18f51856fef5fd1b62b1189068c9837fb67195e3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 927d9e80856e6b14fd81ddac273ba2468dddc076
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 589b3a2a124c241441b3180a67bab57412bbe5ef
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 03edad8
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 4c9c7db3ba26973d130a111506b4d5d77ab00c95
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit a9bdc9b952df662cd7156ccc63af31ae0a83d2ff
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 7313f07606ec94f555d50d4523adcb2c1714922e
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 69f7f0be0eaa855c6c46e7c748a7ac69a04606e8
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 9173602a072c669f3348a58b715c77cfef4f0fbf
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 9386d0011c4d6ed7190373d0951d903c7548ccb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 77079bc826943e187247863d5473237de05b3cf2
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 1a284e6a412da3cc503297f33417dad19dd59aee
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit d1c04e8c8e509a375c117020b3c241cc736f9365
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4f7ad7f56de08cddd2b3af64635d0d3a2c37ddb7
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '82108537ee4e3d54d6378fb7faa78199e00a3e8b' into dev/bli_add_datasets

commit e7cd3c23c345d1ed54e9085ac0cf28006489c434
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit a9bdc9b952df662cd7156ccc63af31ae0a83d2ff
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 7313f07606ec94f555d50d4523adcb2c1714922e
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 69f7f0be0eaa855c6c46e7c748a7ac69a04606e8
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 9173602a072c669f3348a58b715c77cfef4f0fbf
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 9386d0011c4d6ed7190373d0951d903c7548ccb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 77079bc826943e187247863d5473237de05b3cf2
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 8210853
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 158c42d
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit '76c213db0f1495c1ececf0b58678f87cc6144e3c'

        * Update dataset paths and improve user prompts

    commit 1a284e6a412da3cc503297f33417dad19dd59aee
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit d1c04e8c8e509a375c117020b3c241cc736f9365
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 33b1143ba4f86461cc37c5b4f86c3a20523768e5
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b163d1076b6df227914352cb7a23e5cbc282c683
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 44df7c8fa06090b48a47c5c87f988d2e14c663f9
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '82108537ee4e3d54d6378fb7faa78199e00a3e8b'

commit 96626ddda0df111ef5498f294e58ed01b51bdbbd
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 8210853
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 158c42d
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '76c213db0f1495c1ececf0b58678f87cc6144e3c'

    * Update dataset paths and improve user prompts

commit 739dc3f823ab434707c23160c1bf51712ccbdc43
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 7ae42b4e7f1895429ffaa4ffb2d57f6aab2a470c
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 55d8c58b5446e432de7d397eb251028608d08edd
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '76c213db0f1495c1ececf0b58678f87cc6144e3c'

commit 76c213d
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024

* Update tqdm progress bar position

* Merge commit '52ee4a18dad22b2399a4248d2aa9204dbfe88624'

* Squashed commit of the following:

commit f7a7db5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 52ee4a1
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 04303b0
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

    * Update dataset paths and improve user prompts

commit bee5794
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit f7a7db5
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 7d5058337d3de3cd4f0e85368e3dd463f34e703c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 73918654650daa0dad965d1b786d53e7c3585010
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '52ee4a18dad22b2399a4248d2aa9204dbfe88624' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 52ee4a1
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 04303b0
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

        * Update dataset paths and improve user prompts

    commit 7d5058337d3de3cd4f0e85368e3dd463f34e703c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 73918654650daa0dad965d1b786d53e7c3585010
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '52ee4a18dad22b2399a4248d2aa9204dbfe88624'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 52ee4a1
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 04303b0
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

commit bee5794
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024

* Update tqdm progress bar position

* Merge commit 'bfdf75d7b67680cdc98fdf3f58458633bb492de6'

* Squashed commit of the following:

commit 19db53b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit bfdf75d
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit f69268b
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

    * Update dataset paths and improve user prompts

commit 95f3d3e
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 9e827183b527e9a035a6359448c1e692df089ed1
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 570500320783a594f218699ea1509ec537591b2e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 0e75485613ff06b532403a152974eedf8e117c9c
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 6429b7e69ddc0eee6a6728772ec5eb2114d6e331
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 892bc90979fd6b5b64de0ed68b17ac2944b9e6fa
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit aff94aaf134bb404e48cd59d931cd214197df339
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit d0dc730cbee420e7121b0520eb40a1f30447930d
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit c69ecbfc52492aca3e5ecfc8d425ee9e7af00978
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 9053bc9aafb19d654b30927a8fec72347c745886
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit bbf0dbb9e7d05ce6aecd251815a66ac38e9a4169
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit ee76ebb5bd120708d07477e1462e986ece346975
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit d252441a31ea5ab29bd32accb5b0b9e1ba73587b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 19db53b
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

commit 3278cccfcd5454ab972071555918fc8571f94d37
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 11795cb69caaaceddf6b284f18a386c7787d476d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit fb19895ca28ecf64d2ea5322e5391f7742e540f4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e02df3b556a9d34d32d8bfa1f99ea992b763bc6f
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit 388a23ac4bb47644826869562c70c10b470a1817
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit bcb7df038402c5ef73db230126fcd76795ee69df
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 30056b56be382107f520d5c85b84c3d541d970e9
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 53ddf3fb2716fd99b2fa454656312d6fc92227b7
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit d7bbd3b2cbd78fdc3df2137ac0d625b5f5505acc
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 741278f40ef70df04efd52ddd79e3c260c41a53e
Merge: 22c3adf 1d3fdd4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'bfdf75d7b67680cdc98fdf3f58458633bb492de6' into dev/bli_add_datasets

commit cbdaa28e87913c26dd6d2de6bd7c2b3acb556b0a
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 11795cb69caaaceddf6b284f18a386c7787d476d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit fb19895ca28ecf64d2ea5322e5391f7742e540f4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e02df3b556a9d34d32d8bfa1f99ea992b763bc6f
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit 388a23ac4bb47644826869562c70c10b470a1817
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit bcb7df038402c5ef73db230126fcd76795ee69df
    Merge: 7e8b57d 1d3fdd4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 30056b56be382107f520d5c85b84c3d541d970e9
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit bfdf75d
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit f69268b
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

        * Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

        * Update dataset paths and improve user prompts

    commit 53ddf3fb2716fd99b2fa454656312d6fc92227b7
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit d7bbd3b2cbd78fdc3df2137ac0d625b5f5505acc
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b8389cf8dac3f22c8d07f9789fdd877d8298d786
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit f399ed85ace060b3e64bd5468b17f2a856d005bd
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 4657c9b111bac762f3dc5ff9397ea211b2b62656
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'bfdf75d7b67680cdc98fdf3f58458633bb492de6'

commit 9b3a02280e05f15e305eb86a3669e76f011c6444
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit bfdf75d
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit f69268b
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

    * Update dataset paths and improve user prompts

commit ad4a267e810a4653e5d7ad0b5b9000ea0a39028e
Merge: c6370bf 51f2eaa
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit b441be2447ef78dce4c9c8134ad34cfd20765eef
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 9e30e09b429b30cc67389af0ebc94a1149dcc4bb
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

commit 95f3d3e
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (EvolvingLMMs-Lab#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '4d11dcea8db1a7e4b7347f3c9880788e8cde5d9f'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to supprot multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 4d11dce
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 7c68ea1
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'a0b87f52d0c7cde3c320aeac77eb11165e5bb3ef'

    * Update dataset paths and improve user prompts

commit 2c63fe7f7b6313ce772edeb41974ba0b08b8c469
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit c524ca948439157c24faad9b2fc41c7c139e0ed1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to supprot multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 2c63fe7f7b6313ce772edeb41974ba0b08b8c469
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit c524ca948439157c24faad9b2fc41c7c139e0ed1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 95460de
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit aaf199c777fe7b81e1ad39bd72cf2cd1daf30d69
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 07b5317f2d9f85465b35dcb2e11cf0d3a51aeb2a
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 6126fe6d8bdf09825855236377cb78b5e4b242ed
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 95460de
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit c9f49774bfa0f505fb266871f3e56ae5a397a97b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2a852842282e211ca885180db1aba4b1d1f8c2b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit 8ef634ccbe2bd5f1159674f1ce70349d7adf935f
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit f49f4961d921b7c8196c1484418ec1673e5e4b74
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 368690aad385c5e1972fe5394b94a8eb1a47efca
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 47463754525984a17f790c5dace6ff05b1ce72f7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval rask

commit 95460de
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '2037acaebc414280bd85e31b30ef9d2e671b3a19'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit bf49735f01e8a523d01acadba47a410b1fa46434
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 8a7901e371f8f1e1c47442609cf5d007a5aee3df
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit fcd53e6e5a1a7b17e7a69c08eb306dd8ad3435c6
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit cbf0704d7b754b0d233f1643f3c3181fea8d02db
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 77cc77fe7c49d65b3275c333bb1ce93798d46994
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 100acee4869445bfa0a00aebdc1d36272f2af7ed
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 2037aca
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 5df364f
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '1e0514f92df2bbcd3d1c1fc86e3212c5fed93eaf'

    * Update dataset paths and improve user prompts

commit fc6d5dd1b7e142e0336c2099845cd2b89558a77b
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 4d35cfef00c7bbe2d51d7e72b4df60fc30e0cea1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit bf49735f01e8a523d01acadba47a410b1fa46434
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 8a7901e371f8f1e1c47442609cf5d007a5aee3df
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit fcd53e6e5a1a7b17e7a69c08eb306dd8ad3435c6
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit cbf0704d7b754b0d233f1643f3c3181fea8d02db
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 77cc77fe7c49d65b3275c333bb1ce93798d46994
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 100acee4869445bfa0a00aebdc1d36272f2af7ed
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit fc6d5dd1b7e142e0336c2099845cd2b89558a77b
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 4d35cfef00c7bbe2d51d7e72b4df60fc30e0cea1
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 15a5c86
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 15a5c86
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 15a5c86
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Merge commit '340c4501058e13bc64aad611c8bbb4d0059fc545'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'b2c71248314fc8f8461222e594c7ab046f5383f5'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit b2c7124
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit d9c5827
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '340c4501058e13bc64aad611c8bbb4d0059fc545'

    * Update dataset paths and improve user prompts

commit facd3d87fef5f4eb82dbe3b236a6b199dc87863e
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3380863c2ca0f3b98d74f94c9e72460d28d34acd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit facd3d87fef5f4eb82dbe3b236a6b199dc87863e
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3380863c2ca0f3b98d74f94c9e72460d28d34acd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 8dce2b0
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 6f66c1130070307ba51eae79f54e197f0053266b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit a6d360d7b1092d5656e4b4ad7d8964f44ee0a3dc
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 7ed11f762e3af8b9a2261793c5bbc9c3ebc2c512
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 8dce2b0
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 963fd932338aae1dee007bbb574daec162cb58bb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit 1481d73aef646233dce05b3b2989a9e8eddcab2b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit 45a3bf24b4c6e610237e2ef81f1b01cf11ee25d9
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 63080782e2d7544d58c513648dd64647131d6337
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit ef60547ab60a4a5e18de1634c8126ad5cbc1139c
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 7d2e92c2835f88cd7832ddab0874996b308faa9a
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 8dce2b0
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Merge commit '76c213db0f1495c1ececf0b58678f87cc6144e3c'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '82108537ee4e3d54d6378fb7faa78199e00a3e8b'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit a9bdc9b952df662cd7156ccc63af31ae0a83d2ff
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 7313f07606ec94f555d50d4523adcb2c1714922e
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 69f7f0be0eaa855c6c46e7c748a7ac69a04606e8
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 9173602a072c669f3348a58b715c77cfef4f0fbf
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 9386d0011c4d6ed7190373d0951d903c7548ccb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 77079bc826943e187247863d5473237de05b3cf2
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 8210853
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 158c42d
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '76c213db0f1495c1ececf0b58678f87cc6144e3c'

    * Update dataset paths and improve user prompts

commit 1a284e6a412da3cc503297f33417dad19dd59aee
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit d1c04e8c8e509a375c117020b3c241cc736f9365
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit a9bdc9b952df662cd7156ccc63af31ae0a83d2ff
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 7313f07606ec94f555d50d4523adcb2c1714922e
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 69f7f0be0eaa855c6c46e7c748a7ac69a04606e8
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 9173602a072c669f3348a58b715c77cfef4f0fbf
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 9386d0011c4d6ed7190373d0951d903c7548ccb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 77079bc826943e187247863d5473237de05b3cf2
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 1a284e6a412da3cc503297f33417dad19dd59aee
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit d1c04e8c8e509a375c117020b3c241cc736f9365
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 03edad8
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 03edad8
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 03edad8
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '52ee4a18dad22b2399a4248d2aa9204dbfe88624'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 52ee4a1
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 04303b0
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit 'bee5794a597d8a87794b4bcd9b57a1553efad857'

    * Update dataset paths and improve user prompts

commit 7d5058337d3de3cd4f0e85368e3dd463f34e703c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 73918654650daa0dad965d1b786d53e7c3585010
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 7d5058337d3de3cd4f0e85368e3dd463f34e703c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 73918654650daa0dad965d1b786d53e7c3585010
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit f7a7db5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 35c3c7098e489ddc552778ea801a6acb6a25a9d9
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 25d9de0b0ea4418e4b1b6f74bdb0dd4c835f66a9
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit aad562494c54d6ddd8cc9b9558a2a300e65f2ea2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit f7a7db5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 40d1888f2e83dadac572c08b7e1f0ae6e2b4d504
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit 02b00db5c3c2dce5ab4c2db6a3eacc7d0b735942
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit f35878778fc0179381b8f3d61d222000b1773774
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 64fb8196c4d9a943fa11a1d0b0fd2a065ed37847
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit f79ece372f140427c9461aa652fe1a9e8a312b3d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 028007a0352365dd42a968df6000eb66c9d30e2b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit f7a7db5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'bfdf75d7b67680cdc98fdf3f58458633bb492de6'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 11795cb69caaaceddf6b284f18a386c7787d476d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit fb19895ca28ecf64d2ea5322e5391f7742e540f4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e02df3b556a9d34d32d8bfa1f99ea992b763bc6f
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 388a23ac4bb47644826869562c70c10b470a1817
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit bcb7df038402c5ef73db230126fcd76795ee69df
Merge: 7e8b57d 1d3fdd4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 30056b56be382107f520d5c85b84c3d541d970e9
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit bfdf75d
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (EvolvingLMMs-Lab#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit f69268b
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (EvolvingLMMs-Lab#17)

    * Merge commit '95f3d3e116db32b49631f2005c9b2a608f778cc0'

    * Update dataset paths and improve user prompts

commit 53ddf3fb2716fd99b2fa454656312d6fc92227b7
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit d7bbd3b2cbd78fdc3df2137ac0d625b5f5505acc
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 19db53b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix CLI failing to run with a config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 19db53b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 19db53b
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (EvolvingLMMs-Lab#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Update lmms_eval/evaluator.py and lmms_eval/tasks/vizwizvqa/utils.py

* vizwiz-val

* Update utils.py

* Update vizwizvqa.yaml

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* Add coco, refcoco support

* Fix doc_to_visual error

* Fix segmentation mask error

* Add refcoco+, refcocog

* Remove debug code

* black lint

* Remove unused code and scripts

* Fix group stderr N/A error between str and int

* Fix letter case issue

* Update lmms_eval tasks and utils

* Fix coco test_split name

* Add llava-bench-in-the-wild support

* Black codestyle, lint

* Add COCO evaluation metric

* Add refcoco, refcocog, refcoco+ evaluation kit

* Add llava bench coco support

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* VQAv2 eval (#4)

* vqav2

* Add vqav2_process_results function and update vqav2_doc_to_text function

* Implement vqav2_process_results function to return exact match score

* Refactor fewshot_docs() to use config.fewshot_config

* Refactor Task class to handle fewshot_docs when training and validation docs are not available

* Add answer processing logic in vqav2_process_results function

* Refactor vqav2_process_results function and add submission aggregation

* Add vqav2_aggreate_submissions function to utils.py

* textvqa

* Refactor answer processing in textvqa_process_results() function

* textvqa eval

* Update dataset path and modify textvqa_doc_to_text function

* Capitalize the question in textvqa_doc_to_text function

* Update textvqa.yaml and utils.py

* Fix formatting issues in lmms_eval/api/task.py, lmms_eval/tasks/gqa/utils.py, lmms_eval/tasks/textvqa/utils.py, and lmms_eval/tasks/vqav2/utils.py

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* [Big Changes] add LLaVA-1.6, MMVet, LLaVA-W, POPE, and many other changes on logs, model args. (#7)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* black lint

* Remove unused code and scripts

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f102f038a161fe667628accd2d9daa33e70fe74f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* Refactor pope utils functions

* Update transformers dependency to version 4.36.2

* Revise llava-in-the-wild prompt for align

* Add default values for gen_kwargs in Llava class

* Fix formatting issues and import pdb for debugging

* Remove pdb.set_trace() and update default value for max_new_tokens

* Add llava loglikelihood

* Fix formatting and indentation issues in lmms_eval/api/metrics.py and lmms_eval/models/llava.py

* Update function to handle edge cases

This commit updates the function to handle edge cases, improving the overall reliability and robustness of the code.

* Update black version in pre-commit config

* Remove duplicate lines in gqa

* Another way to solve memory issue

* Handle exception in model generation

* Refactor pope_aggregate_results to use "score" key instead of "pope_accuracy"

* Update pope metrics aggregation functions

* Add model_to_prompt in pope.yaml

* Update pope.yaml configuration

* Refactor code to simplify construct_requests call

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Add datetime to output name in cli_evaluate function

Add get_datetime_str function to utils.py

* Refactor pope_aggregate_f1_score function

* Fix datetime format in get_datetime_str function

* Update JSON dump indentation in cli_evaluate function

* Add datetime to output name in cli_evaluate function (#10)

* Revert "Add datetime to output name in cli_evaluate function"

This reverts commit ef26f78c46b50d8769a4fb6990b909162c2881c3.

* Add datetime to output name in cli_evaluate function

* [Datasets] Added POPE and Aligned. (#11)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

* Change coco from print to logger

* Add llava loglikelihood

* Add Nocaps support

* Fix pass through function

* Add textcaps support

* Fix textcaps eval image_id

* Add seedbench support

* Add seedbench ppl evaluation

* black lint

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* add mmmu (#15)

* add mmme

* black

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit 290126e6a269db4cca9b3544bd017d6c17012793
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 8b0227cd7b2602d096d773a01b2199d1f4110f22
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit c3cc24a89415aeccad31ccbb10642af677cd6fe5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 0dbc5d16c4f45ebea8def5f0bc1a36fcd93f9a05
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '817eb057bcb61226b33d3ac3c8def01c36c90f96'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit f253968ad703f682a29317bdd51ec6c1fd7c5465
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 8b600f55b6cf5627504c407871539db59f6085a3
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 4d11dce
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504fabc3b62f356c467b2fd1119083d27313
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984cfe173390843c73048a931baa17800f918
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 4d11dce
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vizwiz dataset (#24)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d407…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* Add coco, refcoco support

* Fix doc_to_visual error

* Fix segmentation mask error

* Add refcoco+, refcocog

* Remove debug code

* black lint

* Remove unused code and scripts

* Fix group stderr N/A error between str and int

* Fix letter case issue

* Update lmms_eval tasks and utils

* Fix coco test_split name

* Add llava-bench-in-the-wild support

* Black codestyle, lint

* Add COCO evaluation metric

* Add refcoco, refcocog, refcoco+ evaluation kit

* Add llava bench coco support

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* VQAv2 eval (#4)

* vqav2

* Add vqav2_process_results function and update vqav2_doc_to_text function

* Implement vqav2_process_results function to return exact match score

* Refactor fewshot_docs() to use config.fewshot_config

* Refactor Task class to handle fewshot_docs when training and validation docs are not available

* Add answer processing logic in vqav2_process_results function

* Refactor vqav2_process_results function and add submission aggregation

* Add vqav2_aggreate_submissions function to utils.py

* textvqa

* Refactor answer processing in textvqa_process_results() function

* textvqa eval

* Update dataset path and modify textvqa_doc_to_text function

* Capitalize the question in textvqa_doc_to_text function

* Update textvqa.yaml and utils.py

* Fix formatting issues in lmms_eval/api/task.py, lmms_eval/tasks/gqa/utils.py, lmms_eval/tasks/textvqa/utils.py, and lmms_eval/tasks/vqav2/utils.py

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* [Big Changes] add LLaVA-1.6, MMVet, LLaVA-W, POPE, and many other changes on logs, model args. (#7)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* black lint

* Remove unused code and scripts

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f102f038a161fe667628accd2d9daa33e70fe74f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* add mmvet and try to modify llava arch

* black lint

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f102f038a161fe667628accd2d9daa33e70fe74f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* Refactor pope utils functions

* Update transformers dependency to version 4.36.2

* Revise llava-in-the-wild prompt for align

* Add default values for gen_kwargs in Llava class

* Fix formatting issues and import pdb for debugging

* Remove pdb.set_trace() and update default value for max_new_tokens

* Add llava loglikelihood

* Fix formatting and indentation issues in lmms_eval/api/metrics.py and lmms_eval/models/llava.py

* Update function to handle edge cases

This commit updates the function to handle edge cases, improving the overall reliability and robustness of the code.

* Update black version in pre-commit config

* Remove duplicate lines in gqa

* Another way to solve memory issue

* Handle exception in model generation

* Refactor pope_aggregate_results to use "score" key instead of "pope_accuracy"

* Update pope metrics aggregation functions

* Add model_to_prompt in pope.yaml

* Update pope.yaml configuration

* Refactor code to simplify construct_requests call

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Add datetime to output name in cli_evaluate function

Add get_datetime_str function to utils.py

* Refactor pope_aggregate_f1_score function

* Fix datetime format in get_datetime_str function

* Update JSON dump indentation in cli_evaluate function

* Add datetime to output name in cli_evaluate function (#10)

* Revert "Add datetime to output name in cli_evaluate function"

This reverts commit ef26f78c46b50d8769a4fb6990b909162c2881c3.

* Add datetime to output name in cli_evaluate function

* [Datasets] Added POPE and Aligned. (#11)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

* Change coco from print to logger

* Add llava loglikelihood

* Add Nocaps support

* Fix pass through function

* Add textcaps support

* Fix textcaps eval image_id

* Add seedbench support

* Add seedbench ppl evaluation

* black lint

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* add mmmu (#15)

* add mmme

* black

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit 290126e6a269db4cca9b3544bd017d6c17012793
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 8b0227cd7b2602d096d773a01b2199d1f4110f22
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit c3cc24a89415aeccad31ccbb10642af677cd6fe5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 0dbc5d16c4f45ebea8def5f0bc1a36fcd93f9a05
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to supprot multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to supprot multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval rask

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '817eb057bcb61226b33d3ac3c8def01c36c90f96'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit f253968ad703f682a29317bdd51ec6c1fd7c5465
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to supprot multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to supprot multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval rask

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 8b600f55b6cf5627504c407871539db59f6085a3
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 4d11dce
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to supprot multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504fabc3b62f356c467b2fd1119083d27313
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984cfe173390843c73048a931baa17800f918
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 4d11dce
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vizwiz dataset (#24)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this issue Oct 6, 2024
* Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* Add coco, refcoco support

* Fix doc_to_visual error

* Fix segmentation mask error

* Add refcoco+, refcocog

* Remove debug code

* black lint

* Remove unused code and scripts

* Fix group stderr N/A error between str and int

* Fix letter case issue

* Update lmms_eval tasks and utils

* Fix coco test_split name

* Add llava-bench-in-the-wild support

* Black codestyle, lint

* Add COCO evaluation metric

* Add refcoco, refcocog, refcoco+ evaluation kit

* Add llava bench coco support

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* VQAv2 eval (#4)

* vqav2

* Add vqav2_process_results function and update vqav2_doc_to_text function

* Implement vqav2_process_results function to return exact match score

* Refactor fewshot_docs() to use config.fewshot_config

* Refactor Task class to handle fewshot_docs when training and validation docs are not available

* Add answer processing logic in vqav2_process_results function

* Refactor vqav2_process_results function and add submission aggregation

* Add vqav2_aggreate_submissions function to utils.py

* textvqa

* Refactor answer processing in textvqa_process_results() function

* textvqa eval

* Update dataset path and modify textvqa_doc_to_text function

* Capitalize the question in textvqa_doc_to_text function

* Update textvqa.yaml and utils.py

* Fix formatting issues in lmms_eval/api/task.py, lmms_eval/tasks/gqa/utils.py, lmms_eval/tasks/textvqa/utils.py, and lmms_eval/tasks/vqav2/utils.py

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* [Big Changes] add LLaVA-1.6, MMVet, LLaVA-W, POPE, and many other changes on logs, model args. (#7)

* Update author name and email in pyproject.toml

* add mmvet and try to modify llava arch

* black lint

* Remove unused code and scripts

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f102f038a161fe667628accd2d9daa33e70fe74f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* add mmvet and try to modify llava arch

* black lint

* Update lmms_eval tasks and utils

* Update LMMS-Eval dependencies and configurations

* Squashed commit of the following:

commit 209f3904f33210bec0b4b146e96fcbd67a4e1541
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Wed Jan 17 20:27:13 2024 +0800

    Add COCO, RefCOCO, RefCOCO+, RefCOCOg (#5)

    * Update author name and email in pyproject.toml

    * add mmvet and try to modify llava arch

    * Add coco, refcoco support

    * Fix doc_to_visual error

    * Fix segmentation mask error

    * Add refcoco+, refcocog

    * Remove debug code

    * black lint

    * Remove unused code and scripts

    * Fix group stderr N/A error between str and int

    * Fix letter case issue

    * Update lmms_eval tasks and utils

    * Fix coco test_split name

    * Add llava-bench-in-the-wild support

    * Black codestyle, lint

    * Add COCO evaluation metric

    * Add refcoco, refcocog, refcoco+ evaluation kit

    * Add llava bench coco support

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

commit f102f038a161fe667628accd2d9daa33e70fe74f
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 17 20:26:58 2024 +0800

    Update utils.py (#6)

* Fix logging issue and remove unnecessary whitespace

* Add openai and pycocoevalcap dependencies

* Fix device mapping issue in Llava constructor

* Add support for truncating context in generation

* Update Llava model and evaluation configuration

* Update YAML configuration files

* Update YAML configuration files

* add otterhd and gemini models

* Add support for custom image aspect ratio in Llava model

* Add dataset_kwargs and max_gen_toks to YAML files

* Fix log_samples suffix typo and use hash for output name

* Refactor LMMS evaluation code and update LLAVA model properties

* matched response for mistral-llava

* Refactor logging in llava_aggregation function

* Print evaluation statistics instead of logging them

* Fix logging information in llava_aggregation function

* Add new models and dataset_kwargs for COCO tasks

* Update truncate_context parameter in Llava class constructor

* Update dataset_kwargs in YAML files

* Remove issue type tags from issue and pull request templates

* Refactor pope utils functions

* Update transformers dependency to version 4.36.2

* Revise llava-in-the-wild prompt for align

* Add default values for gen_kwargs in Llava class

* Fix formatting issues and import pdb for debugging

* Remove pdb.set_trace() and update default value for max_new_tokens

* Add llava loglikelihood

* Fix formatting and indentation issues in lmms_eval/api/metrics.py and lmms_eval/models/llava.py

* Update function to handle edge cases

This commit updates the function to handle edge cases, improving the overall reliability and robustness of the code.

* Update black version in pre-commit config

* Remove duplicate lines in gqa

* Another way to solve memory issue

* Handle exception in model generation

* Refactor pope_aggregate_results to use "score" key instead of "pope_accuracy"

* Update pope metrics aggregation functions

* Add model_to_prompt in pope.yaml

* Update pope.yaml configuration

* Refactor code to simplify construct_requests call

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Add datetime to output name in cli_evaluate function

Add get_datetime_str function to utils.py

* Refactor pope_aggregate_f1_score function

* Fix datetime format in get_datetime_str function

* Update JSON dump indentation in cli_evaluate function

* Add datetime to output name in cli_evaluate function (#10)

* Revert "Add datetime to output name in cli_evaluate function"

This reverts commit ef26f78c46b50d8769a4fb6990b909162c2881c3.

* Add datetime to output name in cli_evaluate function

* [Datasets] Added POPE and Aligned. (#11)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

* Change coco from print to logger

* Add llava loglikelihood

* Add Nocaps support

* Fix pass through function

* Add textcaps support

* Fix textcaps eval image_id

* Add seedbench support

* Add seedbench ppl evaluation

* black lint

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* [Datasets] Add four internal evaluation datasets (#13)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* add mmmu (#15)

* add mmme

* black

* add mmmu (#15)

* add mmme

* black

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit 290126e6a269db4cca9b3544bd017d6c17012793
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 8b0227cd7b2602d096d773a01b2199d1f4110f22
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Memory issue] Solve memory issue for building context (#14)

* Update generation_kwargs in pope.yaml

* Update pope_doc_to_text function

* Remove unused variable in mmvet_process_results function

* Remove unused imports in utils.py

* Refactor get_chat_response function to include retries for API requests

* Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

* Update prompt variable in lmms_eval tasks

* Refactor output_name variable in cli_evaluate function

* Fix logging message in mmvet_process_results function

* Update sleep time in get_chat_response function

* Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

* Refactor get_eval function to include retries

* Add token parameter to load_dataset function in gqa_doc_to_visual

* Refactor llava_process_results and llava_aggregation functions

* Remove unused function llava_aggregation

* Refactor llava-bench aggregation code

* Add logs and scripts to .gitignore, and set image_aspect_ratio to original in scienceqa.yaml

* Update generation parameters in scienceqa.yaml

* Solve memory issue for building context

* Solved gather result error

* Update lmms_eval scienceqa_img config

* Fixed nocaps store results

* Revise seedbench prompt

* Squashed commit of the following:

commit c3cc24a89415aeccad31ccbb10642af677cd6fe5
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Wed Jan 24 14:07:36 2024 +0800

    add mmmu (#15)

    * add mmme

    * black

commit 0dbc5d16c4f45ebea8def5f0bc1a36fcd93f9a05
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 10:00:33 2024 +0800

    [Datasets] Add four internal evaluation datasets (#13)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

    * Remove unused variable in mmvet_process_results function

    * Remove unused imports in utils.py

    * Refactor get_chat_response function to include retries for API requests

    * Update gpt_eval_model_name in lmms_eval/tasks/dc100_en.yaml and add retry logic in get_chat_response function

    * Update prompt variable in lmms_eval tasks

    * Refactor output_name variable in cli_evaluate function

    * Fix logging message in mmvet_process_results function

    * Update sleep time in get_chat_response function

    * Merge commit 'fec494dbe5971e8fa5a886b191a4781be3ce7a6f'

    * Refactor get_eval function to include retries

    * Add token parameter to load_dataset function in gqa_doc_to_visual

    * Refactor llava_process_results and llava_aggregation functions

commit fec494dbe5971e8fa5a886b191a4781be3ce7a6f
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Tue Jan 23 19:17:40 2024 +0800

    [Dataset] Add SEED-Bench, TextCaps, NoCaps (#12)

    * Change coco from print to logger

    * Add llava loglikelihood

    * Add Nocaps support

    * Fix pass through function

    * Add textcaps support

    * Fix textcaps eval image_id

    * Add seedbench support

    * Add seedbench ppl evaluation

    * black lint

commit 4c3c2c63a681f29c537c2467957de1a90568748d
Author: Li Bo <drluodian@gmail.com>
Date:   Tue Jan 23 19:17:12 2024 +0800

    [Datasets] Added POPE and Aligned. (#11)

    * Update generation_kwargs in pope.yaml

    * Update pope_doc_to_text function

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* Add output path file naming convention (#16)

Update datetime format in get_datetime_str() function

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* [Datasets] modify NoCaps data path and prompts (#17)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* [Dataset] Add flickr30k (#18)

* Add flickr30k support

* Black lint

* Align prompt with NoCaps

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add model specific prompt and gen kwargs in sqa (#19)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* Dev/add chartqa and ai2d (#23)

* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* Add 'submissions/' directory to .gitignore

* Add Python setup and Black version installation workflow
Refactor ContextSampler class in samplers.py
Remove unnecessary line in DecontaminationFilter class
Update dependencies in pyproject.toml

* Refactor code in ContextSampler class

---------

Co-authored-by: Bo Li <drluodian@gmail.com>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 520c7a2cafe60810aca79df814ce6829d4576032
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit a2cc9303dc72e4d53983bb56e54a32e977c3e270
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit 35e87e7c7a480d005abf607c2527a35457d92311
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 89755323596b85208ed33aa88c296604a39af6eb
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0b0d30dfb247c5f0b7b68398b9e9fcde74cf7fa2
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit e273f9cbd91540df86bdbc652bff88a847bd0d2d
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit e84126aaaf8a07bd371a0571a914ccbcd3697f20
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 110deab53dc1a2fd349b1872cd261b69074c5fa8
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 0fa3e0c40075997ea80ed976bdee9615f17d3ece
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit 2aaca579120def99860f90054233f3358950fa66
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '817eb057bcb61226b33d3ac3c8def01c36c90f96'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit f253968ad703f682a29317bdd51ec6c1fd7c5465
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* [Datasets] Changes for Flickr30K and NoCaps, also merged Peiyuan's Model Specific Prompt. (#20)

* Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit 63739fc6fa0a462d807ae81de0db0173102de584
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit edcc752f97ea3845cefad56624e5d2855066f680
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa

commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:36:46 2024 +0800

    add model specific prompt and gen kwargs

commit 5f55126484a7c9325db586d26cf2052538222804
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:56:51 2024 +0800

    black

commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
Author: jzhang38 <a1286225768@gmail.com>
Date:   Wed Jan 24 13:55:43 2024 +0800

    add mmme

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Fix cli itself can not run with config file

* Fix bug in login functionality

Refactor code for better performance

Add new feature for user authentication

Update UI layout for improved user experience

Fix typo in variable name

Optimize database queries for faster response time

Add error handling for edge cases

Update dependencies to latest versions

Remove unused code

Improve code readability and maintainability

* Refactor get_task_dict function to handle nested groups

* Add submission file for coco, flickr30k, nocaps, and textcaps tasks

* Remove unused files and update task configuration

* Fix tasks issue for nocaps, refcoco/+/g

* Fix file path and raise error if config file does not exist

* Exclude train in refcoco/+/g config

* Solve doc_iterator_for_counting crashing issue

* Black lint

* Refactor code to improve performance and readability

* Squashed commit of the following:

commit 0df825c9e72a06e6acb4c0bd43c2083ffe8b74c0
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:03:57 2024 +0800

    change okvqa yaml

commit b9d9f9896993033b92346e9f47420c55b866c715
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:55:40 2024 +0800

    change yaml

commit 4256bef410e4c8d8761e0cd0d79ac5e57b97651b
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:42:43 2024 +0800

    add okvqa task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Squashed commit of the following:

commit 0c8a3919885b8fe2880bb2892f7a619d060012d1
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:06:02 2024 +0800

    change ocr reference

commit d2bc7c92ac61179b8c4031e11bc31970355252f6
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 01:05:46 2024 +0800

    revert example_eval

commit c78fa29cd0d161641ee05db57bd39314b998c8c7
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Fri Jan 26 00:17:28 2024 +0800

    edit vizwiz utils

commit 397f0906968fd8ba04b883469b96217737c43e09
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:49:47 2024 +0800

    reorganize __init__

commit 52a7ea6c7599adeec2ac2787f500e215ce47cf79
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 23:46:20 2024 +0800

    minor fixes

commit f706b2aaf9b288c582611191a1841b58feaeb741
Author: JvThunder <joshuaadrianc@gmail.com>
Date:   Thu Jan 25 17:41:03 2024 +0800

    add vizwizvqa eval task

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

* Refactor mathvista.yaml and utils.py

* Add gpt_eval_score to mathvista_process_results

* Refactor mathvista_aggregate_results to return average accuracy score

* Fix refcoco evaluation error

* Fix evaluation problem for refcoco+/g

* Refactor mathvista.yaml and mathvista_evals.py

* Add dependencies and update YAML files

* Refactor mmbench_en/utils.py to save test results to separate Excel file

* Fix caption task prompt

* Add group field to mmbench_en_test and mmbench_en_val yaml files

* Delete mmbench_en_val.yaml file

* Update mmbench_cn.yaml and mmbench_cn_test.yaml

* Update mmbench_cn_val.yaml and utils.py

* Remove unused fields in mmbench_cn_cc_process_results function

* Update aggregation function for mmbench_en_dev.yaml

* Fix capitalization of L2-category key in utils.py

* Fix variable name in mmbench_process_results function

* Delete mmbench_cn_val.yaml file

* Update mathvista_test.yaml and mathvista_testmini.yaml

* Fix warnings and update mathvista.yaml

* Remove system message from MathVistaEvaluator

* Update GPT model version in MathVistaEvaluator constructor

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* change vizwiz to test set

* Add split flag to mathvista_aggregate_results function

* Add higher_is_better: false to gpt_eval_info metric in d170_cn, d170_en, dc100_en, and dc200_cn yaml files

* Add download configuration for dataset

* Update GQA_RAW_IMAGE_DATASET path in utils.py

* add datasets

* Update gpt_eval_model_name in mathvista.yaml

* Merge commit '0d620f98b49f8204d02633f209eedd5d8b7a1f7c'

* Update pyproject.toml with dependencies and URLs

* Squashed commit of the following:

commit 8b600f55b6cf5627504c407871539db59f6085a3
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Sat Jan 27 13:56:37 2024 +0800

    Dev/add chartqa and ai2d (#23)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * Add 'submissions/' directory to .gitignore

    * Add Python setup and Black version installation workflow
    Refactor ContextSampler class in samplers.py
    Remove unnecessary line in DecontaminationFilter class
    Update dependencies in pyproject.toml

    * Refactor code in ContextSampler class

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>

* Refactor image processing and submission file path

* Refactor directory creation logic in cli_evaluate_single function

* Update dataset path and test split in vqav2.yaml

* Remove "total" column from cap_details_columns DataFrame

* Add retry logic for dataset download

* Add 'tenacity' to dependencies in pyproject.toml

* Refactor code in ContextSampler class

* Update Black version and configuration, and improve code readability in ContextSampler

* Update Black version and line length

---------

Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>
Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Squashed commit of the following:

commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit eeb2b9827502f044ef67d8440f53124baf219ba3
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 1ce9f0b37e4bc5e6ff5fbfcd23fd339eb14974ae
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit e12b3bb41ed4f51540cfac84e5e96d15777540c4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 42c56f82bc4ccae12e19e76d09d7e525ca9ef2f4
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit aed08303fe87808986d206540a0c0ee6d8764988
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit a105386613c443d9e740c89725cbd1281bbdfef6
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 21c8119e377760f44c769bed2528d863a8f4333b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 0ccb2629c2aacdb297b7cf0c9c2bcfa386bb7582
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 5365e13e93c702a1e0e259ee6a08d6a427d72470
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 6773348c807bcfa1b09ceffc90c75e15cad908f7
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit 31140f9c87dea89ca94c94bc850e3a8d43e5f8b4
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit df1bad47f6ed13f94848d2bee29b28e00c2384b2
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit b13a805623dfd9d826ddd440e1b5ecde773fbb12
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 06383aa4a5ff59db52fc8d584f3086efd88b7e74
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 7a71fd6022ee5985100dda38b94956595cec77a5
Merge: 22c3adf 4d11dce
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e' into dev/bli_add_datasets

commit 6870cba13cb54976480c1d5e8d97602c246f881b
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit 2626383d99b5eac59d531ca0f293df960570c524
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit d4e8e2552d40752bfdc5bbf4cd962c1798096258
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit 4bf0504fabc3b62f356c467b2fd1119083d27313
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

        * Update dataset paths and improve user prompts

    commit 520c7a2cafe60810aca79df814ce6829d4576032
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit 3a633240327c078fa4f5a75dbd38ad5bc0d468dd
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit b40d522b6bf483ebdfbf5facd4573de0cf8a93f6
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit 5bf643f73d06f1e540897b753450352bb92fd9ec
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit 95f110f0eef5196205bc501367e3642c57cc7a17
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

commit c844ae49b18c1334711832208b0359c9439fe1c0
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit 842fbc6f2da7d9a118adf9ec27c3d8542d74168e
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit 4bf0504fabc3b62f356c467b2fd1119083d27313
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

    * Update dataset paths and improve user prompts

commit f0446227f0dd93651e9d6c06254bbf5212ede2dd
Merge: c6370bf a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit 1e1f6cfccba758dc606fa4217102518fab73c936
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 966933754b9e5179995b3ab41d746603e13e75c6
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

commit 767f7e2cae60cf67ec5878234d84321395a3ed15
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vqav2 (#25)

* Update tqdm progress bar position

* Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

* Squashed commit of the following:

commit 18e984cfe173390843c73048a931baa17800f918
Author: Zhang Peiyuan <a1286225768@gmail.com>
Date:   Thu Jan 25 17:08:25 2024 +0800

    add model specific prompt and gen kwargs in sqa (#19)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* remove useless output file

* Update dataset path in vqav2.yaml

* Squashed commit of the following:

commit 75bb7043ea5a533ab6351fc0f5ab055e86106423
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:56:45 2024 +0800

    Black lint

commit 6635a8aa34cfbd3c7a4afb6fcd214a7283ce01cb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:47 2024 +0800

    Solve doc_iterator_for_counting crashing issue

commit 080f42b88ea8acacd527b8d67b84ba1d7d135b03
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 09:55:13 2024 +0800

    Exclude train in refcoco/+/g config

commit 4da84069c08c95e49e8ab0e64a1e103ff7ac8730
Merge: 6a1ae69 697a438
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:13 2024 +0000

    Merge branch 'dev/bli_add_datasets' of https://github.com/EvolvingLMMs-Lab/lmms-eval into dev/bli_add_datasets

commit 6a1ae69923d79ae32a001edac38206b605274ec3
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 17:17:06 2024 +0000

    Fix file path and raise error if config file does not exist

commit 697a4387827ceeec3e393237dd1baa217c714c88
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Fri Jan 26 00:47:24 2024 +0800

    Fix tasks issue for nocaps, refcoco/+/g

commit 47e40437126d39a5f062c9a33b4de426c1a29804
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 10:09:43 2024 +0000

    Remove unused files and update task configuration

commit 9976eb8e9ed03c8613725fdbd822ef5d8cf70e47
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:43:56 2024 +0000

    Add submission file for coco, flickr30k, nocaps, and textcaps tasks

commit 95f97a69faa6129676e89eee14960fcfe2076b7c
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:32:54 2024 +0000

    Refactor get_task_dict function to handle nested groups

commit 3b79ee842b2488714baf92ab34528ef77989d392
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:13:46 2024 +0000

    Fix bug in login functionality

    Refactor code for better performance

    Add new feature for user authentication

    Update UI layout for improved user experience

    Fix typo in variable name

    Optimize database queries for faster response time

    Add error handling for edge cases

    Update dependencies to latest versions

    Remove unused code

    Improve code readability and maintainability

commit f5c353f2ce93a2d96add4312b695b57432f68cbb
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 17:07:20 2024 +0800

    Fix cli itself can not run with config file

commit 9a68fec37be74cfe8d4a73390bc83edee147ae24
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:09:04 2024 +0000

    Squashed commit of the following:

    commit 18e984cfe173390843c73048a931baa17800f918
    Author: Zhang Peiyuan <a1286225768@gmail.com>
    Date:   Thu Jan 25 17:08:25 2024 +0800

        add model specific prompt and gen kwargs in sqa (#19)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

commit 93f847c5851fd246716367935d6b807b17d53949
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 09:02:57 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit fa4ad4404e26d8924f55208746dbb9143b464011
Merge: 22c3adf 4d11dce
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:43:15 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384' into dev/bli_add_datasets

commit 22c3adfd0645acc23b6d7c06b487f4ffd47666c4
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:52 2024 +0000

    Squashed commit of the following:

    commit 4d48d0c9b88e62dfebe05ec909b7f1851e9cd75d
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:59:12 2024 +0800

        refactor multi model code

    commit 4a4b7bec200c72332b61a0c277cd8f8a34e4f721
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:51:16 2024 +0800

        print table at the end

    commit 63739fc6fa0a462d807ae81de0db0173102de584
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 11:20:59 2024 +0800

        add yaml config to support multi-model eval

    commit edcc752f97ea3845cefad56624e5d2855066f680
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:39:42 2024 +0800

        black

    commit 41f4b63d3a6e83babe92bac32a7432a8ef740bb5
    Merge: 7e8b57d 4d11dce
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:37:57 2024 +0800

        resolve conflicts in sqa

    commit 7e8b57d3bcc21d2a049d3abbc8a8201631641db4
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Thu Jan 25 10:36:46 2024 +0800

        add model specific prompt and gen kwargs

    commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
    Author: kcz358 <92624596+kcz358@users.noreply.github.com>
    Date:   Thu Jan 25 09:47:31 2024 +0800

        [Dataset] Add flickr30k (#18)

        * Add flickr30k support

        * Black lint

        * Align prompt with NoCaps

    commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
    Author: Li Bo <drluodian@gmail.com>
    Date:   Wed Jan 24 22:10:14 2024 +0800

        [Datasets] modify NoCaps data path and prompts (#17)

        * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

        * Update dataset paths and improve user prompts

    commit 5f55126484a7c9325db586d26cf2052538222804
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:56:51 2024 +0800

        black

    commit aa6f8853cf82384fb3b15306fec4769212fbc5ab
    Author: jzhang38 <a1286225768@gmail.com>
    Date:   Wed Jan 24 13:55:43 2024 +0800

        add mmme

commit 4c712336b6f7438e717a865910bb241e413a4688
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 08:38:11 2024 +0000

    Add coco_val and coco_test tasks to coco.yaml

commit b5547126c855927fd4dc8384211e4aceee40870f
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 04:58:28 2024 +0000

    Update dataset_path in flickr30k.yaml

commit f786f61e2559f082072f21aa9030e2080ddaf809
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:12:25 2024 +0000

    Merge commit 'ecb47d73d6e000b472be6c5c0cdc9413c7734384'

commit 796a011000e0df90f66f8e80cb34dc2318ae9ac8
Author: Bo Li <drluodian@gmail.com>
Date:   Thu Jan 25 02:10:18 2024 +0000

    Add submission folder and update file paths for storing prediction results

commit ecb47d73d6e000b472be6c5c0cdc9413c7734384
Author: kcz358 <92624596+kcz358@users.noreply.github.com>
Date:   Thu Jan 25 09:47:31 2024 +0800

    [Dataset] Add flickr30k (#18)

    * Add flickr30k support

    * Black lint

    * Align prompt with NoCaps

commit dc23f4b42b1dd60b41904d7ddbee1412d6851077
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:14 2024 +0800

    [Datasets] modify NoCaps data path and prompts (#17)

    * Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

    * Update dataset paths and improve user prompts

commit 118744c63eb2d9724571d85fbbd85fcc9ad05b59
Merge: c6370bf a0b87f5
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 22:10:07 2024 +0800

    Merge branch 'main' into dev/bli_add_datasets

commit c6370bff65903681f00cf3d07111d8e15a57b619
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 14:08:06 2024 +0000

    Update dataset paths and improve user prompts

commit 810daf458fa94cb3ec2b4a6cc5ecb1e656a24002
Author: Bo Li <drluodian@gmail.com>
Date:   Wed Jan 24 11:52:33 2024 +0000

    Merge commit '95ef3ea519cbd772924f9a6afa5394979eb00432'

commit 95ef3ea519cbd772924f9a6afa5394979eb00432
Author: Li Bo <drluodian@gmail.com>
Date:   Wed Jan 24 19:51:34 2024 +0800

    Add output path file naming convention (#16)

    Update datetime format in get_datetime_str() function

* Fix bug in login functionality

* create vqav2_val

* Update vqav2_test.yaml

* Update vqav2_test.yaml

* Update vqav2_val.yaml

---------

Co-authored-by: Li Bo <drluodian@gmail.com>

* vizwiz dataset (#24)

* Merge commit '767f7e2cae60cf67ec5878234d84321395a3ed15'

* Update dataset paths and improve user prompts

* Add submission folder and update file paths for storing prediction results

* Merge commit '842fbc6f2da7d9a118adf9ec27c3d8542d74168e'

* Update dataset_path in flickr30k.yaml

* Add coco_val and coco_test tasks to coco.yaml

* Squashed commit of the following:

commit 542a34dc5721ecdff6c5c68b0568692ad3a17149
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:59:12 2024 +0800

    refactor multi model code

commit 3c397b8af85192b1821b3b6a0d8b8df746b5347c
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:51:16 2024 +0800

    print table at the end

commit e7b8a2d1f1e7337f02298efafd2ebf81543f4f85
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 11:20:59 2024 +0800

    add yaml config to support multi-model eval

commit 2626383d99b5eac59d531ca0f293df960570c524
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:39:42 2024 +0800

    black

commit 8349935fe145e33af0007ad4fb0d71fd925be7a0
Merge: 7e8b57d 4d11dce
Author: jzhang38 <a1286225768@gmail.com>
Date:   Thu Jan 25 10:37:57 2024 +0800

    resolve conflicts in sqa
…