Commit
[Dataset] fix hallusion benchmark, add saving logic inside aggregate function (EvolvingLMMs-Lab#35)

* add fuyu
* Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'
* Squashed commit of the following:

  commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
  Author: kcz358 <92624596+kcz358@users.noreply.github.com>
  Date: Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

  commit 7b7f636
  Author: Pu Fanyi <FPU001@e.ntu.edu.sg>
  Date: Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file
    * Remove docvqa.yaml and update vizwizvqa.yaml
    * lint
    * Add dataset_kwargs to vizwizvqa.yaml
    * textvqa (EvolvingLMMs-Lab#27)
    * Update textvqa.yaml and utils.py
    * Fix YAML formatting in textvqa.yaml and remove unused files
    * remove useless metric
    * add textvqa val & test
    * Update progress bar description in evaluator.py
    * Update submission file names in VizWizVQA tasks
    * Update output path to include log samples suffix
    * Update submission file paths in OKVQA and VizWizVQA tasks
    * Refactor llava-in-the-wild.yaml and utils.py
    * Update metric for llava evaluation
    * Refactor logging message in Task class
    * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'
    * Fix formatting issues and add progress bar closing statements
    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml
    * Update tqdm progress bar in OtterHD model
    * Squashed commit of the following:

      commit c09b621195878300417315a97efdec25e67dd7f5
      Author: kcz358 <92624596+kcz358@users.noreply.github.com>
      Date: Sun Jan 28 09:46:19 2024 +0800

        Black lint

      commit 864a1aba26388276b7e57717b89520fcc77b3f62
      Merge: ab898e4 ad8d9da
      Author: kcz358 <92624596+kcz358@users.noreply.github.com>
      Date: Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

      commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
      Author: kcz358 <92624596+kcz358@users.noreply.github.com>
      Date: Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

      commit c0ea54d49cb65b747d7e8fccac75838acabe05db
      Author: kcz358 <92624596+kcz358@users.noreply.github.com>
      Date: Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

      commit ad8d9da
      Author: Zhang Peiyuan <a1286225768@gmail.com>
      Date: Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme
        * black
        * add model specific prompt and gen kwargs
        * black
        * add yaml config to support multi-model eval
        * print table at the end
        * refactor multi model code
        * add chartqa
        * black
        * add ai2d
        * black
        * update chartqa
        * black
        * update ai2d dataset
        * black
        * add qwenvl
        * add infovqa and docvqa

    * Fix error handling in loading YAML config files
    * Squashed commit of the following:

      commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
      Author: kcz358 <92624596+kcz358@users.noreply.github.com>
      Date: Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

      c09b621 Black lint
      864a1ab Merge branch 'main' into kc/list_tasks_num
      ab898e4 Enable list all tasks num
      c0ea54d Exclude train yaml file in the task list
      ad8d9da Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

    * List task #num sorted
    * Update prompt messages for image-related tasks
    * Delete unused task configuration files
    * Remove coco_train.yaml configuration file
    * Update task name in mmmu.yaml
    * Fix error message for missing tasks
    * Add wandb import and integration
    * Update generation kwargs for LMMS tasks
    * Update lmms_eval MME task configuration and utils
    * Update generation_kwargs in lmms_eval tasks
    * Update doc_to_text function in coco and okvqa tasks
    * Add COCO 2017 version
    * Update task name in coco_test2017.yaml
    * Squashed commit of the following:

      commit 6ee856b
      Author: Zhang Peiyuan <a1286225768@gmail.com>
      Date: Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test
        * black

      commit 4a1183c
      Author: Li Bo <drluodian@gmail.com>
      Date: Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        ---------

        Co-authored-by: Fanyi Pu <FPU001@e.ntu.edu.sg>
        Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

    * Remove scienceqa_img task configuration
    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <drluodian@gmail.com>
    Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

* Update hb_doc_to_text function to remove unnecessary line break
* Add Fuyu model and update OtterHD model
* Refactor model response handling and fix image processing bug
* Refactor flatten method to support only getting the first element
* Add support for specifying timezone in datetime string

  Update flatten method in OtterHD class
  Update get_datetime_str function in utils.py
* Fix condition for checking wandb_args_dict in __main__.py
* Commented out assertions for batch size in Fuyu model
* Add warning message for existing output file
* Fix batch size issue in OtterHD model
* Squashed commit of the following:

  commit 7664839
  Author: Li Bo <drluodian@gmail.com>
  Date: Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench
    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <92624596+kcz358@users.noreply.github.com>

  commit 05487a4
  Author: Li Bo <drluodian@gmail.com>
  Date: Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

  7b7f636 scienceqa for full set (EvolvingLMMs-Lab#32)

* Update API configuration and file paths
* Refactor evaluate_by_chatgpt function in utils.py
* Add hallusion_output_vd_model.json to .gitignore
* Add timeout to API request
* Refactor file path generation and remove unnecessary suffix in log samples output names
* Refactor code and add output path handling
* Update lmms-eval API and add new models and datasets