FindVehicle and VehicleFinder: A NER dataset for a text-image cross-modal vehicle retrieval system

🔥🔥🔥FindVehicle: The 🔥first🔥 NER dataset in the traffic domain for natural language-based vehicle retrieval

🎉🎉🎉VehicleFinder: A text-image cross-modal vehicle retrieval system (link)


FindVehicle

Entity Types of FindVehicle

Dataset Download

Data Link 1: Baidu Cloud Disk (password: xp9o)

Data Link 2: Google Drive

Dataset Directory

FindVehicle is provided in two data formats: CoNLL-style and jsonlines.

CoNLL-style format

  • FindVehicle_train.txt -> Train set, CoNLL-style annotation, NER Label
  • FindVehicle_test.txt -> Test set, CoNLL-style annotation, NER Label

CoNLL-style Example (Flat Entity)

I O
am O
looking O
for O
a O
white B-vehicle_color
sedan B-vehicle_type
. O

CoNLL-style Example (Overlapped Entity)

I O
am O
looking O
for O
a O
white B-vehicle_color
Audi B-vehicle_brand
Q7 B-vehicle_model
. O

I O
am O
looking O
for O
a O
white B-vehicle_color
Audi B-vehicle_type-suv
Q7 E-vehicle_type-suv
. O
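The CoNLL-style files can be read with a few lines of Python. The sketch below is only an illustration: it assumes each non-empty line holds a token and its tag separated by whitespace, and that sentences are separated by blank lines, as in the examples above. The helper name `parse_conll` is hypothetical and not part of the dataset release.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]


def parse_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group 'token tag' lines into sentences separated by blank lines.

    Assumption: the tag is the last whitespace-separated field on each line.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        token, tag = line.rsplit(maxsplit=1)
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


# The flat-entity example from above, used as inline sample data.
sample = """I O
am O
looking O
for O
a O
white B-vehicle_color
sedan B-vehicle_type
. O
"""

sentences = parse_conll(sample.splitlines())
print(sentences[0][5])  # ('white', 'B-vehicle_color')
```

To read a downloaded file instead of the inline sample, pass an open file handle, for example `parse_conll(open("FindVehicle_train.txt", encoding="utf-8"))`.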

jsonlines format

  • FindVehicle_train.jsonl -> Train set, jsonlines annotation, NER Label, RE Label
  • FindVehicle_test.jsonl -> Test set, jsonlines annotation, NER Label, RE Label

Install the jsonlines package to read these files:

pip install jsonlines
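Once the package is installed, loading a file might look like the sketch below. It assumes the `.jsonl` files store one JSON object per line; the helper name `read_records` is hypothetical. It falls back to the standard-library `json` module when `jsonlines` is not available.

```python
import json

try:
    import jsonlines  # third-party package installed with pip
except ImportError:
    jsonlines = None


def read_records(path):
    """Return all records from a jsonlines file as a list of dicts."""
    if jsonlines is not None:
        # jsonlines.open returns a reader that yields one object per line.
        with jsonlines.open(path) as reader:
            return list(reader)
    # Fallback: parse each non-empty line with the standard library.
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `records = read_records("FindVehicle_train.jsonl")` would give a list in which each item has the `id`, `data`, `ner_label`, and `re_label` keys.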

jsonlines Example

{
"id": 41628,
"data": "Let the clever boy help find out the Silver XPeng G3 and lemon yellow Chevrolet Trailblazer in the Bottom Left of the image that driven left .",
"ner_label": [
["vehicle_color", 37, 43, "Silver", 8, 9, ["Silver"]], ### label, char span start index, char span end index, char span check, token span start index, token span end index, token span check
["vehicle_brand", 44, 49, "XPeng", 9, 10, ["XPeng"]],
["vehicle_model", 50, 52, "G3", 10, 11, ["G3"]],
["vehicle_color", 57, 69, "lemon yellow", 12, 14, ["lemon", "yellow"]],
["vehicle_brand", 70, 79, "Chevrolet", 14, 15, ["Chevrolet"]],
["vehicle_model", 80, 91, "Trailblazer", 15, 16, ["Trailblazer"]],
["vehicle_location", 99, 110, "Bottom Left", 18, 20, ["Bottom", "Left"]],
["vehicle_orientation", 99, 105, "Bottom", 18, 19, ["Bottom"]]],
"re_label": [[0, 1, 2, 6, 7], [3, 4, 5, 6, 7]] ### ner_label indexes 0,1,2,6,7 refer to one target; indexes 3,4,5,6,7 refer to another target.
}
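As an illustration of how these fields fit together, the following sketch checks the spans of the example record and groups its entities by target. It is a sketch under stated assumptions: end indexes are treated as exclusive and tokens as the result of splitting `data` on single spaces, which is what the example record suggests. The helper names `check_spans` and `group_targets` are hypothetical.

```python
record = {
    "id": 41628,
    "data": "Let the clever boy help find out the Silver XPeng G3 and lemon yellow "
            "Chevrolet Trailblazer in the Bottom Left of the image that driven left .",
    "ner_label": [
        ["vehicle_color", 37, 43, "Silver", 8, 9, ["Silver"]],
        ["vehicle_brand", 44, 49, "XPeng", 9, 10, ["XPeng"]],
        ["vehicle_model", 50, 52, "G3", 10, 11, ["G3"]],
        ["vehicle_color", 57, 69, "lemon yellow", 12, 14, ["lemon", "yellow"]],
        ["vehicle_brand", 70, 79, "Chevrolet", 14, 15, ["Chevrolet"]],
        ["vehicle_model", 80, 91, "Trailblazer", 15, 16, ["Trailblazer"]],
        ["vehicle_location", 99, 110, "Bottom Left", 18, 20, ["Bottom", "Left"]],
        ["vehicle_orientation", 99, 105, "Bottom", 18, 19, ["Bottom"]],
    ],
    "re_label": [[0, 1, 2, 6, 7], [3, 4, 5, 6, 7]],
}


def check_spans(rec):
    """Return True if every char span and token span matches its check field."""
    text = rec["data"]
    tokens = text.split(" ")  # assumption: tokens are separated by single spaces
    for _label, c0, c1, surface, t0, t1, span_tokens in rec["ner_label"]:
        if text[c0:c1] != surface or tokens[t0:t1] != span_tokens:
            return False
    return True


def group_targets(rec):
    """Resolve each re_label group into (label, surface text) pairs."""
    entities = rec["ner_label"]
    return [[(entities[i][0], entities[i][3]) for i in group]
            for group in rec["re_label"]]


print(check_spans(record))  # True
for target in group_targets(record):
    print(target)
```

Each printed list describes one target vehicle: the first combines Silver, XPeng, and G3 with the shared location and orientation entities, and the second does the same for the lemon yellow Chevrolet Trailblazer.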

Contributors

  • Runwei Guan [email], University of Liverpool, XJTLU-JITRI, Institute of Deep Perception Technology
  • Feifan Chen [email], University of Liverpool, XJTLU
  • Rongsheng Hu [email], Jiangnan University
  • Shanliang Yao [email], University of Liverpool, XJTLU-JITRI, Institute of Deep Perception Technology
  • Zhou Yuan [email], University of Bristol
  • Sihao Dai [email], University of Southampton
  • Wenjie Zhou [email], Jiangyin Baoneng Precision New Material Co.,Ltd

Citation

@article{guan2024findvehicle,
  title={FindVehicle and VehicleFinder: a NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system},
  author={Guan, Runwei and Man, Ka Lok and Chen, Feifan and Yao, Shanliang and Hu, Rongsheng and Zhu, Xiaohui and Smith, Jeremy and Lim, Eng Gee and Yue, Yutao},
  journal={Multimedia Tools and Applications},
  volume={83},
  number={8},
  pages={24841--24874},
  year={2024},
  publisher={Springer}
}

Note: If you encounter any problems, please report them in Issues.