
SingleStream scenario crashes for Resnet50 #2100

Open
nintendoit opened this issue Feb 7, 2025 · 2 comments
@nintendoit

Run command used:

mlcr run-mlperf,inference,_r5.0-dev \
   --model=resnet50 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=edge \
   --scenario=SingleStream \
   --execution_mode=valid \
   --device=cpu \
   --quiet

Crash log:

./run_local.sh onnxruntime resnet50 cpu --scenario SingleStream --threads 4 --user_conf '/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/33db40839ea34f5a94ac82e56583382a.conf' --use_preprocessed_dataset --cache_dir /home/mlcuser/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_be6302c3 --dataset-list /home/mlcuser/MLC/repos/local/cache/extract-file_1b015636/val.txt 2>&1 | tee '/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_2bad8c5e/valid_results/6dc4971019f4-reference-cpu-onnxruntime-v1.20.1-default_config/resnet50/singlestream/performance/run_1/console.out'; echo ${PIPESTATUS[0]} > exitstatus
python3 python/main.py --profile resnet50-onnxruntime --model "/home/mlcuser/MLC/repos/local/cache/download-file_90fb1a82/resnet50_v1.onnx" --dataset-path /home/mlcuser/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_be6302c3 --output "/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_2bad8c5e/valid_results/6dc4971019f4-reference-cpu-onnxruntime-v1.20.1-default_config/resnet50/singlestream/performance/run_1" --scenario SingleStream --threads 4 --user_conf /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/33db40839ea34f5a94ac82e56583382a.conf --use_preprocessed_dataset --cache_dir /home/mlcuser/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_be6302c3 --dataset-list /home/mlcuser/MLC/repos/local/cache/extract-file_1b015636/val.txt
INFO:main:Namespace(dataset='imagenet', dataset_path='/home/mlcuser/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_be6302c3', dataset_list='/home/mlcuser/MLC/repos/local/cache/extract-file_1b015636/val.txt', data_format=None, profile='resnet50-onnxruntime', scenario='SingleStream', max_batchsize=32, model='/home/mlcuser/MLC/repos/local/cache/download-file_90fb1a82/resnet50_v1.onnx', output='/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_2bad8c5e/valid_results/6dc4971019f4-reference-cpu-onnxruntime-v1.20.1-default_config/resnet50/singlestream/performance/run_1', inputs=None, outputs=['ArgMax:0'], backend='onnxruntime', device=None, model_name='resnet50', threads=4, qps=None, cache=0, cache_dir='/home/mlcuser/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_be6302c3', preprocessed_dir=None, use_preprocessed_dataset=True, accuracy=False, find_peak_performance=False, debug=False, user_conf='/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/33db40839ea34f5a94ac82e56583382a.conf', audit_conf='audit.config', time=None, count=None, performance_sample_count=None, max_latency=None, samples_per_query=8)
INFO:imagenet:Loading 50000 preprocessed images using 4 threads
INFO:imagenet:loaded 50000 images, cache=0, already_preprocessed=True, took=1.7sec
INFO:main:starting TestScenario.SingleStream
./run_local.sh: line 30: 4224 Killed python3 python/main.py --profile resnet50-onnxruntime --model "/home/mlcuser/MLC/repos/local/cache/download-file_90fb1a82/resnet50_v1.onnx" --dataset-path /home/mlcuser/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_be6302c3 --output "/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_2bad8c5e/valid_results/6dc4971019f4-reference-cpu-onnxruntime-v1.20.1-default_config/resnet50/singlestream/performance/run_1" --scenario SingleStream --threads 4 --user_conf /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/33db40839ea34f5a94ac82e56583382a.conf --use_preprocessed_dataset --cache_dir /home/mlcuser/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_be6302c3 --dataset-list /home/mlcuser/MLC/repos/local/cache/extract-file_1b015636/val.txt
Traceback (most recent call last):
  File "/home/mlcuser/venv/mlc/bin/mlcr", line 8, in <module>
    sys.exit(mlcr())
  File "/home/mlcuser/venv/mlc/lib/python3.10/site-packages/mlc/main.py", line 1354, in mlcr
    main()
  File "/home/mlcuser/venv/mlc/lib/python3.10/site-packages/mlc/main.py", line 1441, in main
    res = method(run_args)
  ...
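The bare `Killed` above, with no Python traceback from the benchmark process itself, is the classic signature of the kernel OOM killer. A quick way to confirm (a sketch; reading the kernel log may require root depending on `kernel.dmesg_restrict`):

```shell
# Check the kernel log for an OOM-killer record naming the benchmark PID (4224 here).
dmesg -T | grep -iE 'out of memory|killed process'
# On systemd machines, journalctl over the kernel log works too:
journalctl -k --no-pager | grep -i 'oom'
```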

Screenshots from htop and the system monitor during the run:

htop

System Monitor (KDE)

@arjunsuresh
Contributor

Tried this and it is working fine here. It takes about 11 GB of RAM, which is probably why it is crashing on an 8 GB system. If you increase the swap space to 8 GB, it should run fine.
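A minimal sketch of adding an 8 GB swap file on Linux (the path `/swapfile` and the size are illustrative; all of these commands need root):

```shell
# Create and enable an 8 GB swap file.
sudo fallocate -l 8G /swapfile   # use dd if=/dev/zero of=/swapfile if the filesystem lacks fallocate support
sudo chmod 600 /swapfile         # swap files must not be readable by other users
sudo mkswap /swapfile            # format the file as swap space
sudo swapon /swapfile            # activate it immediately
swapon --show                    # verify the new swap area is listed
```

To keep the swap across reboots, add `/swapfile none swap sw 0 0` to `/etc/fstab`.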

arjun_spr-reference-cpu-onnxruntime-v1.20.1-default_config
+----------+--------------+----------+------------+-----------------+
|  Model   |   Scenario   | Accuracy | Throughput | Latency (in ms) |
+----------+--------------+----------+------------+-----------------+
| resnet50 | SingleStream |   76.0   |  154.488   |      6.473      |
+----------+--------------+----------+------------+-----------------+

@arjunsuresh
Contributor

Can you use --target_latency=10?
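For reference, that would be the original invocation with the flag appended (assuming `--target_latency` takes the target SingleStream latency in milliseconds, as the suggestion implies):

```shell
mlcr run-mlperf,inference,_r5.0-dev \
   --model=resnet50 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=edge \
   --scenario=SingleStream \
   --execution_mode=valid \
   --device=cpu \
   --target_latency=10 \
   --quiet
```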
