Setting exclude_input_in_output to true causes a core dump when the model generates nothing
Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid: 29314) ====
0 0x0000000000042520 __sigaction() ???:0
1 0x000000000006751c tensorrt_llm::batch_manager::GptManager::returnCompletedRequests() :0
2 0x000000000006ce1e tensorrt_llm::batch_manager::GptManager::decoupled_execution_loop() :0
3 0x00000000000dc253 std::error_code::default_error_condition() ???:0
4 0x0000000000094ac3 pthread_condattr_setpshared() ???:0
5 0x0000000000126660 __xmknodat() ???:0
=================================
[multi-test-0:29284] *** Process received signal ***
[multi-test-0:29284] Signal: Segmentation fault (11)
[multi-test-0:29284] Signal code: (-6)
[multi-test-0:29284] Failing at address: 0x7264
[multi-test-0:29284] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fb5b726f520]
[multi-test-0:29284] [ 1] /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x6751c)[0x7fb5434bb51c]
[multi-test-0:29284] [ 2] /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x6ce1e)[0x7fb5434c0e1e]
[multi-test-0:29284] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253)[0x7fb5b7531253]
[multi-test-0:29284] [ 4] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7fb5b72c1ac3]
[multi-test-0:29284] [ 5] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x126660)[0x7fb5b7353660]
[multi-test-0:29284] *** End of error message ***
0# 0x000056351A572C2D in /opt/tritonserver/bin/tritonserver
1# 0x00007F731366F520 in /usr/lib/x86_64-linux-gnu/libc.so.6
2# 0x00007F729F4BB51C in /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so
3# 0x00007F729F4C0E1E in /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so
4# 0x00007F7313931253 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
5# 0x00007F73136C1AC3 in /usr/lib/x86_64-linux-gnu/libc.so.6
6# 0x00007F7313753660 in /usr/lib/x86_64-linux-gnu/libc.so.6
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node multi-test-0 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
With exclude_input_in_output set to false, it works fine without the ensemble.
In ensemble mode with exclude_input_in_output set to false, it returns:
{"error":"attempt to access non-existing array index '0'"}
With exclude_input_in_output set to true, it core dumps in both ensemble mode and with the standalone tensorrt_llm model.
tritonserver built from master on 2023/12/18
| server_version | 2.42.0dev
tensorrt-llm built from the latest master
QWEN model, tp=1. Tested with python run.py, it works fine without a core dump, so this seems to be a trtllm_backend problem.
Can you follow this template and provide concrete steps to reproduce your issue? That makes it easier for our engineers to investigate and help.
I encountered this problem again using the image compiled from the main branch on 20240126; the problem is exactly the same. In ensemble mode with exclude_input_in_output set to false, when the model generates nothing, it returns:
{"error":"attempt to access non-existing array index '0'"}
I investigated this problem and I do not think this error comes from the postprocessing module in ensemble mode.
The error occurs only when the model generates nothing.
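To make the edge case concrete, here is a minimal Python sketch of the "model generates nothing" situation. It is not the backend's actual code: the helper names `strip_input_tokens` and `first_token_or_none` are hypothetical, and it assumes the raw output is the prompt tokens followed by any generated tokens. It only shows why an empty generation needs explicit handling before anything indexes element 0.

```python
from typing import List, Optional


def strip_input_tokens(
    output_ids: List[int],
    input_length: int,
    exclude_input_in_output: bool,
) -> List[int]:
    """Hypothetical helper: drop the prompt tokens from the raw output.

    Assumes output_ids is the prompt followed by the generated tokens.
    """
    if not exclude_input_in_output:
        return list(output_ids)
    # Slicing past the end yields an empty list instead of raising,
    # so a generation of zero tokens is returned as [].
    return list(output_ids[max(input_length, 0):])


def first_token_or_none(token_ids: List[int]) -> Optional[int]:
    """Hypothetical helper: read index 0 only when the list is non-empty."""
    return token_ids[0] if token_ids else None


# Prompt of 3 tokens and nothing generated: the case reported above.
generated = strip_input_tokens([11, 12, 13], 3, exclude_input_in_output=True)
print(generated, first_token_or_none(generated))
```

In this sketch an empty generation becomes an empty list and a `None` first token, rather than an out-of-range access.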