Docker Deepstack GPU Vision Timeouts

Hello,

I am trying to get the DeepStack GPU container running within Docker on Unraid. DeepStack itself appears to start correctly, but my object detection requests are running into timeouts. The GPU is a GT 1030 and the Unraid version is 6.9.2. Logs are attached below.

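For reference, the container is started roughly like this. This is a sketch of my Unraid template rather than the exact command; the host path, port mapping, and GPU selection are specific to my setup.

# Rough approximation of the Unraid template (on Unraid the runtime flag goes in "Extra Parameters").
# Custom model .pt files are mounted into /modelstore/detection.
docker run -d --name DeepstackGPUOfficial \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e VISION-DETECTION=True \
  -e VISION-FACE=True \
  -e VISION-SCENE=True \
  -v /mnt/user/appdata/deepstack:/modelstore/detection \
  -p 5000:5000 \
  deepquestai/deepstack:gpu
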
Nvidia SMI:

Fri Oct  1 15:28:39 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74       Driver Version: 470.74       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 37%   37C    P0    N/A /  30W |      0MiB /  2001MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

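As a quick sanity check that PyTorch inside the container can see the card at all (this just uses the Python that ships in the image):

docker exec -it DeepstackGPUOfficial python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"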

root@UNRAID:~# sudo docker exec -it DeepstackGPUOfficial /bin/bash
root@ed10552468a7:/app/server# cat …/logs/stderr.txt
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/app/intelligencelayer/shared/detection.py", line 69, in objectdetection
    detector = YOLODetector(model_path, reso, cuda=CUDA_MODE)
  File "/app/intelligencelayer/shared/./process.py", line 36, in __init__
    self.model = attempt_load(model_path, map_location=self.device)
  File "/app/intelligencelayer/shared/./models/experimental.py", line 159, in attempt_load
    torch.load(w, map_location=map_location)["model"].float().fuse().eval()
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 584, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 842, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 834, in persistent_load
    load_tensor(data_type, size, key, _maybe_decode_ascii(location))
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 823, in load_tensor
    loaded_storages[key] = restore_location(storage, location)
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 803, in restore_location
    return default_restore_location(storage, str(map_location))
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 174, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 156, in _cuda_deserialize
    return obj.cuda(device)
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 77, in _cuda
    return new_type(self.size()).copy_(self, non_blocking)
  File "/usr/local/lib/python3.7/dist-packages/torch/cuda/__init__.py", line 480, in _lazy_new
    return super(_CudaBase, cls).__new__(cls, *args, **kwargs)
RuntimeError: CUDA error: out of memory
[the identical object detection traceback repeats twice more as Process-1 restarts]
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/app/intelligencelayer/shared/face.py", line 73, in face
    cuda=SharedOptions.CUDA_MODE,
  File "/app/intelligencelayer/shared/./recognition/process.py", line 31, in __init__
    self.model = self.model.cuda()
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 458, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 376, in _apply
    param_applied = fn(param)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 458, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/app/intelligencelayer/shared/scene.py", line 65, in scenerecognition
    SharedOptions.CUDA_MODE,
  File "/app/intelligencelayer/shared/scene.py", line 38, in __init__
    self.model = self.model.cuda()
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 458, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 376, in _apply
    param_applied = fn(param)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 458, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
[the same object detection traceback appears once more at the end of the log]

DeepStack startup log (version and registered endpoints):

DeepStack: Version 2021.09.01

v1/vision/custom/dark
v1/vision/custom/poolcam
v1/vision/custom/unagi
/v1/vision/face
/v1/vision/face/recognize
/v1/vision/face/register
/v1/vision/face/match
/v1/vision/face/list
/v1/vision/face/delete
/v1/vision/detection
/v1/vision/scene
v1/restore

Timeout Log:

[GIN] 2021/10/01 - 19:46:30 | 500 | 1m0s | 54.86.50.139 | POST "/v1/vision/detection"
[GIN] 2021/10/01 - 19:46:30 | 500 | 1m0s | 54.86.50.139 | POST "/v1/vision/detection"

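For completeness, the requests hitting the 1m0s timeout are ordinary detection calls along these lines (host, port, and the image file are placeholders for my setup):

curl -X POST -F image=@test.jpg http://UNRAID-IP:5000/v1/vision/detection
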
Solved:

This turned out to be a CUDA out-of-memory issue caused by trying to load too many custom models onto the 2 GB card. I would suggest adding some kind of warning in future versions, if possible, to let users know when they hit this wall, because in this situation the container "looks" like it is running when the detection processes have in fact crashed.
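
In case it helps anyone else on a small card, the fix was simply to cut down what the container loads: fewer custom models in the /modelstore/detection mount and only the endpoints actually in use. A sketch of that kind of slimmed-down configuration (not my exact template; paths and port are mine, and the mounted folder now holds a single .pt file):

# Only the detection API is enabled; face and scene are simply not turned on,
# and the mounted folder contains one custom model instead of three.
docker run -d --name DeepstackGPUOfficial \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e VISION-DETECTION=True \
  -v /mnt/user/appdata/deepstack/dark:/modelstore/detection \
  -p 5000:5000 \
  deepquestai/deepstack:gpu

Lowering the MODE environment variable (High to Medium or Low) should also reduce GPU memory use, though dropping the extra models was enough in my case.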