GPU v1/vision/detection Unresponsive

I’m running deepstack:gpu-x5-beta on an Intel i7-4790 with 8GB of RAM. The GPU is an Nvidia GT 760.
When I run the command curl -X POST -F image=@test-image3.jpg 'http://localhost:83/v1/vision/scene', I get a reply. But when I run the command curl -X POST -F image=@test-image3.jpg 'http://localhost:83/v1/vision/detection', it just hangs forever.
When I run the cpu version it works just fine.
Any ideas about what’s going on?

My docker-compose file:
version: "3.3"
services:
  deepstack:
    image: deepquestai/deepstack:gpu-x5-beta
    restart: unless-stopped
    container_name: DeepStack_GPU_X5
    ports:
      - "83:5000"
    volumes:
      - localstorage:/datastore
volumes:
  localstorage:


Gummybear, did you get this resolved? I am having same issue.

Unfortunately, I did not. I tried 3 different GPUs, on Debian and Ubuntu. Never got it to work.

Hello @gummybear @malakipaa

Please run this with the latest DeepStack version, deepquestai/deepstack:gpu-2020.12.

Try to run this again and, if the issue occurs, get the logs and share them. You can get the logs from DeepStack by doing this:

Step 1: get the name of the DeepStack container by running sudo docker ps
Step 2: run sudo docker exec -it deepstack-container-name /bin/bash
Step 3: install nano in the container by running apt-get install nano
Step 4: get the error logs by running nano ../logs/stderr.txt
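If you only want to read the log rather than edit it, the exec/nano steps can be collapsed into one command. This is a sketch: deepstack-container-name is a placeholder for whatever name sudo docker ps reports, and the relative log path assumes the same container working directory as the steps above.

```shell
# Read the DeepStack error log in one step, without an interactive shell or nano.
# "deepstack-container-name" is a placeholder; substitute the name from `sudo docker ps`.
sudo docker exec deepstack-container-name cat ../logs/stderr.txt
```
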

Whatever went wrong will be right there

I have the same problem: the CPU version works but the GPU version doesn't. I'm using the latest version (deepquestai/deepstack:gpu-2020.12).
The following can be found in stderr.txt:
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/", line 297, in _bootstrap
  File "/usr/lib/python3.7/multiprocessing/", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/app/intelligencelayer/shared/", line 62, in objectdetection
    detector = YOLODetector(model_path, reso, cuda=CUDA_MODE)
  File "/app/intelligencelayer/shared/", line 30, in init
    self.model = attempt_load(model_path, map_location=self.device)
  File "/app/intelligencelayer/shared/models/", line 137, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 584, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 842, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 834, in persistent_load
    load_tensor(data_type, size, key, _maybe_decode_ascii(location))
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 823, in load_tensor
    loaded_storages[key] = restore_location(storage, location)
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 803, in restore_location
    return default_restore_location(storage, str(map_location))
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 174, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 150, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/usr/local/lib/python3.7/dist-packages/torch/", line 134, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
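That final RuntimeError means PyTorch inside the container cannot see any CUDA device, so the model load fails before detection ever starts. A quick hedged check, run from a Python shell inside the container (assuming the torch package that ships with the image):

```python
# Minimal sketch: confirm whether PyTorch inside the container can see a CUDA device.
# torch is assumed to be preinstalled in the DeepStack GPU image.
import torch

print(torch.cuda.is_available())  # False means the container has no GPU access
print(torch.cuda.device_count())  # number of CUDA devices torch can see
```

If this prints False, the problem is the container's GPU access (runtime/toolkit setup), not DeepStack itself.
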

Hi John,

Thank you for the help.

I finally got it to work on Linux Mint in a non-AVX CPU VM, but the response time for detection is slow: it averages 1 second. That's too slow in my opinion, so…

I tried to set up a Windows 10 GPU DeepStack to experiment with, on a VMware 6.5 VM with a Xeon X5675 and a GTX 750 passed through to the VM. I followed the whole procedure to install the two Nvidia tools and drivers plus DeepStack for GPU on Windows, and everything installed successfully. However, it doesn't work: the Windows 10 DeepStack receives the requests from AI Tool but doesn't process them.

In my googling I couldn't find any troubleshooting info, but my gut tells me it's because my Xeon X5675 doesn't have AVX support. Even with the GTX 750 passed through, the Windows 10 DeepStack VM won't have AVX either, so it won't process requests. Or maybe the GTX 750 is simply too old to support the DeepStack GPU version?

Any advice to try?

What works for me in Linux is starting DeepStack with this command:

sudo docker run -e VISION-DETECTION=True -v localstorage:/datastore -p 80:5000 deepquestai/deepstack:latest

Note: VISION-DETECTION=True isn't in any instructions I have read for Linux; that flag is documented for the Windows version of DeepStack. I don't know why it works. Hope it works for you.

Found out that the GPU cannot be used in the Docker container when the container is generated with docker-compose. That is the reason it fails, just as the error said:
torch.cuda.is_available() is False

But when the container is started from the command line, everything works like a charm:
sudo docker run --restart=always --name=deepstack --gpus all -e MODE=High -e VISION-DETECTION=True -d -v /opt/deepstack:/datastore -p 5000:5000 deepquestai/deepstack:gpu

I don't know how to use --gpus all with docker-compose, or if it is even possible.
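For what it's worth, newer Compose releases do expose GPU reservations. The sketch below is not from this thread: it assumes Docker Compose v1.28+ with the NVIDIA Container Toolkit installed, and mirrors the options from the docker run command above (service name is a placeholder).

```yaml
# Sketch only: requires Docker Compose v1.28+ and the NVIDIA Container Toolkit.
services:
  deepstack:
    image: deepquestai/deepstack:gpu
    restart: always
    environment:
      - MODE=High
      - VISION-DETECTION=True
    volumes:
      - /opt/deepstack:/datastore
    ports:
      - "5000:5000"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

The deploy.resources.reservations.devices block is the Compose equivalent of --gpus all when run with a recent docker-compose.
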

I finally ran John's instructions and found out that the CUDA compute capability (I don't actually know what that is) of my card is 3.0. The minimum PyTorch needs is 3.5. So I guess that's that until I get a different card.