DeepStack February 2021 Release - Fixes for Jetson, Windows Native and Docker Versions

Hello everyone, we are happy to share the latest versions of DeepStack, with a number of fixes that address some of the issues raised so far. See the details below.

This release fixes the following issues in DeepStack across all versions:

  • Jetson error on JetPack 4.5
  • Scene recognition onnxruntime bug in the Windows Native versions
  • Face registration not working
  • Requests taking forever: requests that fail due to internal errors now time out after 1 minute with a 500 error code
  • An issue where a min_confidence of zero caused DeepStack to hang; all min_confidence values are now clamped to a minimum of 0.1 (see the sample request after this list)
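As a quick way to exercise the new behaviour, a minimal detection request might look like the sketch below (assuming DeepStack is listening on port 80 and test.jpg is any local image; adjust both for your setup):

# Send an image to the detection endpoint with an explicit min_confidence.
# Values below 0.1 are now clamped to 0.1, and a request that fails internally
# returns a 500 "failed to process request before timeout" response after about 1 minute.
curl -X POST -F image=@test.jpg -F min_confidence=0.4 http://localhost:80/v1/vision/detection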

Install the latest versions of DeepStack to take advantage of this update; a sample Docker pull/run command follows the tag list below.

Jetson Version:
deepquestai/deepstack:jetpack-2021.02.1 and deepquestai/deepstack:jetpack

Download Windows CPU Edition
Download Windows GPU Edition

Docker CPU Version:
deepquestai/deepstack:cpu-2021.02.1 and deepquestai/deepstack

Docker GPU Version:
deepquestai/deepstack:gpu-2021.02.1 and deepquestai/deepstack:gpu
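For the Docker editions, updating is typically just a matter of pulling the new tag and recreating your container, roughly like this (a sketch; adapt the enabled APIs and port mapping to your own setup):

# Pull the updated CPU image and start it with the detection and face APIs enabled.
sudo docker pull deepquestai/deepstack:cpu-2021.02.1
sudo docker run -d -e VISION-DETECTION=True -e VISION-FACE=True -p 80:5000 deepquestai/deepstack:cpu-2021.02.1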

Thank you all for your incredible patience with us over the past weeks as we have worked on making DeepStack better for you.

We are aware there are issues not addressed in this update, including:

  • Bugs while running multiple instances with the Windows Native edition
  • Raspberry Pi support
  • Some performance and accuracy issues
  • Improvements to custom training
  • General ARM64 support

Note that we are working on these and will release new updates next month to address some of them.

On the highly requested RPI + NCS update: we are working on the best path towards this. It has been our policy to ensure that all new versions of DeepStack have the same features and are based on a common codebase, and the RPI + NCS version has been a challenging one due to the limitations of the Intel NCS devices and the tight compute constraints of the RPI. When it is finally released, it will require a separate codebase of its own and will not support custom models due to the limitations of the NCS. At the moment, we recommend using the Nvidia Jetson for edge deployments, as it features much better hardware and software stacks that enable running the full set of DeepStack features efficiently.

Thank you so much for your patience. Please report any bugs or issues here, and we will make our best effort to fix them.


Hi John,
Unfortunately no luck for me: it still hangs without any response for /v1/vision/face calls, although /v1/vision/detection calls run successfully.

I'm using Docker on my Synology NAS and am using the image below to run the container:

Docker CPU Version:
deepquestai/deepstack:cpu-2021.02.1 and deepquestai/deepstack

I just upgraded my Jetson and am getting this in the /app/logs/stderr.txt file:

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/app/intelligencelayer/shared/detection.py", line 69, in objectdetection
    detector = YOLODetector(model_path, reso, cuda=CUDA_MODE)
  File "/app/intelligencelayer/shared/./process.py", line 36, in __init__
    self.model = attempt_load(model_path, map_location=self.device)
  File "/app/intelligencelayer/shared/./models/experimental.py", line 159, in attempt_load
    torch.load(w, map_location=map_location)["model"].float().fuse().eval()
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 584, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 842, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 834, in persistent_load
    load_tensor(data_type, size, key, _maybe_decode_ascii(location))
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 823, in load_tensor
    loaded_storages[key] = restore_location(storage, location)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 803, in restore_location
    return default_restore_location(storage, str(map_location))
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 174, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 150, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 134, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

And making calls to /v1/vision/detection returns a 500 error after 1 minute.

Did I miss something? It was working fine on the previous version.

Sorry to hear that @dearlk. Kindly share the logs from your run of the face detection. Once you have DeepStack started with face detection and are making calls to it, run

sudo docker exec -it container-name /bin/bash
Once in the container, run
cat ../logs/stderr.txt

Please share the contents of this file to help us debug it.
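If you prefer not to open an interactive shell, the same file can be dumped with a single command (replace container-name with the name of your DeepStack container):

sudo docker exec container-name cat /app/logs/stderr.txt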

Hello @mdewyer, it appears the GPU was not enabled when you ran it. Did you run with --runtime nvidia?
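For reference, a minimal start command with the GPU runtime enabled looks roughly like this (adjust the enabled APIs and port mapping as needed):

sudo docker run -d --runtime nvidia -e VISION-DETECTION=True -p 80:5000 deepquestai/deepstack:jetpack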

Hi @john,

This is what I see in the stderr.txt file…

It looks like the error is:

exit status 1
chdir intelligencelayer\shared: The system cannot find the path specified.

I tried a few requests for face detection and all failed after 1m0s with a timeout error. However, this time I saw the error below from the console after the request timed out:

{'success': False, 'error': 'failed to process request before timeout', 'duration': 0}
                                                                                 

As you can see, the object detection call is returning results in 3+ seconds while the face detection call is failing after 1m0s with a timeout error.

My NAS has a good 10 GB of memory, and CPU utilization is not reaching 100% while the request is running.
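For illustration, the calls I am timing look roughly like this (the port and image file are placeholders for my actual setup):

time curl -X POST -F image=@test.jpg http://localhost:80/v1/vision/detection
time curl -X POST -F image=@test.jpg http://localhost:80/v1/vision/face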

Hi @john,

Unfortunately it’s still not working :frowning:

I’m starting it with the command:

sudo docker run -t -d --runtime nvidia -e VISION-DETECTION=True -p 80:5000 deepquestai/deepstack:jetpack

I’ve deleted the container & image and started from scratch a few times and still get the same error.

The logs look like:

DeepStack: Version 2021.02.1
/v1/vision/detection
---------------------------------------
---------------------------------------
v1/backup
---------------------------------------
v1/restore
[GIN] 2021/02/09 - 14:48:24 | 500 |          1m0s |        10.0.4.9 | POST     /v1/vision/detection
[GIN] 2021/02/09 - 14:49:50 | 500 |          1m0s |        10.0.4.9 | POST     /v1/vision/detection

And the response back from the request is:

success: false
error: "failed to process request before timeout"
duration: 0

Thanks for your help and all your hard work on this project!


Thanks for sharing these logs. Can you confirm the version of DeepStack you are running?

When you start DeepStack, you will find something like this printed out: DeepStack: Version 2021.02.1

Please share this.

Looking at these logs, this is not supposed to occur in the latest Docker version; I just ran it with face enabled again on my end and it works. Do confirm your version and let me know.
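If you are running the Docker edition, the version line can usually be pulled straight from the container logs, for example:

sudo docker logs container-name 2>&1 | grep "DeepStack: Version"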

Thanks a lot @mdewyer,
What version of the Nvidia JetPack are you running?

There are some incompatibilities between different JetPack versions; if you are running JetPack 4.5, this should work.

Hi @john,
I do see the version as 2021.02.1, as below

Hey @john,

Oh interesting… here’s what I get:

> apt show nvidia-jetpack
Package: nvidia-jetpack
Version: 4.4.1-b50
Priority: standard
Section: metapackages
Maintainer: NVIDIA Corporation
Installed-Size: 199 kB
Depends: nvidia-cuda (= 4.4.1-b50), nvidia-opencv (= 4.4.1-b50), nvidia-cudnn8 (= 4.4.1-b50), nvidia-tensorrt (= 4.4.1-b50), nvidia-visionworks (= 4.4.1-b50), nvidia-container (= 4.4.1-b50), nvidia-vpi (= 4.4.1-b50), nvidia-l4t-jetson-multimedia-api (>> 32.4-0), nvidia-l4t-jetson-multimedia-api (<< 32.5-0)
Homepage: http://developer.nvidia.com/jetson
Download-Size: 29.4 kB
APT-Sources: https://repo.download.nvidia.com/jetson/t210 r32.4/main arm64 Packages
Description: NVIDIA Jetpack Meta Package

So it appears my Jetson is on Jetpack 4.4. Might that be the issue?

Gonna try to find details on how to upgrade and give that a shot.

If my Docker container is calling “deepquestai/deepstack:latest” when I start it, will it be running this version? Thanks!

I found the steps to upgrade to Jetpack 4.5 here: https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/updating_jetson_and_host.html#wwpID0E06B0HA

Follow the steps under “To upgrade to a new minor release.” Basically, you need to edit /etc/apt/sources.list.d/nvidia-l4t-apt-source.list, changing 32.4 to 32.5 in both places. Then run apt update and apt dist-upgrade.
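In command form, that works out to roughly the following (a sketch; double-check that the release strings in your sources file match the r32.4 form before running the sed):

# Point the L4T apt sources at the 32.5 (JetPack 4.5) repository, then upgrade.
sudo sed -i 's/r32.4/r32.5/g' /etc/apt/sources.list.d/nvidia-l4t-apt-source.list
sudo apt update
sudo apt dist-upgrade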


Wanted to say thanks for this. I have the non-Docker GPU version running great. Complete times from the AI are sub-100 ms on an i9-9900K with a 1050 Ti video card. Blue Iris is using the iGPU. Between the iGPU and the 1050 Ti, the workload on the system barely registers with this setup.

@john any feedback on this, please? It is still not working.

Thank you very much for the simplified instructions! I finally got around to upgrading the Jetson and all is working again :slight_smile:

I just registered here to report the same. I am trying to run this on three different platforms: a Jetson Nano 4GB plus a pair of 10-core i9 servers. Object detection works great on the two servers with the CPU build, but face detection does not work on any of them. In fact, nothing works on the Jetson. The symptoms are universal: all three platforms time out after 1 minute. I don't notice any side effects on my servers; however, on the Jetson, when --runtime nvidia is specified I see two Python processes eat up 50% of the CPU while kswapd uses up the rest. This condition lasts for a few minutes before settling down a bit, but kswapd still uses a significant portion of the processor.

On the Jetson I also noted these lines in the stderr log:

exit status 1
chdir intelligencelayer\shared: The system cannot find the path specified.

I’m at a standstill on the Jetson at the moment. Any suggestions?

And for the CPU version, how do I prevent the timeouts?

Does this Windows CPU version work with VorlonCD so-tools?

Hi @john
Hope I am not hijacking the thread, but I am seeing the same error on an Ubuntu Docker install. Object detection works fine, but face detection or recognition gives a 500 error, with stderr.txt reading:

exit status 1
chdir intelligencelayer\shared: The system cannot find the path specified.

Am I missing something? I am trying to send the images via the Home Assistant integration.