DeepStack Release on Jetson

Hello everyone. We are excited to share the release of the DeepStack GPU version on the Nvidia Jetson, with support for the full range of Jetson devices, from the 2GB Nano edition to the higher-end Jetson boards.

This supports the full spectrum of DeepStack features.
You can run DeepStack on the Jetson with the command below.

sudo docker run --runtime nvidia -e VISION-DETECTION=True -p 80:5000 deepquestai/deepstack:jetpack-x1-beta

To run with the face APIs, simply use -e VISION-FACE=True instead; for scene recognition, use -e VISION-SCENE=True.
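
For example, applying the same pattern as the command above, a face-enabled run would look like this (only the environment variable changes):

sudo docker run --runtime nvidia -e VISION-FACE=True -p 80:5000 deepquestai/deepstack:jetpack-x1-beta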

We are super excited to finally bring a stable version of DeepStack that runs on ARM64. We strongly recommend using this over the Raspberry Pi + NCS version, as it is faster and more stable, and the Jetson Nano is also less costly than the RPi + NCS combination.

We are working towards a full open-source release before December, with support for custom detection models and the Windows Native Edition both scheduled for this week.

Thanks for all your feedback. We are excited to build the future AI platform with you all.

Happy to report that I installed DeepStack on my 4GB Jetson Nano. Object detection seems to work fine, with an average processing time of 130 ms (about 7 fps), consuming 30% and 60% of my Nano’s CPU and memory respectively.

Face recognition works partially. I was successful training one face with 10 photos, but 100 or 250 photos seems to hang DeepStack indefinitely. I waited over an hour registering 250 photos (something that takes 2 minutes 30 seconds on the RPi+NCS2) with no luck. During the hour of waiting, DeepStack still responded to the face/list command. Memory utilization exceeded 95% for a few minutes, then dropped to 15%. CPU stayed around 30% most of the time.

Is there perhaps any dependency that I’m missing?

Hello, thanks for trying this out. Note that you do not need to register that many photos for facial recognition to work.
On the performance, I will investigate this. I am curious: what version of the RPi did you use to register the 250 photos? Also, were these 250 photos in one single request, or spread over multiple requests for multiple faces?

Note, you are not missing any dependency, as the DeepStack container ships with all its dependencies.

Finally, are you running with both detection and face enabled at the same time?

@john, I’m using an RPi4 with 8GB of RAM and an NCS2 stick. The 250 photos are submitted in a single request, since in the RPi4+NCS2 setup subsequent register requests overwrite pictures submitted earlier.

I tested object detection and face recognition separately, one after another.

Just checked that on the Jetson new pictures overwrite (erase) pictures submitted earlier, just as on the RPi+NCS2.
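
For reference, the register requests look something like the sketch below (assuming DeepStack is exposed on port 80 as in the run commands above; the userid and file name are just placeholders):

curl -X POST http://localhost:80/v1/vision/face/register -F userid="person1" -F image=@photo1.jpg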

Additionally, the /v1/vision/face endpoint returns nothing.

Got it running, but just wondering how to get it to auto-start on boot?

@sickidolderivative
Please run the updated image:

deepquestai/deepstack:jetpack-x2-beta 

Note that overwriting the registered images is by design, and this happens only when you register an id you already registered previously.
Also, registering about 10 pictures should be more than enough for face recognition.

One important thing I have observed is that, at present, if you run DeepStack on the Jetson and then rerun it, requests might get slower because the GPU memory from the previous run has not been released.

I recommend rebooting your Jetson if you need to stop and rerun DeepStack.
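
In practice that amounts to something like the following, where the container ID is whatever docker ps reports for the DeepStack container:

sudo docker ps
sudo docker stop <container-id>
sudo reboot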

@Tinbum, you can do this by running with

--restart unless-stopped 

Example:

sudo docker run --runtime nvidia --restart unless-stopped  -e VISION-DETECTION=True -p 80:5000 deepquestai/deepstack:jetpack-x1-beta

@john, many thanks for the quick update and hints. I’ve tried different numbers of pictures and found that ~20 is the maximum the Jetson can handle. Anything beyond that results in a DeepStack freeze, or even the OS becoming unresponsive. I also noticed that I need to restart DeepStack and the OS between subsequent registrations; otherwise they become unresponsive.

Nevertheless, I’m extremely happy to confirm that, despite the smaller number of pictures, face recognition works great. Compared to the RPi+NCS2 there’s a great improvement in accuracy and speed. The confidence given to recognitions seems much more sound than that reported by the RPi. Face recognition takes about 360 ms, or 2.7 fps.

Overall, so far I’m really happy with the move to the Jetson. Thanks so much for your great work.

@john, thank you. It’s working well, but I’m not finding the times particularly quick.

[GIN] 2020/11/25 - 21:04:30 | 200 | 780.895562ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:11:29 | 200 | 449.209933ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:13:15 | 200 | 449.908885ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:17:55 | 200 | 459.805675ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:19:13 | 200 | 571.942161ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:20:43 | 200 | 478.488637ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:22:15 | 200 | 611.136033ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:26:22 | 200 | 441.431052ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:26:52 | 200 | 476.475613ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:26:56 | 200 | 439.551026ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:28:58 | 200 | 452.356151ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:30:37 | 200 | 519.156956ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:31:24 | 200 | 483.070601ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:33:10 | 200 | 435.309626ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:34:40 | 200 | 473.803156ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:37:06 | 200 | 523.369175ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 21:41:48 | 200 | 432.428982ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:19:07 | 200 | 765.142751ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:19:11 | 200 | 524.709789ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:19:17 | 200 | 512.808373ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:20:50 | 200 | 631.438957ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:21:25 | 200 | 294.773745ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:21:45 | 200 | 464.295161ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:22:35 | 200 | 431.337959ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:30:49 | 200 | 499.813056ms | 192.168.3.10 | POST /v1/vision/detection
[GIN] 2020/11/25 - 22:31:49 | 200 | 466.548928ms | 192.168.3.10 | POST /v1/vision/detection

@Tinbum those times look pretty typical on a Nano. What are you comparing to?

Object detection takes ~130 ms on my 4GB Jetson Nano. Which Nano do you use, @Tinbum?

@robmarkcole I’m comparing to the times posted by @sickidolderivative.

@sickidolderivative same as you. 4GB

@Tinbum, I would suspect your processing time has to do with the size of the images you submit to DeepStack for object detection. The processing time of ~130 ms is for slices between 300x300 and 500x500 pixels. Processing a full 3072x2048 frame takes north of 750 ms. I’d suggest slicing your frame down to a reasonable area before posting it to DeepStack. Hope that helps.
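
As a rough sketch of what I mean (assuming ImageMagick is installed and DeepStack is listening on port 80; file names are placeholders, and -crop can be used instead of -resize to cut out a region of interest):

convert full_frame.jpg -resize 640x480 slice.jpg
curl -X POST http://localhost:80/v1/vision/detection -F image=@slice.jpg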

Hello @Tinbum @sickidolderivative @robmarkcole

To address some of the performance issues observed with the face APIs, we have released a new update with a focus on lower memory usage and faster inference for all the face endpoints.

Run the updated image:

deepquestai/deepstack:jetpack-x3-beta
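
For example, following the same pattern as the earlier commands, with the face APIs enabled:

sudo docker run --runtime nvidia --restart unless-stopped -e VISION-FACE=True -p 80:5000 deepquestai/deepstack:jetpack-x3-beta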

Thanks for all the feedback on this. We look forward to hearing about any further bugs or bottlenecks as we work towards a stable release.

@sickidolderivative Thank you, yes, my images are larger. Compared with the CPU version I am also running, the Jetson version takes about double the time.
[GIN] 2020/11/27 - 09:43:24 | 200 | 245.5242ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:43:53 | 200 | 245.9533ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:43:57 | 200 | 265.1079ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:44:27 | 200 | 247.1615ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:44:30 | 200 | 297.2009ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:44:33 | 200 | 355.1263ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:44:36 | 200 | 484.0866ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:44:52 | 200 | 309.5168ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:44:56 | 200 | 207.9694ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:45:02 | 200 | 299.9153ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:45:48 | 200 | 298.4713ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:45:49 | 200 | 270.6333ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:45:51 | 200 | 284.0541ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:45:54 | 200 | 452.6494ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:46:29 | 200 | 302.0107ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:46:59 | 200 | 298.9392ms | 172.17.0.1 | POST /v1/vision/detection
[GIN] 2020/11/27 - 09:47:01 | 200 | 285.6112ms | 172.17.0.1 | POST /v1/vision/detection

Has the native Windows edition been released? And will this support Nvidia GPUs?

I have been watching the forums since it was mentioned, but haven’t seen the announcement yet. Did I miss it?

Hello @shallowstack, the Windows version will be released before Christmas.
Stay tuned.
And yes, Nvidia GPUs will be fully supported.
