Performance & Benchmark

Is it possible to collect benchmarks by architecture? :slight_smile:

Here are my results with a 3840x2160 picture:

  • RPi 4 + NCS2 => 600ms
  • Intel i7700 + Deepstack 202012 Win 10 beta CPU mode => 300ms
  • Intel i7700 + GeForce GTX1060 + Deepstack 202012 Win 10 beta GPU mode => 55ms

Nice, which service did you use?

This is an interesting conversation. :grinning: Thanks for sharing these results, @vlitkowski.

I encourage others to share test results from their hardware setups as well.

We would love to document these results so that others in the community can use them as a reference.

(2MP images)

  • Intel Xeon W3530 + Deepstackai:noavx => 1.4s
  • Intel Xeon W3530 + Deepstack 202012 Win 10 beta CPU mode (with AITool) => 800ms
  • Nvidia Quadro K2000 + Deepstack 202012 Win 10 beta GPU mode => PyTorch version doesn’t support GPU

Edit:

  • Intel Xeon W3530 + Deepstack 202012 Win 10 beta CPU mode (using python script and a custom, single class, “yolov5s” deepstack model) => 350ms

Deepstack 202012 beta GPU
Server 2019 with a P2000
1920x1280 @ 80ms
3840x2160 @ 180ms

Pardon my ignorance, but where do you get this benchmark data? Running the native Windows version, I see the following (two lines out of hundreds) in the PowerShell window:

[GIN] 2020/12/23 - 13:33:20 | 200 |    379.9659ms |       127.0.0.1 | POST     /v1/vision/detection
[GIN] 2020/12/23 - 13:39:08 | 200 |    574.0846ms |       127.0.0.1 | POST     /v1/vision/detection

Are the numbers in my example, 379.9659 and 574.0846, what everyone is looking at? Is there a better place to find a history, perhaps with the data summarized? Thank you.

Yes, those are the numbers we are comparing.
To my knowledge, there is no data other than the console output. The numbers seem to be quite consistent for images of the same size, with only a few outliers.
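Since the console output is the only record, one way to get a summarized history is to save that output to a file and parse it. The following is only a rough sketch, not a DeepStack feature: the function names `latencies_ms` and `summarize` are made up for illustration, and it assumes the log lines look like the `[GIN]` lines quoted above, with the latency in µs, ms, or s between two `|` separators.

```python
import re
import statistics

# Matches the latency column of a GIN access-log line, e.g. "|    379.9659ms |".
# Assumption: only µs, ms and s units appear; longer durations are not handled.
LATENCY_RE = re.compile(r"\|\s*([0-9.]+)(µs|ms|s)\s*\|")
UNIT_TO_MS = {"µs": 0.001, "ms": 1.0, "s": 1000.0}


def latencies_ms(lines, path="/v1/vision/detection"):
    """Return the latency in milliseconds of every GIN log line for `path`."""
    values = []
    for line in lines:
        if "[GIN]" not in line or path not in line:
            continue  # skip unrelated console output
        match = LATENCY_RE.search(line)
        if match:
            values.append(float(match.group(1)) * UNIT_TO_MS[match.group(2)])
    return values


def summarize(values):
    """Summarize latencies (ms) as count, min, median, mean and max."""
    return {
        "count": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
        "max": max(values),
    }


sample = [
    "[GIN] 2020/12/23 - 13:33:20 | 200 |    379.9659ms |       127.0.0.1 | POST     /v1/vision/detection",
    "[GIN] 2020/12/23 - 13:39:08 | 200 |    574.0846ms |       127.0.0.1 | POST     /v1/vision/detection",
]
print(summarize(latencies_ms(sample)))
```

To use it on a real run, read the saved console output line by line and pass the lines to `latencies_ms`. The median is usually the more useful figure here, because the occasional odd number skews the mean.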

Nvidia Jetson Nano 4GB (deepquestai/deepstack:jetpack) -> 210ms avg @ 1920x1080

I edited my post, as I've just seen response times drop by 50% or so using a new method. I'm not sure whether that is due to the Python script used or to the custom, single-class “yolov5s” AI model, but I figure it may help someone else get the times they require on under-powered hardware.
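The script itself isn't posted in this thread, so the following is not that script, just a minimal sketch of how response times could be measured from Python rather than read off the console. It assumes DeepStack is listening on `http://localhost:80` and that the third-party `requests` package is installed; `time_calls`, `detect_once`, `benchmark_image`, and the image path are hypothetical names chosen for illustration.

```python
import statistics
import time


def time_calls(call, repeats=20, warmup=2):
    """Call `call()` repeatedly and return the per-call wall-clock times in ms."""
    for _ in range(warmup):
        call()  # warm-up calls are not timed (first requests are often slower)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def detect_once(image_bytes, url="http://localhost:80/v1/vision/detection"):
    """POST one image to a DeepStack detection endpoint (host and port are assumptions)."""
    import requests  # third-party; imported here so time_calls works without it

    response = requests.post(url, files={"image": image_bytes})
    response.raise_for_status()
    return response.json()


def benchmark_image(path, repeats=20):
    """Return the median response time in ms for the image at `path`."""
    with open(path, "rb") as f:
        image_bytes = f.read()
    return statistics.median(time_calls(lambda: detect_once(image_bytes), repeats))
```

Calling `benchmark_image("test-image.jpg")` with an image of your own would give a single median figure. Times measured this way include the HTTP upload and response, so expect them to be somewhat higher than the latency DeepStack prints in its console log.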