Performance issue using GPU

Over time I have opened several issues related to performance.
Background: using the GPU version with an RTX 2060, response time was ~250-400 msec when sending one request per second. Sending 2 images per second caused a huge delay on the first request within the same period, and the 2nd image was processed in ~500-900 msec.
That was the timing for face recognition; object detection crashed after a few hours.

After version 3.6 was released and I updated, the ~250-400 msec stabilized at ~200 msec; object detection was still an issue.

To provide more details: I ran the Docker container on an AMD FX-8350 with 16GB RAM, Ubuntu Server 18.04.

Yesterday I got a new PC (AMD 3700X, X570-chipset motherboard, 32GB RAM) running Ubuntu Server 18.04.3, and I am seeing unstable durations for face recognition: at 1 image per second, a request takes ~170 msec, a following one takes ~450 msec, then it goes back to ~170 msec and again to ~450 msec.

That eliminated the suspicion I had at some point that the old motherboard could not drive the RTX 2060 fast enough.

GPU usage has been 3%-4% over the last 24 hours (monitored with Grafana / InfluxData / Telegraf).

CPU usage of the Docker container is ~2.5%, load average is ~0.3.

Environment parameters for docker:
VISION-FACE=True
VISION-DETECTION=True
Mode=High
CUDA_VERSION=9.0.176
CUDA_PKG_VERSION=9-0=9.0.176-1
NVIDIA_REQUIRE_CUDA=cuda>=9.0
NCCL_VERSION=2.4.2
CUDNN_VERSION=7.4.2.24
BATCH_SIZE=32
SLEEP_TIME=0.01

Below are the details from nvidia-smi and the logs:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    On   | 00000000:07:00.0 Off |                  N/A |
|  0%   55C    P2    39W / 190W |   2615MiB /  5902MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     18781      C   python3                                      901MiB |
|    0     18782      C   python3                                     1703MiB |
+-----------------------------------------------------------------------------+

[GIN] 2019/09/05 - 18:18:55 | 200 | 181.798786ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:54 | 200 | 176.951144ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:54 | 200 | 461.34073ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:52 | 200 | 172.024413ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:51 | 200 | 175.982045ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:50 | 200 | 170.694035ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:50 | 200 | 465.042783ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:48 | 200 | 201.329837ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:47 | 200 | 176.560617ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:46 | 200 | 173.570809ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:46 | 200 | 458.236456ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:44 | 200 | 170.096848ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:43 | 200 | 178.558864ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:42 | 200 | 171.743272ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:42 | 200 | 456.023883ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:40 | 200 | 176.622593ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:39 | 200 | 183.134114ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:38 | 200 | 192.823467ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:38 | 200 | 462.44334ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:36 | 200 | 169.162544ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:34 | 200 | 168.404144ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:34 | 200 | 461.971529ms | 192.168.2.3 | POST /v1/vision/face/recognize
[GIN] 2019/09/05 - 18:18:24 | 200 | 171.096322ms | 192.168.2.3 | POST /v1/vision/face/recognize
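
To make the alternating pattern in the log above easier to see, the durations can be pulled out of the GIN lines and split into fast and slow requests. This is only a rough sketch: the function names and the 300 msec threshold are my own assumptions, picked because the two groups in the log sit around ~170 msec and ~460 msec.

```python
import re

# Matches the duration column of a GIN access-log line, e.g. "| 181.798786ms |".
DURATION_RE = re.compile(r"\|\s*([\d.]+)(ms|s)\s*\|")


def parse_durations_ms(lines):
    """Return request durations in milliseconds, in the order the lines are given."""
    durations = []
    for line in lines:
        match = DURATION_RE.search(line)
        if not match:
            continue  # not an access-log line, skip it
        value, unit = float(match.group(1)), match.group(2)
        durations.append(value * 1000 if unit == "s" else value)
    return durations


def split_fast_slow(durations_ms, threshold_ms=300.0):
    """Split durations into (fast, slow) around an assumed threshold."""
    fast = [d for d in durations_ms if d < threshold_ms]
    slow = [d for d in durations_ms if d >= threshold_ms]
    return fast, slow


if __name__ == "__main__":
    sample = [
        "[GIN] 2019/09/05 - 18:18:55 | 200 | 181.798786ms | 192.168.2.3 | POST /v1/vision/face/recognize",
        "[GIN] 2019/09/05 - 18:18:54 | 200 | 176.951144ms | 192.168.2.3 | POST /v1/vision/face/recognize",
        "[GIN] 2019/09/05 - 18:18:54 | 200 | 461.34073ms | 192.168.2.3 | POST /v1/vision/face/recognize",
    ]
    fast, slow = split_fast_slow(parse_durations_ms(sample))
    print(f"fast requests: {len(fast)}, slow requests: {len(slow)}")
```

Feeding the whole log through it should show how often the ~450 msec requests occur relative to the ~170 msec ones.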

Hello @elad.bar, thanks a lot for your patience and the details you provided. The performance issue is caused by the fact that the GPU operations and the complementary CPU operations are processed within the same thread. We are currently working on improving this by moving CPU and GPU operations into separate threads to enable better utilization. This should be updated soon.
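
For readers wondering what "separate threads" can look like in general, here is a minimal sketch of the idea and not the actual server code: one thread does the CPU-side work and hands results over a queue to a second thread that does the GPU-side work, so one item can be prepared while another is being processed. The names `run_pipeline`, `cpu_stage` and `gpu_stage` are hypothetical placeholders.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the GPU-side worker to shut down


def run_pipeline(items, cpu_stage, gpu_stage, max_pending=8):
    """Run cpu_stage and gpu_stage in separate threads connected by a queue.

    Results come back in input order because there is a single consumer.
    """
    handoff = queue.Queue(maxsize=max_pending)
    results = []

    def cpu_worker():
        try:
            for item in items:
                # CPU-side work, e.g. decoding and resizing an image.
                handoff.put(cpu_stage(item))
        finally:
            # Always signal the end so the consumer cannot wait forever.
            handoff.put(_STOP)

    def gpu_worker():
        while True:
            prepared = handoff.get()
            if prepared is _STOP:
                break
            # GPU-side work, e.g. running the model on the prepared input.
            results.append(gpu_stage(prepared))

    threads = [
        threading.Thread(target=cpu_worker),
        threading.Thread(target=gpu_worker),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Stand-in stages; real ones would preprocess and run inference.
    print(run_pipeline([1, 2, 3], lambda x: x + 1, lambda x: x * 10))
```

The bounded queue keeps the CPU side from running far ahead of the GPU side when inference is the slower stage.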

Thanks @john, looking forward to the new version.

I’ve made some changes in the code that calls face recognition: before, it sometimes called twice within a second; now it calls just once per second, and execution takes ~120 msec.
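
In case it helps anyone else, the change boils down to enforcing a minimum gap between calls. A minimal sketch of that idea (not my exact code; the class name and the injectable `clock` / `sleep` parameters are assumptions made so it can be tested without waiting):

```python
import time


class MinIntervalThrottle:
    """Blocks so that consecutive calls to wait() are at least min_interval_s apart."""

    def __init__(self, min_interval_s=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous wait(), None before the first

    def wait(self):
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval_s - (now - self._last_call)
            if remaining > 0:
                # Called too soon after the previous request: pause for the rest.
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `throttle.wait()` right before each POST to /v1/vision/face/recognize keeps the client at one request per second at most, no matter how often new images arrive.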

The issue now is that I’m getting timeouts every 5-10 minutes; I hope the fix will solve that too.