[Performance] CUDA provider inference on dynamic-shape ONNX models is very slow #28305

@meloht

Describe the issue

Performance Test PC

Hardware Summary
OS: Windows 11 Pro, version 25H2
CPU: Intel Core Ultra 9 285K, 3.7 GHz
RAM: 128 GB DDR5, 4400 MT/s
GPU: NVIDIA RTX 5090, 32 GB, CUDA 12.9
Storage: 2 TB SSD

Performance Test Data

Images: 60 images, each 1180x92 pixels

PP-OCR models: ch_PP-OCRv5_det_mobile, ch_PP-OCRv5_rec_mobile, ch_PP-LCNet_x0_25_textline_ori_cls_mobile
The models were exported from PP-OCR to ONNX format (see "Obtaining ONNX Models"); they can also be downloaded from the rapid-ocr Model List.

ONNX Runtime inference tool: RapidOCRSharpOnnx

CUDA Inference
Microsoft.ML.OnnxRuntime.Gpu.Windows Version="1.24.4"

Time: 00:01:03.3999016 (hh:mm:ss, about 63.4 s)

DirectML Inference

Microsoft.ML.OnnxRuntime.DirectML Version="1.24.4"

Time: 00:00:11.1410425 (hh:mm:ss, about 11.1 s)

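With the CUDA execution provider, inference on these dynamic-shape ONNX models is very slow: about 63.4 s for the 60 images, roughly 5.7 times the 11.1 s that DirectML needs on the same machine with the same models and images.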

To reproduce

dotnet add package RapidOCRSharpOnnx
dotnet add package OpenCvSharp4.runtime.win
dotnet add package Microsoft.ML.OnnxRuntime.Gpu.Windows

CUDA Inference

        // _deviceId and ReceiveResult are members of the enclosing test class (not shown).
        private static void TestParallelBatch()
        {
            // PP-OCRv5 detection, recognition, and text-line orientation models.
            string detectPath = @"C:\deeplearning\gitCode\meloht\RapidOCRSharpOnnx\RapidOCRSharpOnnx.TestCommon\Models\ch_PP-OCRv5_det_mobile.onnx";
            string recogPath = @"C:\deeplearning\gitCode\meloht\RapidOCRSharpOnnx\RapidOCRSharpOnnx.TestCommon\Models\ch_PP-OCRv5_rec_mobile.onnx";
            string clsPath = @"C:\deeplearning\gitCode\meloht\RapidOCRSharpOnnx\RapidOCRSharpOnnx.TestCommon\Models\ch_PP-LCNet_x0_25_textline_ori_cls_mobile.onnx";

            // Run all three models on the CUDA execution provider.
            using RapidOCRSharp ocr = new RapidOCRSharp(new ExecutionProviderCUDA(new OcrConfig(detectPath, recogPath, LangRec.CH, OCRVersion.PPOCRV5, clsPath), _deviceId));
            var list = Directory.GetFiles(@"C:\FtpFiles\OCRTestImages");

            // Time the batch call over every image in the folder.
            // The call is not awaited here; if BatchParallelAsync returns a Task,
            // the stopwatch should stop only after that task completes.
            Stopwatch sw = Stopwatch.StartNew();
            var resPath = ocr.BatchParallelAsync(list.ToList(), receiveAction: ReceiveResult);
            sw.Stop();
            // sw.Elapsed is a TimeSpan and prints as hh:mm:ss.fffffff.
            Console.WriteLine($"BatchAsync Time: {sw.Elapsed}");

            Console.WriteLine("end");
        }

DirectML Inference

dotnet add package Microsoft.ML.OnnxRuntime.DirectML

        // _deviceId and ReceiveResult are members of the enclosing test class (not shown).
        private static void TestParallelBatch()
        {
            // Same PP-OCRv5 models as in the CUDA test.
            string detectPath = @"C:\deeplearning\gitCode\meloht\RapidOCRSharpOnnx\RapidOCRSharpOnnx.TestCommon\Models\ch_PP-OCRv5_det_mobile.onnx";
            string recogPath = @"C:\deeplearning\gitCode\meloht\RapidOCRSharpOnnx\RapidOCRSharpOnnx.TestCommon\Models\ch_PP-OCRv5_rec_mobile.onnx";
            string clsPath = @"C:\deeplearning\gitCode\meloht\RapidOCRSharpOnnx\RapidOCRSharpOnnx.TestCommon\Models\ch_PP-LCNet_x0_25_textline_ori_cls_mobile.onnx";

            // Only the execution provider differs from the CUDA test.
            using RapidOCRSharp ocr = new RapidOCRSharp(new ExecutionProviderDirectML(new OcrConfig(detectPath, recogPath, LangRec.CH, OCRVersion.PPOCRV5, clsPath), _deviceId));
            var list = Directory.GetFiles(@"C:\FtpFiles\OCRTestImages");

            // Time the batch call over every image in the folder.
            // The call is not awaited here; if BatchParallelAsync returns a Task,
            // the stopwatch should stop only after that task completes.
            Stopwatch sw = Stopwatch.StartNew();
            var resPath = ocr.BatchParallelAsync(list.ToList(), receiveAction: ReceiveResult);
            sw.Stop();
            // sw.Elapsed is a TimeSpan and prints as hh:mm:ss.fffffff.
            Console.WriteLine($"BatchAsync Time: {sw.Elapsed}");

            Console.WriteLine("end");
        }
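
A possible isolation step, sketched directly against ONNX Runtime rather than RapidOCRSharpOnnx. The CUDA execution provider documents `cudnn_conv_algo_search` with a default of `EXHAUSTIVE`, and with variable-width inputs that search may be repeated for each new input shape. Whether that explains the timings above has not been confirmed. The sketch runs one model at a few widths under both settings so that first-time and repeated runs can be compared. The class and method names (`CudaShapeProbe`, `MakeCudaOptions`, `TimeRun`), the device id 0, the widths, and the `[1, 3, 48, W]` float input shape (typical for a PP-OCR recognition model) are assumptions for illustration and should be adjusted to the model under test.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

static class CudaShapeProbe
{
    // Builds session options for the CUDA EP with an explicit cuDNN search mode.
    static SessionOptions MakeCudaOptions(int deviceId, string convAlgoSearch)
    {
        using var cuda = new OrtCUDAProviderOptions();
        cuda.UpdateOptions(new Dictionary<string, string>
        {
            ["device_id"] = deviceId.ToString(),
            ["cudnn_conv_algo_search"] = convAlgoSearch, // EXHAUSTIVE, HEURISTIC, or DEFAULT
        });
        return SessionOptions.MakeSessionOptionWithCudaProvider(cuda);
    }

    // Times a single Run call on a zero-filled input of the given width.
    // Assumption: the model has one float input shaped [1, 3, 48, width].
    static TimeSpan TimeRun(InferenceSession session, int width)
    {
        string inputName = session.InputMetadata.Keys.First();
        var tensor = new DenseTensor<float>(new[] { 1, 3, 48, width });
        var inputs = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor(inputName, tensor),
        };

        var sw = Stopwatch.StartNew();
        using var results = session.Run(inputs);
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main(string[] args)
    {
        string modelPath = args[0]; // path to an .onnx file, e.g. the recognition model

        foreach (string mode in new[] { "EXHAUSTIVE", "HEURISTIC" })
        {
            using SessionOptions options = MakeCudaOptions(0, mode);
            using var session = new InferenceSession(modelPath, options);

            // Each width appears twice: the first run sees a new shape, the second repeats it.
            foreach (int width in new[] { 320, 320, 480, 480, 640, 640 })
            {
                double ms = TimeRun(session, width).TotalMilliseconds;
                Console.WriteLine($"{mode} width={width}: {ms:F1} ms");
            }
        }
    }
}
```

If the first run at each new width is far slower than the repeat under `EXHAUSTIVE` but not under `HEURISTIC`, that would point at per-shape algorithm search rather than at the kernels themselves.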

Urgency

Platform

Windows

OS Version

Windows 11 Pro

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

Microsoft.ML.OnnxRuntime.Gpu.Windows 1.24.4

ONNX Runtime API

C#

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

Microsoft.ML.OnnxRuntime.Gpu.Windows 1.24.4

Model File

Models.zip

Is this a quantized model?

Unknown

Labels: .NET, api:CSharp, ep:CUDA, ep:DML, performance
