mirror of
https://github.com/mii443/usls.git
synced 2025-12-03 02:58:22 +00:00
usls
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models including YOLOv8 (Classification, Segmentation, Detection and Pose Detection), YOLOv9, RTDETR, CLIP, DINOv2, FastSAM, YOLO-World, BLIP, PaddleOCR and others. Many execution providers are supported, such as CUDA, TensorRT and CoreML.
Supported Models
| Model | Example | CUDA f32 | CUDA f16 | TensorRT f32 | TensorRT f16 |
|---|---|---|---|---|---|
| YOLOv8-detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv8-pose | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv8-classification | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv8-segmentation | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv8-OBB | TODO | TODO | TODO | TODO | TODO |
| YOLOv9 | demo | ✅ | ✅ | ✅ | ✅ |
| RT-DETR | demo | ✅ | ✅ | ✅ | ✅ |
| FastSAM | demo | ✅ | ✅ | ✅ | ✅ |
| YOLO-World | demo | ✅ | ✅ | ✅ | ✅ |
| DINOv2 | demo | ✅ | ✅ | ✅ | ✅ |
| CLIP | demo | ✅ | ✅ | ✅ visual ❌ textual | ✅ visual ❌ textual |
| BLIP | demo | ✅ | ✅ | ✅ visual ❌ textual | ✅ visual ❌ textual |
| DB(Text Detection) | demo | ✅ | ❌ | ✅ | ✅ |
| SVTR(Text Recognition) | demo | ✅ | ❌ | ✅ | ✅ |
Solution Models
This repo also provides solution models for tasks such as pedestrian fall detection, head detection, trash detection, and more.
| Model | Example |
|---|---|
| text detection (PPOCR-det v3, v4), general-purpose | demo |
| text recognition (PPOCR-rec v3, v4), Chinese and English | demo |
| face-landmark detection (face and keypoints) | demo |
| head detection | demo |
| fall detection | demo |
| trash detection | demo |
Demo
cargo run -r --example yolov8 # fastsam, yolov9, blip, clip, dinov2, yolo-world...
Integrate into your own project
1. Install ort
Check the ort guide.
For Linux or macOS users
- First, download the latest release from ONNXRuntime Releases
- Then link it by setting ORT_DYLIB_PATH
export ORT_DYLIB_PATH=/Users/qweasd/Desktop/onnxruntime-osx-arm64-1.17.1/lib/libonnxruntime.1.17.1.dylib
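Before building a model it can help to fail fast when the variable is missing or points to a file that does not exist. The following is a minimal sketch using only the standard library; `resolve_ort_dylib` and `ort_dylib_from_env` are hypothetical helper names and are not part of usls or ort.

```rust
use std::path::{Path, PathBuf};

/// Validate a candidate value for ORT_DYLIB_PATH.
/// Hypothetical helper for illustration; not part of usls or ort.
pub fn resolve_ort_dylib(value: Option<&str>) -> Result<PathBuf, String> {
    let raw = value.ok_or_else(|| "ORT_DYLIB_PATH is not set".to_string())?;
    if raw.trim().is_empty() {
        return Err("ORT_DYLIB_PATH is set but empty".to_string());
    }
    let path = Path::new(raw);
    // Only checks that a regular file exists; it does not try to load the library.
    if !path.is_file() {
        return Err(format!("ORT_DYLIB_PATH points to a missing file: {}", raw));
    }
    Ok(path.to_path_buf())
}

/// Read the variable from the current process environment and validate it.
pub fn ort_dylib_from_env() -> Result<PathBuf, String> {
    let value = std::env::var("ORT_DYLIB_PATH").ok();
    resolve_ort_dylib(value.as_deref())
}
```

Calling `ort_dylib_from_env()` at startup and printing the error gives a clearer message than a failure deep inside model construction.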
2. Add usls as a dependency to your project's Cargo.toml
cargo add --git https://github.com/jamjamjon/usls
3. Set Options and build model
let options = Options::default()
.with_model("../models/yolov8m-seg-dyn-f16.onnx");
let mut model = YOLO::new(&options)?;
- If you want to run your model with TensorRT or CoreML
let options = Options::default()
    .with_trt(0); // using CUDA by default
    // .with_coreml(0)
- If your model has dynamic shapes
let options = Options::default()
    .with_i00((1, 2, 4).into()) // dynamic batch
    .with_i02((416, 640, 800).into()) // dynamic height
    .with_i03((416, 640, 800).into()); // dynamic width
- If you want to set a confidence level for each category
let options = Options::default()
    .with_confs(&[0.4, 0.15]); // person: 0.4, others: 0.15
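The comment "person: 0.4, others: 0.15" suggests that a short list of thresholds is extended by reusing its last value for the remaining classes. The sketch below illustrates that reading with plain Rust. It is an assumption about the behaviour, not the usls implementation, and `conf_for_class` and `filter_by_conf` are hypothetical names.

```rust
/// Pick the confidence threshold for `class_id` from a short list.
/// Assumption: the last entry is reused for classes past the end of the list.
/// Returns None only when the list is empty.
pub fn conf_for_class(confs: &[f32], class_id: usize) -> Option<f32> {
    confs.get(class_id).or_else(|| confs.last()).copied()
}

/// Keep only (class_id, score) pairs whose score reaches the class threshold.
/// With an empty threshold list nothing is filtered out.
pub fn filter_by_conf(dets: &[(usize, f32)], confs: &[f32]) -> Vec<(usize, f32)> {
    dets.iter()
        .copied()
        .filter(|&(class_id, score)| {
            conf_for_class(confs, class_id).map_or(true, |t| score >= t)
        })
        .collect()
}
```

With `confs = [0.4, 0.15]`, class 0 is held to 0.4 while every other class falls back to 0.15.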
See Options for more model options.
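The three-value tuples passed for dynamic shapes read like a (min, opt, max) range per dimension, and the `model.batch.opt` field used later in this README points the same way. That interpretation is an assumption. The sketch below models such a range with a hypothetical `MinOptMax` stand-in (not the usls type) to show the ordering constraint and a bounds check.

```rust
/// Hypothetical stand-in for a (min, opt, max) dimension range; not the usls type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MinOptMax {
    pub min: usize,
    pub opt: usize,
    pub max: usize,
}

impl MinOptMax {
    /// Build a range, rejecting values that are not ordered min <= opt <= max.
    pub fn new(min: usize, opt: usize, max: usize) -> Result<Self, String> {
        if min <= opt && opt <= max {
            Ok(Self { min, opt, max })
        } else {
            Err(format!(
                "expected min <= opt <= max, got ({}, {}, {})",
                min, opt, max
            ))
        }
    }

    /// True when `value` lies inside the inclusive range [min, max].
    pub fn contains(&self, value: usize) -> bool {
        (self.min..=self.max).contains(&value)
    }
}
```

Under this reading, `(416, 640, 800)` would accept heights from 416 to 800 and treat 640 as the preferred size.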
4. Prepare inputs, and then you're ready to go
- Build DataLoader to load images
let dl = DataLoader::default()
.with_batch(model.batch.opt as usize)
.load("./assets/")?;
for (xs, _paths) in dl {
let _y = model.run(&xs)?;
}
- Or simply read one image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
let y = model.run(&x)?;
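The batched loop in the first option can be pictured as splitting the list of image paths into fixed-size groups, with a shorter final group when the count does not divide evenly. This is a rough sketch of that idea only; `batch_paths` is a hypothetical helper, and the real DataLoader may order, decode, or pad batches differently.

```rust
use std::path::PathBuf;

/// Split image paths into consecutive batches of at most `batch` items.
/// Hypothetical helper illustrating the batching idea; usls's DataLoader may differ.
pub fn batch_paths(paths: &[PathBuf], batch: usize) -> Vec<Vec<PathBuf>> {
    // `chunks` panics on a size of 0, so treat 0 as a batch size of 1.
    paths
        .chunks(batch.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

Five paths with a batch size of 2 yield three batches of sizes 2, 2 and 1.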
5. Annotate and save results
let annotator = Annotator::default().with_saveout("YOLOv8");
annotator.annotate(&x, &y);
Script: convert an ONNX model from float32 to float16
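import onnx
from pathlib import Path
from onnxconverter_common import float16

# Path to the float32 model to convert
model_f32 = "onnx_model.onnx"
# Load the model and cast its float32 tensors to float16
model_f16 = float16.convert_float_to_float16(onnx.load(model_f32))
# Write the result next to the input, e.g. onnx_model-f16.onnx
saveout = Path(model_f32).with_name(Path(model_f32).stem + "-f16.onnx")
onnx.save(model_f16, str(saveout))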