* Using Rayon to accelerate YOLO post-processing

* Refactor YOLO around a unified outputs format

* Optimize `conf * clss` computation for YOLOv5/v6/v7

* Add depth-anything-v2

* Update README.md

* Update CHANGELOG.md
Jamjamjon committed via GitHub on 2024-07-12 19:46:48 +08:00
parent 25d9088e2e
commit edc3a8897c
69 changed files with 1563 additions and 1203 deletions

CHANGELOG.md

@ -1,20 +1,46 @@
## v0.0.5 - 2024-07-12
### Changed
- Accelerated `YOLO`'s post-processing using `Rayon`. `YOLOv8-seg` post-processing now takes around **8ms (down from ~20ms)**, depending on your machine (a rough sketch of the idea follows this list). Note that this repo's implementation of `YOLOv8-Segment` saves not only the masks but also their contour points; the official `YOLOv8` Python version saves only the masks, which makes it appear much faster.
- Merged all `YOLOv8-related` solution models into YOLO examples.
- Consolidated all `YOLO-series` model examples into the YOLO example.
- Refactored the `YOLO` struct to unify all `YOLO versions` and `YOLO tasks`. It now supports user-defined YOLO models with different `Preds Tensor Formats`.
- Introduced a new `Nms` trait, combining `apply_bboxes_nms()` and `apply_mbrs_nms()` into `apply_nms()` (a rough sketch of such a trait closes this version's notes).
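A rough sketch of the Rayon-accelerated idea, not the repo's actual code; the tensor layout and names here are assumptions:

```Rust
use ndarray::{Array2, Axis};
use ndarray::parallel::prelude::*; // requires ndarray's "rayon" feature

/// For each anchor row of a [n_anchors, 4 + nc] tensor, pick the best class.
fn best_classes(preds: &Array2<f32>, conf: f32) -> Vec<(usize, f32)> {
    preds
        .axis_iter(Axis(0))
        .into_par_iter() // process anchors on the Rayon thread pool
        .filter_map(|row| {
            // columns 4.. are assumed to hold the per-class scores
            let (id, score) = row.iter().skip(4).enumerate().fold(
                (0usize, f32::MIN),
                |best, (i, &s)| if s > best.1 { (i, s) } else { best },
            );
            (score >= conf).then_some((id, score))
        })
        .collect()
}
```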
### Added
- Added support for `YOLOv6` and `YOLOv7`.
- Updated documentation for `y.rs`.
- Updated documentation for `bbox.rs`.
- Updated the `README.md`.
- Added `with_yolo_preds()` to `Options`.
- Added support for `Depth-Anything-v2`.
- Added `RTDETR` to the `YOLOVersion` struct.
### Removed
- Merged the following models' examples into the YOLO example: `yolov8-face`, `yolov8-falldown`, `yolov8-head`, `yolov8-trash`, `fastsam`, and `face-parsing`.
- Removed `anchors_first`, `conf_independent`, and their related methods from `Options`.
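The `Nms` trait introduced above might look roughly like this; an illustrative sketch only, the actual trait and signatures live in the repo:

```Rust
/// Anything with a confidence and a pairwise IoU can be suppressed generically.
pub trait Nms {
    fn confidence(&self) -> f32;
    fn iou(&self, other: &Self) -> f32;
}

/// One greedy routine instead of separate apply_bboxes_nms()/apply_mbrs_nms().
pub fn apply_nms<T: Nms>(mut items: Vec<T>, iou_threshold: f32) -> Vec<T> {
    // sort by confidence, descending
    items.sort_by(|a, b| b.confidence().total_cmp(&a.confidence()));
    let mut keep: Vec<T> = Vec::new();
    for item in items {
        // keep an item only if it overlaps no already-kept item too much
        if keep.iter().all(|k| k.iou(&item) < iou_threshold) {
            keep.push(item);
        }
    }
    keep
}
```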
## v0.0.4 - 2024-06-30
### Added
- Add X struct to handle input and preprocessing
- Add Ops struct to manage common operations
- Use SIMD (`fast_image_resize`) to accelerate model pre-processing and post-processing: YOLOv8-seg post-processing (~120ms => ~20ms), Depth-Anything post-processing (~23ms => ~2ms). A rough sketch of the resize path follows.
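A sketch of the SIMD resize path, assuming `fast_image_resize`'s v2-style API (names and signatures vary across crate versions):

```Rust
use std::num::NonZeroU32;
use fast_image_resize as fr;

/// Resize packed RGB pixels with a SIMD-accelerated convolution filter.
fn resize_rgb(src: Vec<u8>, w: u32, h: u32, dw: u32, dh: u32) -> anyhow::Result<Vec<u8>> {
    let src = fr::Image::from_vec_u8(
        NonZeroU32::new(w).unwrap(),
        NonZeroU32::new(h).unwrap(),
        src,
        fr::PixelType::U8x3,
    )?;
    let mut dst = fr::Image::new(
        NonZeroU32::new(dw).unwrap(),
        NonZeroU32::new(dh).unwrap(),
        fr::PixelType::U8x3,
    );
    let mut resizer = fr::Resizer::new(fr::ResizeAlg::Convolution(fr::FilterType::Bilinear));
    resizer.resize(&src.view(), &mut dst.view_mut())?;
    Ok(dst.into_vec())
}
```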
### Deprecated
- Mark `Ops::descale_mask()` as deprecated.
### Fixed
### Changed
### Removed
### Refactored

Cargo.toml

@ -1,6 +1,6 @@
[package]
name = "usls"
version = "0.0.4"
version = "0.0.5"
edition = "2021"
description = "A Rust library integrated with ONNXRuntime, providing a collection of ML models."
repository = "https://github.com/jamjamjon/usls"
@ -11,7 +11,7 @@ exclude = ["assets/*", "examples/*"]
[dependencies]
clap = { version = "4.2.4", features = ["derive"] }
ndarray = { version = "0.15.6" }
ndarray = { version = "0.15.6", features = ["rayon"] }
ort = { version = "2.0.0-rc.2", default-features = false, features = [
"load-dynamic",
"copy-dylibs",

README.md

@ -1,61 +1,54 @@
# usls
[![Static Badge](https://img.shields.io/crates/v/usls.svg?style=for-the-badge&logo=rust)](https://crates.io/crates/usls) [![Static Badge](https://img.shields.io/badge/Documents-usls-blue?style=for-the-badge&logo=docs.rs)](https://docs.rs/usls) [![Static Badge](https://img.shields.io/badge/GitHub-black?style=for-the-badge&logo=github)](https://github.com/jamjamjon/usls) ![Static Badge](https://img.shields.io/crates/d/usls?style=for-the-badge)
[![Static Badge](https://img.shields.io/crates/v/usls.svg?style=for-the-badge&logo=rust)](https://crates.io/crates/usls) ![Static Badge](https://img.shields.io/crates/d/usls?style=for-the-badge) [![Static Badge](https://img.shields.io/badge/Documents-usls-blue?style=for-the-badge&logo=docs.rs)](https://docs.rs/usls) [![Static Badge](https://img.shields.io/badge/GitHub-black?style=for-the-badge&logo=github)](https://github.com/jamjamjon/usls)
A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [YOLOv10](https://github.com/THU-MIG/yolov10), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [Depth-Anything](https://github.com/LiheYoung/Depth-Anything), [MODNet](https://github.com/ZHKKKe/MODNet) and others.
| Monocular Depth Estimation |
| :--------------------------------------------------------------: |
| <img src='examples/depth-anything/demo.png' width="800px"> |
| Depth-Anything |
| :----------------------------: |
|<img src='examples/depth-anything/demo.png' width="800px">|
| YOLOP-v2 | Text-Detection |
| :----------------------------: | :------------------------------: |
|<img src='examples/yolop/demo.png' width="385px">| <img src='examples/db/demo.png' width="385px"> |
| Portrait Matting |
| :----------------------------: |
|<img src='examples/modnet/demo.png' width="800px">|
| YOLOv8-Obb |
| :----------------------------: |
|<img src='examples/yolov8/demo-obb-2.png' width="800px">|
| Panoptic Driving Perception | Text-Detection-Recognition |
| :----------------------------------------------------: | :------------------------------------------------: |
| <img src='examples/yolop/demo.png' width="385px"> | <img src='examples/db/demo.png' width="385px"> |
| Portrait Matting |
| :------------------------------------------------------: |
| <img src='examples/modnet/demo.png' width="800px"> |
## Supported Models
| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :-------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| [YOLOv5](https://github.com/ultralytics/yolov5) | Classification<br />Object Detection<br />Instance Segmentation | [demo](examples/yolov5) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv8](https://github.com/ultralytics/ultralytics) | Object Detection<br />Instance Segmentation<br />Classification<br />Oriented Object Detection<br />Keypoint Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolov9) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection | [demo](examples/yolov10) | ✅ | ✅ | ✅ | ✅ |
| [RTDETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ | ✅ |
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/fastsam) | ✅ | ✅ | ✅ | ✅ |
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo-world) | ✅ | ✅ | ✅ | ✅ |
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | | | | ✅ |
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | | ✅ | | ✅ |
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | | |
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ |
| [Depth-Anything](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | ✅ | ✅ | ❌ | ❌ |
| [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | ✅ | ✅ | ✅ | ✅ |
| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------: | :--------------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| [YOLOv5](https://github.com/ultralytics/yolov5) | Classification<br />Object Detection<br />Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv6](https://github.com/meituan/YOLOv6) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv7](https://github.com/WongKinYiu/yolov7) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv8](https://github.com/ultralytics/ultralytics) | Object Detection<br />Instance Segmentation<br />Classification<br />Oriented Object Detection<br />Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [RTDETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | | |
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ |
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | ❌ | ❌ |
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ |
| [Depth-Anything<br />(v1, v2)](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | ✅ | ✅ | ❌ | ❌ |
| [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | ✅ | ✅ | ✅ | ✅ |
## Installation
Refer to **[ort guide](https://ort.pyke.io/setup/linking)**
Refer to [ort docs](https://ort.pyke.io/setup/linking)
<details close>
<summary>For Linux or macOS users</summary>
- Firstly, download from latest release from [ONNXRuntime Releases](https://github.com/microsoft/onnxruntime/releases)
- Download from [ONNXRuntime Releases](https://github.com/microsoft/onnxruntime/releases)
- Then link it:
```Shell
export ORT_DYLIB_PATH=/Users/qweasd/Desktop/onnxruntime-osx-arm64-1.17.1/lib/libonnxruntime.1.17.1.dylib
@ -63,18 +56,13 @@ Refer to **[ort guide](https://ort.pyke.io/setup/linking)**
</details>
## Demo
## Quick Start
```Shell
cargo run -r --example yolov8 # yolov10, blip, clip, yolop, svtr, db, yolo-world, ...
cargo run -r --example yolo # blip, clip, yolop, svtr, db, ...
```
## Integrate into your own project
<details close>
<summary>Expand</summary>
### 1. Add `usls` as a dependency to your project's `Cargo.toml`
@ -83,17 +71,18 @@ cargo add usls
```
Or you can use specific commit
```Shell
usls = { git = "https://github.com/jamjamjon/usls", rev = "???sha???"}
```
### 2. Set `Options` and build model
### 2. Build model
```Rust
let options = Options::default()
.with_model("../models/yolov8m-seg-dyn-f16.onnx");
.with_yolo_version(YOLOVersion::V5) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
.with_yolo_task(YOLOTask::Classify) // YOLOTask: Classify, Detect, Pose, Segment, Obb
.with_model("xxxx.onnx")?;
let mut model = YOLO::new(options)?;
```
@ -112,15 +101,15 @@ let mut model = YOLO::new(options)?;
.with_i02((416, 640, 800).into()) // dynamic height
.with_i03((416, 640, 800).into()) // dynamic width
```
- If you want to set a confidence level for each category
- If you want to set a confidence for each category
```Rust
let options = Options::default()
.with_confs(&[0.4, 0.15]) // class 0: 0.4, others: 0.15
.with_confs(&[0.4, 0.15]) // class_0: 0.4, others: 0.15
```
- Go check [Options](src/options.rs) for more model options.
- Go check [Options](src/core/options.rs) for more model options.
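Put together, a consolidated setup might look like this (the model path and shape ranges are placeholders):

```Rust
let options = Options::default()
    .with_model("yolov8m-seg-dyn.onnx")?
    .with_i00((1, 1, 4).into())       // dynamic batch: min=1, opt=1, max=4
    .with_i02((416, 640, 800).into()) // dynamic height
    .with_i03((416, 640, 800).into()) // dynamic width
    .with_confs(&[0.4, 0.15]);        // class_0: 0.4, others: 0.15
let mut model = YOLO::new(options)?;
```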
#### 3. Prepare inputs, and then you're ready to go
#### 3. Load images
- Build `DataLoader` to load images
@ -141,31 +130,19 @@ let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
let y = model.run(&x)?;
```
#### 4. Annotate and save results
#### 4. Annotate and save
```Rust
let annotator = Annotator::default().with_saveout("YOLOv8");
let annotator = Annotator::default().with_saveout("YOLO");
annotator.annotate(&x, &y);
```
#### 5. Get results
The inference outputs of provided models will be saved to `Vec<Y>`.
```Rust
pub struct Y {
probs: Option<Prob>,
bboxes: Option<Vec<Bbox>>,
keypoints: Option<Vec<Vec<Keypoint>>>,
mbrs: Option<Vec<Mbr>>,
polygons: Option<Vec<Polygon>>,
texts: Option<Vec<String>>,
masks: Option<Vec<Mask>>,
embedding: Option<Embedding>,
}
```
The inference outputs of provided models will be saved to `Vec<Y>`.
- You can get detection bboxes with `y.bboxes()`:
```Rust
let ys = model.run(&xs)?;
for y in ys {
@ -185,29 +162,5 @@ pub struct Y {
}
}
```
More `Bbox` methods here: `src/ys/bbox.rs`
- Results for other tasks can be found in `src/ys/y.rs`
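Filled out, such a loop might look like the sketch below; the `Bbox` accessor names are assumptions, check `src/ys/bbox.rs` for the real ones:

```Rust
let ys = model.run(&xs)?;
for y in &ys {
    if let Some(bboxes) = y.bboxes() {
        for b in bboxes {
            // hypothetical getters: id(), confidence(), xmin()..ymax()
            println!(
                "id: {}, conf: {:.3}, box: [{:.1}, {:.1}, {:.1}, {:.1}]",
                b.id(), b.confidence(), b.xmin(), b.ymin(), b.xmax(), b.ymax()
            );
        }
    }
}
```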
</details>
## Solution Models
<details close>
<summary>Additionally, this repo also provides some solution models.</summary>
| Model | Example | Result |
| :---------------------------------------------------------------------------------------------------------: | :------------------------------: | :-----------------------------------------------------------------------------: |
| Lane Line Segmentation<br /> Drivable Area Segmentation<br />Car Detection<br />车道线-可行驶区域-车辆检测 | [demo](examples/yolov8-plastic-bag) | <img src='examples/yolop/demo.png' width="220px" height="140px"> |
| Face Parsing<br /> 人脸解析 | [demo](examples/face-parsing) | <img src='examples/face-parsing/demo.png' width="220px" height="200px"> |
| Text Detection<br />(PPOCR-det v3, v4)<br />通用文本检测 | [demo](examples/db) | <img src='examples/db/demo.png' width="250px" height="200px"> |
| Text Recognition<br />(PPOCR-rec v3, v4)<br />中英文-文本识别 | [demo](examples/svtr) | |
| Face-Landmark Detection<br />人脸 & 关键点检测 | [demo](examples/yolov8-face) | <img src='examples/yolov8-face/demo.png' width="220px" height="180px"> |
| Head Detection<br /> 人头检测 | [demo](examples/yolov8-head) | <img src='examples/yolov8-head/demo.png' width="220px" height="180px"> |
| Fall Detection<br /> 摔倒检测 | [demo](examples/yolov8-falldown) | <img src='examples/yolov8-falldown/demo.png' width="220px" height="180px"> |
| Trash Detection<br /> 垃圾检测 | [demo](examples/yolov8-plastic-bag) | <img src='examples/yolov8-trash/demo.png' width="250px" height="180px"> |
</details>
- Other: [Docs](https://docs.rs/usls/latest/usls/struct.Y.html)


@ -9,6 +9,7 @@ cargo run -r --example depth-anything
- [depth-anything-s-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/depth-anything-s-dyn.onnx)
- [depth-anything-b-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/depth-anything-b-dyn.onnx)
- [depth-anything-l-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/depth-anything-l-dyn.onnx)
- [depth-anything-v2-s](https://github.com/jamjamjon/assets/releases/download/v0.0.1/depth-anything-v2-s.onnx)
## Results

examples/depth-anything/main.rs

@ -1,9 +1,10 @@
use usls::{models::DepthAnything, Annotator, DataLoader, Options};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// visual
// options
let options = Options::default()
.with_model("depth-anything-s-dyn.onnx")?
// .with_model("depth-anything-s-dyn.onnx")?
.with_model("depth-anything-v2-s.onnx")?
.with_i00((1, 1, 8).into())
.with_i02((384, 512, 1024).into())
.with_i03((384, 512, 1024).into());
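A hedged sketch of how the remaining steps of this example presumably look, following the pattern of the other demos (the `turbo` colormap is an assumption based on the `Annotator::with_colormap` API elsewhere in this commit):

```Rust
let mut model = DepthAnything::new(options)?;
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
let y = model.run(&x)?;
let annotator = Annotator::default()
    .with_colormap("turbo") // assumption: render depth as a color gradient
    .with_saveout("Depth-Anything");
annotator.annotate(&x, &y);
```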

examples/face-parsing/README.md (deleted)

@ -1,93 +0,0 @@
Using `YOLOv8-seg` model trained on `CelebAMask-HQ` for face-parsing.
## Quick Start
```shell
cargo run -r --example face-parsing
```
## Pretrained Model
- [face-parsing-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/face-parsing-dyn.onnx)
- [face-parsing-dyn-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/face-parsing-dyn-f16.onnx)
## Datasets
- [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ/tree/master/face_parsing)
## YOLO Labels
- [Download Processed YOLO labels](https://github.com/jamjamjon/assets/releases/download/v0.0.1/CelebAMask-HQ-YOLO-Labels.zip)
- Or you can run this Python script
```Python
import cv2
import numpy as np
from pathlib import Path
from tqdm import tqdm
mapping = {
'background': 0,
'skin': 1,
'nose': 2,
'eye_g': 3,
'l_eye': 4,
'r_eye': 5,
'l_brow': 6,
'r_brow': 7,
'l_ear': 8,
'r_ear': 9,
'mouth': 10,
'u_lip': 11,
'l_lip': 12,
'hair': 13,
'hat': 14,
'ear_r': 15,
'neck_l': 16,
'neck': 17,
'cloth': 18
}
def main():
saveout_dir = Path("labels")
if not saveout_dir.exists():
saveout_dir.mkdir()
else:
import shutil
shutil.rmtree(saveout_dir)
saveout_dir.mkdir()
image_list = [x for x in Path("CelebAMask-HQ-mask-anno/").rglob("*.png")]
for image_path in tqdm(image_list, total=len(image_list)):
image_gray = cv2.imread(str(image_path), cv2.IMREAD_GRAYSCALE)
stem = image_path.stem
name, cls_ = stem.split("_", 1)
segments = cv2.findContours(image_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[0]
saveout = saveout_dir / f"{int(name)}.txt"
with open(saveout, 'a+') as f:
for segment in segments:
line = f"{mapping[cls_]}"
segment = segment / 512
for seg in segment:
xn, yn = seg[0]
line += f" {xn} {yn}"
f.write(line + "\n")
if __name__ == "__main__":
main()
```
## Results
![](./demo.png)

examples/face-parsing/main.rs (deleted)

@ -1,32 +0,0 @@
use usls::{models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("face-parsing-dyn.onnx")?
.with_i00((1, 1, 4).into())
.with_i02((416, 640, 800).into())
.with_i03((416, 640, 800).into())
// .with_trt(0)
// .with_fp16(true)
.with_confs(&[0.5]);
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/nini.png")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default()
.without_bboxes(true)
.without_bboxes_conf(true)
.without_bboxes_name(true)
.without_contours(false)
.with_polygons_name(false)
.with_saveout("Face-Parsing");
annotator.annotate(&x, &y);
Ok(())
}

examples/fastsam/README.md (deleted)

@ -1,23 +0,0 @@
## Quick Start
```shell
cargo run -r --example fastsam
```
## Download or export ONNX Model
- **Export**
```bash
pip install -U ultralytics
yolo export model=FastSAM-s.pt format=onnx simplify dynamic
```
- **Download**
[FastSAM-s-dyn-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/FastSAM-s-dyn-f16.onnx)
## Results
![](./demo.png)

examples/fastsam/main.rs (deleted)

@ -1,24 +0,0 @@
use usls::{models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("FastSAM-s-dyn-f16.onnx")?
.with_i00((1, 1, 4).into())
.with_i02((416, 640, 800).into())
.with_i03((416, 640, 800).into())
.with_confs(&[0.4]);
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("FastSAM");
annotator.annotate(&x, &y);
Ok(())
}

examples/rtdetr/README.md (deleted)

@ -1,21 +0,0 @@
## Quick Start
```shell
cargo run -r --example rtdetr
```
## Download or export ONNX Model
- Export
```bash
pip install -U ultralytics
yolo export model=rtdetr-l.pt format=onnx simplify dynamic opset=16
```
- Download
[rtdetr-l-f16 model](https://github.com/jamjamjon/assets/releases/download/v0.0.1/rtdetr-l-f16.onnx)
## Results
![](./demo.png)

examples/rtdetr/main.rs (deleted)

@ -1,22 +0,0 @@
use usls::{coco, models::RTDETR, Annotator, DataLoader, Options};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("rtdetr-l-f16.onnx")?
.with_confs(&[0.4, 0.15])
.with_names(&coco::NAMES_80);
let mut model = RTDETR::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("RT-DETR");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolo-world/README.md (deleted)

@ -1,43 +0,0 @@
## Quick Start
```shell
cargo run -r --example yolo-world
```
## Download or Export ONNX Model
- **Download**
[yolov8s-world-v2-shoes](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8s-world-v2-shoes.onnx)
- **Or generate your own `yolo-world` model and then Export**
- **Installation**
```shell
pip install -U ultralytics
```
- **Generate**
```python
from ultralytics import YOLO
# Initialize a YOLO-World model
model = YOLO('yolov8m-worldv2.pt')
# Define custom classes
model.set_classes(["shoes"])
# Save the model with the defined offline vocabulary
model.save("custom_yolov8m-world-v2.pt")
```
- **Export**
```shell
yolo export model=custom_yolov8m-world-v2.pt format=onnx simplify dynamic
```
## Results
![](./demo.png)

examples/yolo-world/main.rs (deleted)

@ -1,25 +0,0 @@
use usls::{models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("yolov8s-world-v2-shoes.onnx")?
.with_i00((1, 1, 4).into())
.with_i02((416, 640, 800).into())
.with_i03((416, 640, 800).into())
.with_confs(&[0.3])
.with_profile(false);
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLO-World");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolo/README.md (new file)

@ -0,0 +1,181 @@
<h1 align='center'>YOLO-Series</h1>
| Detection | Instance Segmentation | Pose |
| :---------------: | :------------------------: |:---------------: |
| <img src='./demos/det.png' width="300px"> | <img src='./demos/seg.png' width="300px"> |<img src='./demos/pose.png' width="300px"> |
| Classification | Obb |
| :------------------------: |:------------------------: |
|<img src='./demos/cls.png' width="300px"> |<img src='./demos/obb-2.png' width="628px">
| Head Detection | Fall Detection | Trash Detection |
| :------------------------: |:------------------------: |:------------------------: |
|<img src='./demos/head.png' width="300px"> |<img src='./demos/falldown.png' width="300px">|<img src='./demos/trash.png' width="300px">
| YOLO-World | Face Parsing | FastSAM |
| :------------------------: |:------------------------: |:------------------------: |
|<img src='./demos/yolov8-world.png' width="300px"> |<img src='./demos/face-parsing.png' width="300px">|<img src='./demos/fastsam.png' width="300px">
## Quick Start
```Shell
# Classify
cargo run -r --example yolo -- --task classify --version v5 # YOLOv5
cargo run -r --example yolo -- --task classify --version v8 # YOLOv8
# Detect
cargo run -r --example yolo -- --task detect --version v5 # YOLOv5
cargo run -r --example yolo -- --task detect --version v6 # YOLOv6
cargo run -r --example yolo -- --task detect --version v7 # YOLOv7
cargo run -r --example yolo -- --task detect --version v8 # YOLOv8
cargo run -r --example yolo -- --task detect --version v9 # YOLOv9
cargo run -r --example yolo -- --task detect --version v10 # YOLOv10
cargo run -r --example yolo -- --task detect --version rtdetr # YOLOv8-RTDETR
cargo run -r --example yolo -- --task detect --version v8 --model yolov8s-world-v2-shoes.onnx # YOLOv8-world
# Pose
cargo run -r --example yolo -- --task pose --version v8 # YOLOv8-Pose
# Segment
cargo run -r --example yolo -- --task segment --version v5 # YOLOv5-Segment
cargo run -r --example yolo -- --task segment --version v8 # YOLOv8-Segment
cargo run -r --example yolo -- --task segment --version v8 --model FastSAM-s-dyn-f16.onnx # FastSAM
# Obb
cargo run -r --example yolo -- --task obb --version v8 # YOLOv8-Obb
```
<details close>
<summary>other options</summary>
`--source` to specify the input images
`--model` to specify the ONNX model
`--width --height` to specify the input resolution
`--nc` to specify the number of model's classes
`--plot` to annotate with inference results
`--profile` to profile
`--cuda --trt --coreml --device_id` to select device
`--half` to use float16 when using TensorRT EP
</details>
## YOLOs configs with `Options`
<details open>
<summary>Use official YOLO Models</summary>
```Rust
let options = Options::default()
.with_yolo_version(YOLOVersion::V5) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
.with_yolo_task(YOLOTask::Classify) // YOLOTask: Classify, Detect, Pose, Segment, Obb
.with_model("xxxx.onnx")?;
```
</details>
<details open>
<summary>Customize your own YOLO model</summary>
```Rust
// This config is for YOLOv8-Segment
use usls::{AnchorsPosition, BoxType, ClssType, YOLOPreds};
let options = Options::default()
.with_yolo_preds(
YOLOPreds {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::Clss,
coefs: Some(true),
anchors: Some(AnchorsPosition::After),
..Default::default()
}
)
.with_model("xxxx.onnx")?;
```
</details>
## Other YOLOv8 Solution Models
| Model | Weights | Datasets|
|:---------------------: | :--------------------------: | :-------------------------------: |
| Face-Landmark Detection | [yolov8-face-dyn-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-face-dyn-f16.onnx) | |
| Head Detection | [yolov8-head-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-head-f16.onnx) | |
| Fall Detection | [yolov8-falldown-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-falldown-f16.onnx) | |
| Trash Detection | [yolov8-plastic-bag-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-plastic-bag-f16.onnx) | |
| FaceParsing | [face-parsing-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/face-parsing-dyn.onnx) | [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ/tree/master/face_parsing)<br />[[Processed YOLO labels]](https://github.com/jamjamjon/assets/releases/download/v0.0.1/CelebAMask-HQ-YOLO-Labels.zip)[[Python Script]](https://github.com/jamjamjon/assets/releases/download/v0.0.1/CelebAMask-HQ-YOLO-Labels.zip) |
## Export ONNX Models
<details close>
<summary>YOLOv5</summary>
[Here](https://docs.ultralytics.com/yolov5/tutorials/model_export/)
</details>
<details close>
<summary>YOLOv6</summary>
[Here](https://github.com/meituan/YOLOv6/tree/main/deploy/ONNX)
</details>
<details close>
<summary>YOLOv7</summary>
[Here](https://github.com/WongKinYiu/yolov7?tab=readme-ov-file#export)
</details>
<details close>
<summary>YOLOv8</summary>
```Shell
pip install -U ultralytics
# export onnx model with dynamic shapes
yolo export model=yolov8m.pt format=onnx simplify dynamic
yolo export model=yolov8m-cls.pt format=onnx simplify dynamic
yolo export model=yolov8m-pose.pt format=onnx simplify dynamic
yolo export model=yolov8m-seg.pt format=onnx simplify dynamic
yolo export model=yolov8m-obb.pt format=onnx simplify dynamic
# export onnx model with fixed shapes
yolo export model=yolov8m.pt format=onnx simplify
yolo export model=yolov8m-cls.pt format=onnx simplify
yolo export model=yolov8m-pose.pt format=onnx simplify
yolo export model=yolov8m-seg.pt format=onnx simplify
yolo export model=yolov8m-obb.pt format=onnx simplify
```
</details>
<details close>
<summary>YOLOv9</summary>
[Here](https://github.com/WongKinYiu/yolov9/blob/main/export.py)
</details>
<details close>
<summary>YOLOv10</summary>
[Here](https://github.com/THU-MIG/yolov10#export)
</details>

examples/yolo/main.rs (new file)

@ -0,0 +1,180 @@
use anyhow::Result;
use clap::Parser;
use usls::{coco, models::YOLO, Annotator, DataLoader, Options, Vision, YOLOTask, YOLOVersion};
#[derive(Parser, Clone)]
#[command(author, version, about, long_about = None)]
pub struct Args {
#[arg(long)]
pub model: Option<String>,
#[arg(long, default_value_t = String::from("./assets/bus.jpg"))]
pub source: String,
#[arg(long, value_enum, default_value_t = YOLOTask::Detect)]
pub task: YOLOTask,
#[arg(long, value_enum, default_value_t = YOLOVersion::V8)]
pub version: YOLOVersion,
#[arg(long, default_value_t = 224)]
pub width_min: isize,
#[arg(long, default_value_t = 640)]
pub width: isize,
#[arg(long, default_value_t = 800)]
pub width_max: isize,
#[arg(long, default_value_t = 224)]
pub height_min: isize,
#[arg(long, default_value_t = 640)]
pub height: isize,
#[arg(long, default_value_t = 800)]
pub height_max: isize,
#[arg(long, default_value_t = 80)]
pub nc: usize,
#[arg(long)]
pub trt: bool,
#[arg(long)]
pub cuda: bool,
#[arg(long)]
pub half: bool,
#[arg(long)]
pub coreml: bool,
#[arg(long, default_value_t = 0)]
pub device_id: usize,
#[arg(long)]
pub profile: bool,
#[arg(long)]
pub no_plot: bool,
}
fn main() -> Result<()> {
let args = Args::parse();
// build options
let options = Options::default();
// version & task
let options =
match args.version {
YOLOVersion::V5 => {
match args.task {
YOLOTask::Classify => options
.with_model(&args.model.unwrap_or("yolov5n-cls-dyn.onnx".to_string()))?,
YOLOTask::Detect => {
options.with_model(&args.model.unwrap_or("yolov5n-dyn.onnx".to_string()))?
}
YOLOTask::Segment => options
.with_model(&args.model.unwrap_or("yolov5n-seg-dyn.onnx".to_string()))?,
t => anyhow::bail!("Task: {t:?} is unsupported for {:?}", args.version),
}
}
YOLOVersion::V6 => match args.task {
YOLOTask::Detect => options
.with_model(&args.model.unwrap_or("yolov6n-dyn.onnx".to_string()))?
.with_nc(args.nc),
t => anyhow::bail!("Task: {t:?} is unsupported for {:?}", args.version),
},
YOLOVersion::V7 => match args.task {
YOLOTask::Detect => options
.with_model(&args.model.unwrap_or("yolov7-tiny-dyn.onnx".to_string()))?
.with_nc(args.nc),
t => anyhow::bail!("Task: {t:?} is unsupported for {:?}", args.version),
},
YOLOVersion::V8 => {
match args.task {
YOLOTask::Classify => options
.with_model(&args.model.unwrap_or("yolov8m-cls-dyn.onnx".to_string()))?,
YOLOTask::Detect => {
options.with_model(&args.model.unwrap_or("yolov8m-dyn.onnx".to_string()))?
}
YOLOTask::Segment => options
.with_model(&args.model.unwrap_or("yolov8m-seg-dyn.onnx".to_string()))?,
YOLOTask::Pose => options
.with_model(&args.model.unwrap_or("yolov8m-pose-dyn.onnx".to_string()))?,
YOLOTask::Obb => options
.with_model(&args.model.unwrap_or("yolov8m-obb-dyn.onnx".to_string()))?,
}
}
YOLOVersion::V9 => match args.task {
YOLOTask::Detect => options
.with_model(&args.model.unwrap_or("yolov9-c-dyn-f16.onnx".to_string()))?,
t => anyhow::bail!("Task: {t:?} is unsupported for {:?}", args.version),
},
YOLOVersion::V10 => match args.task {
YOLOTask::Detect => {
options.with_model(&args.model.unwrap_or("yolov10n-dyn.onnx".to_string()))?
}
t => anyhow::bail!("Task: {t:?} is unsupported for {:?}", args.version),
},
YOLOVersion::RTDETR => match args.task {
YOLOTask::Detect => {
options.with_model(&args.model.unwrap_or("rtdetr-l-f16.onnx".to_string()))?
}
t => anyhow::bail!("Task: {t:?} is unsupported for {:?}", args.version),
},
}
.with_yolo_version(args.version)
.with_yolo_task(args.task);
// device
let options = if args.cuda {
options.with_cuda(args.device_id)
} else if args.trt {
let options = options.with_trt(args.device_id);
if args.half {
options.with_fp16(true)
} else {
options
}
} else if args.coreml {
options.with_coreml(args.device_id)
} else {
options.with_cpu()
};
let options = options
.with_i00((1, 1, 4).into())
.with_i02((args.height_min, args.height, args.height_max).into())
.with_i03((args.width_min, args.width, args.width_max).into())
.with_confs(&[0.2, 0.15]) // class_0: 0.2, others: 0.15
// .with_names(&coco::NAMES_80)
.with_names2(&coco::KEYPOINTS_NAMES_17)
.with_profile(args.profile);
let mut model = YOLO::new(options)?;
// build dataloader
let dl = DataLoader::default()
.with_batch(model.batch() as _)
.load(args.source)?;
// build annotator
let annotator = Annotator::default()
.with_skeletons(&coco::SKELETONS_16)
.with_bboxes_thickness(7)
.without_masks(true) // No masks plotting.
.with_saveout("YOLO-Series");
// run & annotate
for (xs, _paths) in dl {
// let ys = model.run(&xs)?; // way one
let ys = model.forward(&xs, args.profile)?; // way two
if !args.no_plot {
annotator.annotate(&xs, &ys);
}
}
Ok(())
}

examples/yolov10/README.md (deleted)

@ -1,26 +0,0 @@
## Quick Start
```shell
cargo run -r --example yolov10
```
## Export ONNX Model
- **Export**
```shell
# clone repo and install dependencies
git clone https://github.com/THU-MIG/yolov10.git
cd yolov10
pip install -r requirements.txt
# download `pt` weights
wget https://github.com/THU-MIG/yolov10/releases/download/v1.1/yolov10n.pt
# export ONNX model
yolo export model=yolov10n.pt format=onnx opset=13 simplify dynamic
```
## Results
![](./demo.png)

examples/yolov10/main.rs (deleted)

@ -1,28 +0,0 @@
use usls::{
models::{YOLOVersion, YOLO},
Annotator, DataLoader, Options, Vision,
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("yolov10n-dyn.onnx")?
.with_yolo_version(YOLOVersion::V10)
.with_i00((1, 1, 4).into())
.with_i02((416, 640, 800).into())
.with_i03((416, 640, 800).into())
.with_confs(&[0.4, 0.15]);
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLOv10");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolov5/main.rs (deleted)

@ -1,30 +0,0 @@
use usls::{
models::{YOLOTask, YOLOVersion, YOLO},
Annotator, DataLoader, Options, Vision,
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_yolo_version(YOLOVersion::V5)
.with_model("../models/yolov5s-seg.onnx")?
.with_yolo_task(YOLOTask::Segment)
// .with_trt(0)
// .with_fp16(true)
.with_i00((1, 1, 4).into())
.with_i02((224, 640, 800).into())
.with_i03((224, 640, 800).into());
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLOv5");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolov8-face/README.md (deleted)

@ -1,13 +0,0 @@
## Quick Start
```shell
cargo run -r --example yolov8-face
```
## ONNX Model
- [yolov8-face-dyn-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-face-dyn-f16.onnx)
## Results
![](./demo.png)

examples/yolov8-face/main.rs (deleted)

@ -1,24 +0,0 @@
use usls::{models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("yolov8n-face-dyn-f16.onnx")?
.with_i00((1, 1, 4).into())
.with_i02((416, 640, 800).into())
.with_i03((416, 640, 800).into())
.with_confs(&[0.15]);
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/kids.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLOv8-Face");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolov8-falldown/README.md (deleted)

@ -1,14 +0,0 @@
## Quick Start
```shell
cargo run -r --example yolov8-falldown
```
## ONNX Model
- [yolov8-falldown-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-falldown-f16.onnx)
## Results
![](./demo.png)

examples/yolov8-falldown/main.rs (deleted)

@ -1,19 +0,0 @@
use usls::{models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default().with_model("yolov8-falldown-f16.onnx")?;
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/falldown.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLOv8-Falldown");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolov8-head/README.md (deleted)

@ -1,14 +0,0 @@
## Quick Start
```shell
cargo run -r --example yolov8-head
```
## ONNX Model
- [yolov8-head-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-head-f16.onnx)
## Results
![](./demo.png)

examples/yolov8-head/main.rs (deleted)

@ -1,19 +0,0 @@
use usls::{models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default().with_model("yolov8-head-f16.onnx")?;
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/kids.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLOv8-Head");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolov8-trash/README.md (deleted)

@ -1,16 +0,0 @@
Model for detecting plastic bags.
## Quick Start
```shell
cargo run -r --example yolov8-trash
```
## ONNX Model
- [yolov8-plastic-bag-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-plastic-bag-f16.onnx)
## Results
![](./demo.png)

examples/yolov8-trash/main.rs (deleted)

@ -1,21 +0,0 @@
use usls::{models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1.build model
let options = Options::default()
.with_model("yolov8-plastic-bag-f16.onnx")?
.with_names(&["trash"]);
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/trash.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLOv8-Trash");
annotator.annotate(&x, &y);
Ok(())
}

examples/yolov8/README.md (deleted)

@ -1,35 +0,0 @@
## Quick Start
```shell
cargo run -r --example yolov8
```
## Export `YOLOv8` ONNX Models
```bash
pip install -U ultralytics
# export onnx model with dynamic shapes
yolo export model=yolov8m.pt format=onnx simplify dynamic
yolo export model=yolov8m-cls.pt format=onnx simplify dynamic
yolo export model=yolov8m-pose.pt format=onnx simplify dynamic
yolo export model=yolov8m-seg.pt format=onnx simplify dynamic
yolo export model=yolov8m-obb.pt format=onnx simplify dynamic
# export onnx model with fixed shapes
yolo export model=yolov8m.pt format=onnx simplify
yolo export model=yolov8m-cls.pt format=onnx simplify
yolo export model=yolov8m-pose.pt format=onnx simplify
yolo export model=yolov8m-seg.pt format=onnx simplify
yolo export model=yolov8m-obb.pt format=onnx simplify
```
## Result
| Task | Annotated image |
| :-------------------: | --------------------- |
| Obb | ![img](./demo-obb.png) |
| Instance Segmentation | ![img](./demo-seg.png) |
| Classification | ![img](./demo-cls.png) |
| Detection | ![img](./demo-det.png) |
| Pose | ![img](./demo-pose.png) |

examples/yolov8/main.rs (deleted)

@ -1,45 +0,0 @@
use usls::{coco, models::YOLO, Annotator, DataLoader, Options, Vision};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
// .with_model("yolov8m-dyn.onnx")?
// .with_model("yolov8m-dyn-f16.onnx")?
// .with_model("yolov8m-pose-dyn.onnx")?
// .with_model("yolov8m-cls-dyn.onnx")?
.with_model("yolov8m-seg-dyn.onnx")?
// .with_model("yolov8m-obb-dyn.onnx")?
// .with_model("yolov8m-oiv7-dyn.onnx")?
// .with_trt(0)
// .with_fp16(true)
// .with_coreml(0)
// .with_cuda(3)
.with_i00((1, 1, 4).into())
.with_i02((224, 640, 800).into())
.with_i03((224, 640, 800).into())
.with_confs(&[0.4, 0.15]) // class 0: 0.4, others: 0.15
.with_names2(&coco::KEYPOINTS_NAMES_17)
.with_profile(false);
let mut model = YOLO::new(options)?;
// build dataloader
let dl = DataLoader::default()
.with_batch(1)
.load("./assets/bus.jpg")?;
// .load("./assets/dota.png")?;
// build annotate
let annotator = Annotator::default()
.with_skeletons(&coco::SKELETONS_16)
.with_bboxes_thickness(7)
.with_saveout("YOLOv8");
// run & annotate
for (xs, _paths) in dl {
let ys = model.run(&xs)?;
// let ys = model.forward(&xs, true)?;
annotator.annotate(&xs, &ys);
}
Ok(())
}

examples/yolov9/README.md (deleted)

@ -1,29 +0,0 @@
## Quick Start
```shell
cargo run -r --example yolov9
```
## Download or Export ONNX Model
- **Download**
[yolov9-c-dyn-fp16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov9-c-dyn-f16.onnx)
- **Export**
```shell
# clone repo and install dependencies
git clone https://github.com/WongKinYiu/yolov9.git
cd yolov9
pip install -r requirements.txt
# download `pt` weights
wget https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-c.pt
# export ONNX model
python export.py --weights yolov9-c.pt --include onnx --simplify --dynamic
```
## Results
![](./demo.png)

examples/yolov9/main.rs (deleted)

@ -1,28 +0,0 @@
use usls::{
models::{YOLOVersion, YOLO},
Annotator, DataLoader, Options, Vision,
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("../models/yolov9-c.onnx")?
.with_yolo_version(YOLOVersion::V9)
.with_i00((1, 1, 4).into())
.with_i02((416, 640, 800).into())
.with_i03((416, 640, 800).into())
.with_confs(&[0.4, 0.15]);
let mut model = YOLO::new(options)?;
// load image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
// run
let y = model.run(&x)?;
// annotate
let annotator = Annotator::default().with_saveout("YOLOv9");
annotator.annotate(&x, &y);
Ok(())
}

(new Python script: CelebAMask-HQ to YOLO labels)

@ -0,0 +1,63 @@
import cv2
import numpy as np
from pathlib import Path
from tqdm import tqdm
mapping = {
'background': 0,
'skin': 1,
'nose': 2,
'eye_g': 3,
'l_eye': 4,
'r_eye': 5,
'l_brow': 6,
'r_brow': 7,
'l_ear': 8,
'r_ear': 9,
'mouth': 10,
'u_lip': 11,
'l_lip': 12,
'hair': 13,
'hat': 14,
'ear_r': 15,
'neck_l': 16,
'neck': 17,
'cloth': 18
}
def main():
saveout_dir = Path("labels")
if not saveout_dir.exists():
saveout_dir.mkdir()
else:
import shutil
shutil.rmtree(saveout_dir)
saveout_dir.mkdir()
image_list = [x for x in Path("CelebAMask-HQ-mask-anno/").rglob("*.png")]
for image_path in tqdm(image_list, total=len(image_list)):
image_gray = cv2.imread(str(image_path), cv2.IMREAD_GRAYSCALE)
stem = image_path.stem
name, cls_ = stem.split("_", 1)
segments = cv2.findContours(image_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[0]
saveout = saveout_dir / f"{int(name)}.txt"
with open(saveout, 'a+') as f:
for segment in segments:
line = f"{mapping[cls_]}"
segment = segment / 512
for seg in segment:
xn, yn = seg[0]
line += f" {xn} {yn}"
f.write(line + "\n")
if __name__ == "__main__":
main()


@ -226,11 +226,13 @@ impl Annotator {
self
}
/// Plotting polygons' areas or not
pub fn without_polygons(mut self, x: bool) -> Self {
self.without_polygons = x;
self
}
/// Plotting polygons' contours or not
pub fn without_contours(mut self, x: bool) -> Self {
self.without_contours = x;
self
@ -251,6 +253,12 @@ impl Annotator {
self
}
/// Plotting masks or not
pub fn without_masks(mut self, x: bool) -> Self {
self.without_masks = x;
self
}
pub fn with_colormap(mut self, x: &str) -> Self {
let x = match x {
"turbo" | "Turbo" | "TURBO" => colormap256::TURBO,
@ -328,41 +336,41 @@ impl Annotator {
// polygons
if !self.without_polygons {
if let Some(xs) = &y.polygons() {
self.plot_polygons(&mut img_rgba, xs)
self.plot_polygons(&mut img_rgba, xs);
}
}
// masks
if !self.without_masks {
if let Some(xs) = &y.masks() {
self.plot_masks(&mut img_rgba, xs);
}
}
// bboxes
if !self.without_bboxes {
if let Some(xs) = &y.bboxes() {
self.plot_bboxes(&mut img_rgba, xs)
self.plot_bboxes(&mut img_rgba, xs);
}
}
// mbrs
if !self.without_mbrs {
if let Some(xs) = &y.mbrs() {
self.plot_mbrs(&mut img_rgba, xs)
self.plot_mbrs(&mut img_rgba, xs);
}
}
// keypoints
if !self.without_keypoints {
if let Some(xs) = &y.keypoints() {
self.plot_keypoints(&mut img_rgba, xs)
self.plot_keypoints(&mut img_rgba, xs);
}
}
// probs
if let Some(xs) = &y.probs() {
self.plot_probs(&mut img_rgba, xs)
}
// masks
if !self.without_masks {
if let Some(xs) = &y.masks() {
self.plot_masks(&mut img_rgba, xs)
}
self.plot_probs(&mut img_rgba, xs);
}
// save
@ -618,7 +626,7 @@ impl Annotator {
});
image::DynamicImage::from(luma)
} else {
mask.mask().to_owned()
image::DynamicImage::from(mask.mask().to_owned())
};
let luma = luma.resize_exact(
w / scale,

src/core/options.rs

@ -4,7 +4,7 @@ use anyhow::Result;
use crate::{
auto_load,
models::{YOLOTask, YOLOVersion},
models::{YOLOPreds, YOLOTask, YOLOVersion},
Device, MinOptMax,
};
@ -51,8 +51,7 @@ pub struct Options {
pub nm: Option<usize>,
pub confs: Vec<f32>,
pub kconfs: Vec<f32>,
pub iou: f32,
pub apply_nms: bool,
pub iou: Option<f32>,
pub tokenizer: Option<String>,
pub vocab: Option<String>,
pub names: Option<Vec<String>>, // names
@ -63,9 +62,7 @@ pub struct Options {
pub unclip_ratio: f32, // DB
pub yolo_task: Option<YOLOTask>,
pub yolo_version: Option<YOLOVersion>,
pub anchors_first: bool, // yolo model output format like: [batch_size, anchors, xywh_clss_xxx]
pub conf_independent: bool, // xywh_conf_clss
pub apply_probs_softmax: bool,
pub yolo_preds: Option<YOLOPreds>,
}
impl Default for Options {
@ -107,8 +104,7 @@ impl Default for Options {
nm: None,
confs: vec![0.4f32],
kconfs: vec![0.5f32],
iou: 0.45f32,
apply_nms: true,
iou: None,
tokenizer: None,
vocab: None,
names: None,
@ -119,9 +115,7 @@ impl Default for Options {
unclip_ratio: 1.5,
yolo_task: None,
yolo_version: None,
anchors_first: false,
conf_independent: false,
apply_probs_softmax: false,
yolo_preds: None,
}
}
}
@ -172,16 +166,6 @@ impl Options {
self
}
pub fn with_conf_independent(mut self, x: bool) -> Self {
self.conf_independent = x;
self
}
pub fn apply_probs_softmax(mut self, x: bool) -> Self {
self.apply_probs_softmax = x;
self
}
pub fn with_profile(mut self, profile: bool) -> Self {
self.profile = profile;
self
@ -227,13 +211,8 @@ impl Options {
self
}
pub fn with_anchors_first(mut self, x: bool) -> Self {
self.anchors_first = x;
self
}
pub fn with_nms(mut self, apply_nms: bool) -> Self {
self.apply_nms = apply_nms;
pub fn with_yolo_preds(mut self, x: YOLOPreds) -> Self {
self.yolo_preds = Some(x);
self
}
@ -248,7 +227,7 @@ impl Options {
}
pub fn with_iou(mut self, x: f32) -> Self {
self.iou = x;
self.iou = Some(x);
self
}

src/lib.rs

@ -7,24 +7,29 @@
//!
//! # Supported models
//! | Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
//! | :---------------------------------------------------------------: | :-------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
//! | [YOLOv5](https://github.com/ultralytics/yolov5) | Object Detection<br />Instance Segmentation<br />Classification | [demo](examples/yolov5) | ✅ | ✅ | ✅ | ✅ |
//! | [YOLOv8-obb](https://github.com/ultralytics/ultralytics) | Object Detection<br />Instance Segmentation<br />Classification<br />Oriented Object Detection<br />Keypoint Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
//! | [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolov9) | ✅ | ✅ | ✅ | ✅ |
//! | [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection | [demo](examples/yolov10) | ✅ | ✅ | ✅ | ✅ |
//! | [RT-DETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ | ✅ |
//! | [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/fastsam) | | ✅ | ✅ | ✅ |
//! | [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo-world) | ✅ | ✅ | ✅ | ✅ |
//! | [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
//! | [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
//! | [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
//! | [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | ✅ | ✅ |
//! | [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ |
//! | [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | | ✅ | ❌ | ❌ |
//! | [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ |
//! | [Depth-Anything](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | ✅ | ✅ | ❌ | ❌ |
//! | [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | ✅ | ✅ | ✅ | ✅ |
//! | Model | Task / Type |
//! | :---------------------------------------------------------------: | :-------------------------: |
//! | [YOLOv5](https://github.com/ultralytics/yolov5) | Object Detection<br />Instance Segmentation<br />Classification |
//! | [YOLOv6](https://github.com/meituan/YOLOv6) | Object Detection |
//! | [YOLOv7](https://github.com/WongKinYiu/yolov7) | Object Detection |
//! | [YOLOv8](https://github.com/ultralytics/ultralytics) | Object Detection<br />Instance Segmentation<br />Classification<br />Oriented Object Detection<br />Keypoint Detection |
//! | [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection |
//! | [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection |
//! | [RT-DETR](https://arxiv.org/abs/2304.08069) | Object Detection |
//! | [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation |
//! | [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection |
//! | [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised |
//! | [CLIP](https://github.com/openai/CLIP) | Vision-Language |
//! | [BLIP](https://github.com/salesforce/BLIP) | Vision-Language |
//! | [DB](https://arxiv.org/abs/1911.08947) | Text Detection |
//! | [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition |
//! | [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection |
//! | [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception |
//! | [Depth-Anything<br />(v1, v2)](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation |
//! | [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting |
//! # Examples
//! [All Examples Here](https://github.com/jamjamjon/usls/tree/main/examples)
//! # Use provided models for inference
@ -34,7 +39,9 @@
//! use usls::{coco, models::YOLO, Annotator, DataLoader, Options, Vision};
//!
//! let options = Options::default()
//! .with_model("yolov8m-seg-dyn.onnx")?
//! .with_yolo_version(YOLOVersion::V8) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
//! .with_yolo_task(YOLOTask::Detect) // YOLOTask: Classify, Detect, Pose, Segment, Obb
//! .with_model("xxxx.onnx")?;
//! .with_trt(0)
//! .with_fp16(true)
//! .with_i00((1, 1, 4).into())
@ -63,8 +70,8 @@
//!
//! ```Rust, no_run
//! let annotator = Annotator::default()
//! .with_bboxes_thickness(7)
//! .with_saveout("YOLOv8");
//! .with_bboxes_thickness(4)
//! .with_saveout("YOLOs");
//! ```
//!
//! #### 4. Run and annotate
@ -111,5 +118,6 @@ mod utils;
mod ys;
pub use core::*;
pub use models::*;
pub use utils::*;
pub use ys::*;

src/models/blip.rs

@ -59,7 +59,7 @@ impl Blip {
Ops::Nhwc2nchw,
])?;
let ys = self.visual.run(vec![xs_])?;
Ok(Y::default().with_embedding(Embedding::from(ys[0].to_owned())))
Ok(Y::default().with_embedding(&Embedding::from(ys[0].to_owned())))
}
pub fn caption(

src/models/clip.rs

@ -70,7 +70,7 @@ impl Clip {
Ops::Nhwc2nchw,
])?;
let ys = self.visual.run(vec![xs_])?;
Ok(Y::default().with_embedding(Embedding::from(ys[0].to_owned())))
Ok(Y::default().with_embedding(&Embedding::from(ys[0].to_owned())))
}
pub fn encode_texts(&mut self, texts: &[String]) -> Result<Y> {
@ -85,7 +85,7 @@ impl Clip {
let xs = Array2::from_shape_vec((texts.len(), self.context_length), xs)?.into_dyn();
let xs = X::from(xs);
let ys = self.textual.run(vec![xs])?;
Ok(Y::default().with_embedding(Embedding::from(ys[0].to_owned())))
Ok(Y::default().with_embedding(&Embedding::from(ys[0].to_owned())))
}
pub fn batch_visual(&self) -> usize {

src/models/depth_anything.rs

@ -71,9 +71,7 @@ impl DepthAnything {
None => continue,
Some(x) => x,
};
ys.push(
Y::default().with_masks(&[Mask::default().with_mask(DynamicImage::from(luma))]),
);
ys.push(Y::default().with_masks(&[Mask::default().with_mask(luma)]));
}
Ok(ys)
}

src/models/dinov2.rs

@ -64,7 +64,7 @@ impl Dinov2 {
Ops::Nhwc2nchw,
])?;
let ys = self.engine.run(vec![xs_])?;
Ok(Y::default().with_embedding(Embedding::from(ys[0].to_owned())))
Ok(Y::default().with_embedding(&Embedding::from(ys[0].to_owned())))
}
// pub fn build_index(&self, metric: Metric) -> Result<usearch::Index> {

src/models/mod.rs

@ -10,6 +10,7 @@ mod rtdetr;
mod rtmo;
mod svtr;
mod yolo;
mod yolo_;
mod yolop;
pub use blip::Blip;
@ -21,5 +22,9 @@ pub use modnet::MODNet;
pub use rtdetr::RTDETR;
pub use rtmo::RTMO;
pub use svtr::SVTR;
pub use yolo::{YOLOTask, YOLOVersion, YOLO};
pub use yolo::YOLO;
pub use yolo_::*;
// {
// AnchorsPosition, BoxType, ClssType, KptsType, YOLOFormat, YOLOPreds, YOLOTask, YOLOVersion,
// };
pub use yolop::YOLOPv2;

src/models/modnet.rs

@ -65,8 +65,6 @@ impl MODNet {
None => continue,
Some(x) => x,
};
let luma = DynamicImage::from(luma);
ys.push(Y::default().with_masks(&[Mask::default().with_mask(luma)]));
}
Ok(ys)

src/models/yolo.rs

@ -1,54 +1,30 @@
use anyhow::Result;
use clap::ValueEnum;
use image::DynamicImage;
use ndarray::{s, Array, Axis};
use rayon::prelude::*;
use regex::Regex;
use crate::{
Bbox, DynConf, Keypoint, Mbr, MinOptMax, Ops, Options, OrtEngine, Polygon, Prob, Vision, X, Y,
Bbox, BoxType, DynConf, Keypoint, Mask, Mbr, MinOptMax, Ops, Options, OrtEngine, Polygon, Prob,
Vision, YOLOPreds, YOLOTask, YOLOVersion, X, Y,
};
const CXYWH_OFFSET: usize = 4;
const KPT_STEP: usize = 3;
#[derive(Debug, Clone, ValueEnum)]
pub enum YOLOTask {
Classify,
Detect,
Pose,
Segment,
Obb,
}
#[derive(Debug, Copy, Clone, ValueEnum)]
pub enum YOLOVersion {
V5,
V8,
V9,
V10,
Customized,
}
#[derive(Debug)]
pub struct YOLO {
engine: OrtEngine,
nc: usize,
nk: usize,
nm: usize,
height: MinOptMax,
width: MinOptMax,
batch: MinOptMax,
task: YOLOTask,
version: YOLOVersion,
confs: DynConf,
kconfs: DynConf,
iou: f32,
names: Option<Vec<String>>,
names_kpt: Option<Vec<String>>,
apply_nms: bool,
anchors_first: bool,
conf_independent: bool,
apply_probs_softmax: bool,
task: YOLOTask,
layout: YOLOPreds,
version: Option<YOLOVersion>,
}
impl Vision for YOLO {
@ -61,45 +37,63 @@ impl Vision for YOLO {
engine.height().to_owned(),
engine.width().to_owned(),
);
let task = match options.yolo_task {
Some(task) => task,
None => match engine.try_fetch("task") {
None => {
println!("No clear YOLO task specified, using default: Detect");
YOLOTask::Detect
// YOLO Task
let task = options
.yolo_task
.or(engine.try_fetch("task").and_then(|x| match x.as_str() {
"classify" => Some(YOLOTask::Classify),
"detect" => Some(YOLOTask::Detect),
"pose" => Some(YOLOTask::Pose),
"segment" => Some(YOLOTask::Segment),
"obb" => Some(YOLOTask::Obb),
s => {
println!("YOLO Task: {s:?} is unsupported");
None
}
}));
// YOLO Outputs Format
let (version, layout) = match options.yolo_version {
Some(ver) => match &task {
None => anyhow::bail!("No clear YOLO Task specified for Version: {ver:?}."),
Some(task) => match task {
YOLOTask::Classify => match ver {
YOLOVersion::V5 => (Some(ver), YOLOPreds::n_clss().apply_softmax(true)),
YOLOVersion::V8 => (Some(ver), YOLOPreds::n_clss()),
x => anyhow::bail!("YOLOTask::Classify is unsupported for {x:?}. Try using `.with_yolo_preds()` for customization.")
}
YOLOTask::Detect => match ver {
YOLOVersion::V5 | YOLOVersion::V6 | YOLOVersion::V7 => (Some(ver),YOLOPreds::n_a_cxcywh_confclss()),
YOLOVersion::V8 => (Some(ver),YOLOPreds::n_cxcywh_clss_a()),
YOLOVersion::V9 => (Some(ver),YOLOPreds::n_cxcywh_clss_a()),
YOLOVersion::V10 => (Some(ver),YOLOPreds::n_a_xyxy_confcls().apply_nms(false)),
YOLOVersion::RTDETR => (Some(ver),YOLOPreds::n_a_cxcywh_clss_n().apply_nms(false)),
}
YOLOTask::Pose => match ver {
YOLOVersion::V8 => (Some(ver),YOLOPreds::n_cxcywh_clss_xycs_a()),
x => anyhow::bail!("YOLOTask::Pose is unsupported for {x:?}. Try using `.with_yolo_preds()` for customization.")
}
YOLOTask::Segment => match ver {
YOLOVersion::V5 => (Some(ver), YOLOPreds::n_a_cxcywh_confclss_coefs()),
YOLOVersion::V8 => (Some(ver), YOLOPreds::n_cxcywh_clss_coefs_a()),
x => anyhow::bail!("YOLOTask::Segment is unsupported for {x:?}. Try using `.with_yolo_preds()` for customization.")
}
YOLOTask::Obb => match ver {
YOLOVersion::V8 => (Some(ver), YOLOPreds::n_cxcywh_clss_r_a()),
x => anyhow::bail!("YOLOTask::Segment is unsupported for {x:?}. Try using `.with_yolo_preds()` for customization.")
}
}
Some(x) => match x.as_str() {
"classify" => YOLOTask::Classify,
"detect" => YOLOTask::Detect,
"pose" => YOLOTask::Pose,
"segment" => YOLOTask::Segment,
"obb" => YOLOTask::Obb,
x => todo!("YOLO Task: {x:?} is not supported"),
},
},
};
let version = match options.yolo_version {
None => {
println!("No clear YOLO version specified, using default: YOLOv8");
YOLOVersion::V8
}
Some(x) => x,
None => match options.yolo_preds {
None => anyhow::bail!("No clear YOLO version or YOLO Format specified."),
Some(fmt) => (None, fmt)
}
};
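// Either arm above is driven from `Options`. A minimal sketch of the
// user-defined path (hedged: `YOLO::new` stands in for whatever constructor
// `Vision` exposes; `with_yolo_preds()` and `with_nc()` are the builders
// added in this commit):
//
//     let options = Options::default()
//         .with_yolo_preds(YOLOPreds::n_a_cxcywh_confclss()) // a YOLOv5-style head
//         .with_nc(80);
//     let model = YOLO::new(options)?;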
// output format
let (anchors_first, conf_independent, apply_nms, apply_probs_softmax) = match version {
YOLOVersion::V5 => (true, true, true, true),
YOLOVersion::V8 | YOLOVersion::V9 => (false, false, true, false),
YOLOVersion::V10 => (true, false, false, false),
YOLOVersion::Customized => (
options.anchors_first,
options.conf_independent,
options.apply_nms,
options.apply_probs_softmax,
),
};
let task = task.unwrap_or(layout.task());
// try from custom class names, and then model metadata
// The number of classes & class names
let mut names = options.names.or(Self::fetch_names(&engine));
let nc = match options.nc {
Some(nc) => {
@ -109,7 +103,7 @@ impl Vision for YOLO {
assert_eq!(
nc,
names.len(),
"the length of `nc` and `class names` is not equal."
"The length of `nc` and `class names` is not equal."
);
}
}
@ -118,14 +112,15 @@ impl Vision for YOLO {
None => match &names {
Some(names) => names.len(),
None => panic!(
"Can not parse model without `nc` and `class names`. Try to make it explicit."
"Can not parse model without `nc` and `class names`. Try to make it explicit with `options.with_nc(80)`"
),
},
};
let names_kpt = options.names2.or(None);
// Keypoints names
let names_kpt = options.names2;
// try from model metadata
// The number of keypoints
let nk = engine
.try_fetch("kpt_shape")
.map(|kpt_string| {
@ -134,38 +129,33 @@ impl Vision for YOLO {
caps.get(1).unwrap().as_str().parse::<usize>().unwrap()
})
.unwrap_or(0_usize);
let nm = if let YOLOTask::Segment = task {
engine.oshapes()[1][1] as usize
} else {
0_usize
};
let confs = DynConf::new(&options.confs, nc);
let kconfs = DynConf::new(&options.kconfs, nk);
let iou = options.iou.unwrap_or(0.45);
// Summary
println!("YOLO Task: {:?}, Version: {:?}", task, version);
engine.dry_run()?;
Ok(Self {
engine,
confs,
kconfs,
iou: options.iou,
iou,
nc,
nk,
nm,
height,
width,
batch,
task,
version,
names,
names_kpt,
anchors_first,
conf_independent,
apply_nms,
apply_probs_softmax,
layout,
version,
})
}
// pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Y>> {
fn preprocess(&self, xs: &[Self::Input]) -> Result<Vec<X>> {
let xs_ = match self.task {
YOLOTask::Classify => {
@ -187,238 +177,256 @@ impl Vision for YOLO {
Ops::Nhwc2nchw,
])?,
};
Ok(vec![xs_])
// let ys = self.engine.run(vec![xs_])?;
// self.postprocess(ys, xs)
}
fn inference(&mut self, xs: Vec<X>) -> Result<Vec<X>> {
self.engine.run(xs)
}
// pub fn postprocess(&self, xs: Vec<X>, xs0: &[DynamicImage]) -> Result<Vec<Y>> {
fn postprocess(&self, xs: Vec<X>, xs0: &[Self::Input]) -> Result<Vec<Y>> {
let mut ys = Vec::new();
let protos = if xs.len() == 2 { Some(&xs[1]) } else { None };
for (idx, preds) in xs[0].axis_iter(Axis(0)).enumerate() {
let image_width = xs0[idx].width() as f32;
let image_height = xs0[idx].height() as f32;
let ys: Vec<Y> = xs[0]
.axis_iter(Axis(0))
.into_par_iter()
.enumerate()
.filter_map(|(idx, preds)| {
let mut y = Y::default();
match self.task {
YOLOTask::Classify => {
let y = if self.apply_probs_softmax {
let exps = preds.mapv(|x| x.exp());
// Parse predictions
let (
slice_bboxes,
slice_id,
slice_clss,
slice_confs,
slice_kpts,
slice_coefs,
slice_radians,
) = self.layout.parse_preds(preds, self.nc);
// Classification
if let YOLOTask::Classify = self.task {
let x = if self.layout.apply_softmax {
let exps = slice_clss.mapv(|x| x.exp());
let stds = exps.sum_axis(Axis(0));
exps / stds
} else {
preds.into_owned()
slice_clss.into_owned()
};
ys.push(
Y::default().with_probs(
Prob::default()
.with_probs(&y.into_raw_vec())
.with_names(self.names.to_owned()),
return Some(
y.with_probs(
&Prob::default()
.with_probs(&x.into_raw_vec())
.with_names(self.names.clone()),
),
);
}
YOLOTask::Obb => {
let mut y_mbrs: Vec<Mbr> = Vec::new();
let ratio = (self.width() as f32 / image_width)
.min(self.height() as f32 / image_height);
for pred in preds.axis_iter(if self.anchors_first { Axis(0) } else { Axis(1) })
{
// xywhclsr
let clss = pred.slice(s![CXYWH_OFFSET..CXYWH_OFFSET + self.nc]);
let radians = pred[pred.len() - 1];
let (id, &confidence) = clss
.into_iter()
.enumerate()
.max_by(|a, b| a.1.total_cmp(b.1))
.unwrap();
if confidence < self.confs[id] {
continue;
}
let xywh = pred.slice(s![0..CXYWH_OFFSET]);
let cx = xywh[0] / ratio;
let cy = xywh[1] / ratio;
let w = xywh[2] / ratio;
let h = xywh[3] / ratio;
let (w, h, radians) = if w > h {
(w, h, radians)
} else {
(h, w, radians + std::f32::consts::PI / 2.)
};
let radians = radians % std::f32::consts::PI;
y_mbrs.push(
Mbr::from_cxcywhr(
cx as f64,
cy as f64,
w as f64,
h as f64,
radians as f64,
)
.with_confidence(confidence)
.with_id(id as isize)
.with_name(self.names.as_ref().map(|names| names[id].to_owned())),
);
}
ys.push(Y::default().with_mbrs(&y_mbrs).apply_mbrs_nms(self.iou));
}
_ => {
let mut y_bboxes: Vec<Bbox> = Vec::new();
let ratio = (self.width() as f32 / image_width)
.min(self.height() as f32 / image_height);
let image_width = xs0[idx].width() as f32;
let image_height = xs0[idx].height() as f32;
let ratio =
(self.width() as f32 / image_width).min(self.height() as f32 / image_height);
// Detection
for (i, pred) in preds
.axis_iter(if self.anchors_first { Axis(0) } else { Axis(1) })
.enumerate()
{
match self.version {
YOLOVersion::V10 => {
let class_id = pred[CXYWH_OFFSET + 1] as usize;
let confidence = pred[CXYWH_OFFSET];
if confidence < self.confs[class_id] {
continue;
}
let bbox = pred.slice(s![0..CXYWH_OFFSET]);
let x = bbox[0] / ratio;
let y = bbox[1] / ratio;
let x2 = bbox[2] / ratio;
let y2 = bbox[3] / ratio;
let w = x2 - x;
let h = y2 - y;
let y_bbox = Bbox::default()
.with_xywh(x, y, w, h)
.with_confidence(confidence)
.with_id(class_id as isize)
.with_id_born(i as isize)
.with_name(
self.names.as_ref().map(|names| names[class_id].to_owned()),
);
y_bboxes.push(y_bbox);
}
_ => {
let (conf_, clss) = if self.conf_independent {
(
pred[CXYWH_OFFSET],
pred.slice(
s![CXYWH_OFFSET + 1..CXYWH_OFFSET + self.nc + 1],
),
)
} else {
(1.0, pred.slice(s![CXYWH_OFFSET..CXYWH_OFFSET + self.nc]))
};
let (id, &confidence) = clss
// Other tasks
let (y_bboxes, y_mbrs) = slice_bboxes?
.axis_iter(Axis(0))
.into_par_iter()
.enumerate()
.filter_map(|(i, bbox)| {
// confidence & class_id
let (class_id, confidence) = match &slice_id {
Some(ids) => (ids[[i, 0]] as _, slice_clss[[i, 0]] as _),
None => {
let (class_id, &confidence) = slice_clss
.slice(s![i, ..])
.into_iter()
.enumerate()
.max_by(|a, b| a.1.total_cmp(b.1))
.unwrap();
let confidence = confidence * conf_;
if confidence < self.confs[id] {
continue;
.max_by(|a, b| a.1.total_cmp(b.1))?;
match &slice_confs {
None => (class_id, confidence),
Some(slice_confs) => {
(class_id, confidence * slice_confs[[i, 0]])
}
}
let bbox = pred.slice(s![0..CXYWH_OFFSET]);
let cx = bbox[0] / ratio;
let cy = bbox[1] / ratio;
let w = bbox[2] / ratio;
let h = bbox[3] / ratio;
let x = cx - w / 2.;
let y = cy - h / 2.;
let x = x.max(0.0).min(image_width);
let y = y.max(0.0).min(image_height);
let y_bbox = Bbox::default()
.with_xywh(x, y, w, h)
.with_confidence(confidence)
.with_id(id as isize)
.with_id_born(i as isize)
.with_name(
self.names.as_ref().map(|names| names[id].to_owned()),
);
y_bboxes.push(y_bbox);
}
};
// filtering
if confidence < self.confs[class_id] {
return None;
}
}
// NMS
let mut y = Y::default().with_bboxes(&y_bboxes);
if self.apply_nms {
y = y.apply_bboxes_nms(self.iou);
}
// Bboxes
let bbox = bbox.mapv(|x| x / ratio);
let bbox = if self.layout.is_bbox_normalized {
(
bbox[0] * self.width() as f32,
bbox[1] * self.height() as f32,
bbox[2] * self.width() as f32,
bbox[3] * self.height() as f32,
)
} else {
(bbox[0], bbox[1], bbox[2], bbox[3])
};
let (cx, cy, x, y, w, h) = match self.layout.box_type()? {
BoxType::Cxcywh => {
let (cx, cy, w, h) = bbox;
let x = (cx - w / 2.).max(0.);
let y = (cy - h / 2.).max(0.);
(cx, cy, x, y, w, h)
}
BoxType::Xyxy => {
let (x, y, x2, y2) = bbox;
let (w, h) = (x2 - x, y2 - y);
let (cx, cy) = ((x + x2) / 2., (y + y2) / 2.);
(cx, cy, x, y, w, h)
}
BoxType::Xywh => {
let (x, y, w, h) = bbox;
let (cx, cy) = (x + w / 2., y + h / 2.);
(cx, cy, x, y, w, h)
}
BoxType::Cxcyxy => {
let (cx, cy, x2, y2) = bbox;
let (w, h) = ((x2 - cx) * 2., (y2 - cy) * 2.);
let x = (x2 - w).max(0.);
let y = (y2 - h).max(0.);
(cx, cy, x, y, w, h)
}
BoxType::XyCxcy => {
let (x, y, cx, cy) = bbox;
let (w, h) = ((cx - x) * 2., (cy - y) * 2.);
(cx, cy, x, y, w, h)
}
};
// Pose
if let YOLOTask::Pose = self.task {
if let Some(bboxes) = y.bboxes() {
let mut y_kpts: Vec<Vec<Keypoint>> = Vec::new();
for bbox in bboxes.iter() {
let pred = if self.anchors_first {
preds.slice(s![
bbox.id_born(),
preds.shape()[1] - KPT_STEP * self.nk..,
])
let (y_bbox, y_mbr) = match &slice_radians {
Some(slice_radians) => {
let radians = slice_radians[[i, 0]];
let (w, h, radians) = if w > h {
(w, h, radians)
} else {
preds.slice(s![
preds.shape()[0] - KPT_STEP * self.nk..,
bbox.id_born(),
])
(h, w, radians + std::f32::consts::PI / 2.)
};
let radians = radians % std::f32::consts::PI;
(
None,
Some(
Mbr::from_cxcywhr(
cx as f64,
cy as f64,
w as f64,
h as f64,
radians as f64,
)
.with_confidence(confidence)
.with_id(class_id as isize)
.with_name(
self.names
.as_ref()
.map(|names| names[class_id].clone()),
),
),
)
}
None => (
Some(
Bbox::default()
.with_xywh(x, y, w, h)
.with_confidence(confidence)
.with_id(class_id as isize)
.with_id_born(i as isize)
.with_name(
self.names
.as_ref()
.map(|names| names[class_id].clone()),
),
),
None,
),
};
let mut kpts_: Vec<Keypoint> = Vec::new();
for i in 0..self.nk {
let kx = pred[KPT_STEP * i] / ratio;
let ky = pred[KPT_STEP * i + 1] / ratio;
let kconf = pred[KPT_STEP * i + 2];
if kconf < self.kconfs[i] {
kpts_.push(Keypoint::default());
} else {
kpts_.push(
Some((y_bbox, y_mbr))
})
.collect::<(Vec<_>, Vec<_>)>();
let y_bboxes: Vec<Bbox> = y_bboxes.into_iter().flatten().collect();
let y_mbrs: Vec<Mbr> = y_mbrs.into_iter().flatten().collect();
// Mbrs
if !y_mbrs.is_empty() {
y = y.with_mbrs(&y_mbrs);
if self.layout.apply_nms {
y = y.apply_nms(self.iou);
}
return Some(y);
}
// Bboxes
if !y_bboxes.is_empty() {
y = y.with_bboxes(&y_bboxes);
if self.layout.apply_nms {
y = y.apply_nms(self.iou);
}
}
// Pose
if let Some(pred_kpts) = slice_kpts {
let kpt_step = self.layout.kpt_step().unwrap_or(3);
if let Some(bboxes) = y.bboxes() {
let y_kpts = bboxes
.into_par_iter()
.filter_map(|bbox| {
let pred = pred_kpts.slice(s![bbox.id_born(), ..]);
let kpts = (0..self.nk)
.into_par_iter()
.map(|i| {
let kx = pred[kpt_step * i] / ratio;
let ky = pred[kpt_step * i + 1] / ratio;
let kconf = pred[kpt_step * i + 2];
if kconf < self.kconfs[i] {
Keypoint::default()
} else {
Keypoint::default()
.with_id(i as isize)
.with_confidence(kconf)
.with_name(
self.names_kpt
.as_ref()
.map(|names| names[i].to_owned()),
.map(|names| names[i].clone()),
)
.with_xy(
kx.max(0.0f32).min(image_width),
ky.max(0.0f32).min(image_height),
),
);
}
}
y_kpts.push(kpts_);
}
y = y.with_keypoints(&y_kpts);
}
)
}
})
.collect::<Vec<_>>();
Some(kpts)
})
.collect::<Vec<_>>();
y = y.with_keypoints(&y_kpts);
}
}
// Segment
if let YOLOTask::Segment = self.task {
if let Some(bboxes) = y.bboxes() {
let mut y_polygons: Vec<Polygon> = Vec::new();
for bbox in bboxes.iter() {
let coefs = if self.anchors_first {
preds
.slice(s![bbox.id_born(), preds.shape()[1] - self.nm..])
.to_vec()
} else {
preds
.slice(s![preds.shape()[0] - self.nm.., bbox.id_born()])
.to_vec()
};
// Segment
if let Some(coefs) = slice_coefs {
if let Some(bboxes) = y.bboxes() {
let (y_polygons, y_masks) = bboxes
.into_par_iter()
.filter_map(|bbox| {
let coefs = coefs.slice(s![bbox.id_born(), ..]).to_vec();
let proto = protos.unwrap().slice(s![idx, .., .., ..]);
let proto = protos.as_ref()?.slice(s![idx, .., .., ..]);
let (nm, mh, mw) = proto.dim();
// coefs * proto => mask (311.427µs)
let coefs = Array::from_shape_vec((1, nm), coefs)?; // (n, nm)
let proto = proto.into_shape((nm, mh * mw))?; // (nm, mh * mw)
// coefs * proto => mask
let coefs = Array::from_shape_vec((1, nm), coefs).ok()?; // (n, nm)
let proto = proto.into_shape((nm, mh * mw)).ok()?; // (nm, mh * mw)
let mask = coefs.dot(&proto); // (mh, mw, n)
// de-scale
// Mask rescale
let mask = Ops::resize_lumaf32_vec(
&mask.into_raw_vec(),
mw as _,
@ -427,22 +435,19 @@ impl Vision for YOLO {
image_height as _,
true,
"Bilinear",
)?;
)
.ok()?;
let mut mask: image::ImageBuffer<image::Luma<_>, Vec<_>> =
match image::ImageBuffer::from_raw(
image::ImageBuffer::from_raw(
image_width as _,
image_height as _,
mask,
) {
None => continue,
Some(x) => x,
};
)?;
let (xmin, ymin, xmax, ymax) =
(bbox.xmin(), bbox.ymin(), bbox.xmax(), bbox.ymax());
// Using bbox to crop the mask (75.93µs)
// Using bbox to crop the mask
for (y, row) in mask.enumerate_rows_mut() {
for (x, _, pixel) in row {
if x < xmin as _
@ -455,32 +460,36 @@ impl Vision for YOLO {
}
}
// Find contours (1.413853ms)
// Find contours
let contours: Vec<imageproc::contours::Contour<i32>> =
imageproc::contours::find_contours_with_threshold(&mask, 0);
let polygon = match contours
.iter()
.map(|x| {
Polygon::default()
.with_id(bbox.id())
.with_points_imageproc(&x.points)
.with_name(bbox.name().cloned())
})
.max_by(|x, y| x.area().total_cmp(&y.area()))
{
None => continue,
Some(x) => x,
};
y_polygons.push(polygon);
}
y = y.with_polygons(&y_polygons);
}
Some((
contours
.into_par_iter()
.map(|x| {
Polygon::default()
.with_id(bbox.id())
.with_points_imageproc(&x.points)
.with_name(bbox.name().cloned())
})
.max_by(|x, y| x.area().total_cmp(&y.area()))?,
Mask::default()
.with_mask(mask)
.with_id(bbox.id())
.with_name(bbox.name().cloned()),
))
})
.collect::<(Vec<_>, Vec<_>)>();
y = y.with_polygons(&y_polygons).with_masks(&y_masks);
}
ys.push(y);
}
}
}
Some(y)
})
.collect();
Ok(ys)
}
}
@ -498,6 +507,18 @@ impl YOLO {
self.height.opt
}
pub fn version(&self) -> Option<&YOLOVersion> {
self.version.as_ref()
}
pub fn task(&self) -> &YOLOTask {
&self.task
}
pub fn layout(&self) -> &YOLOPreds {
&self.layout
}
fn fetch_names(engine: &OrtEngine) -> Option<Vec<String>> {
// fetch class names from onnx metadata
// String format: `{0: 'person', 1: 'bicycle', 2: 'sports ball', ..., 27: "yellow_lady's_slipper"}`
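For reference, a name string in that shape can be recovered with a small regex pass. The sketch below is a standalone, hypothetical `parse_names` helper (not the exact body of `fetch_names`), relying only on the `regex` crate this file already imports:

    fn parse_names(s: &str) -> Option<Vec<String>> {
        // Match `id: 'name'` or `id: "name"` pairs; double quotes tolerate
        // apostrophes inside names such as "yellow_lady's_slipper".
        let re = regex::Regex::new(r#"(\d+):\s*(?:'([^']*)'|"([^"]*)")"#).ok()?;
        let mut pairs: Vec<(usize, String)> = re
            .captures_iter(s)
            .filter_map(|c| {
                let id = c.get(1)?.as_str().parse().ok()?;
                let name = c.get(2).or_else(|| c.get(3))?.as_str().to_string();
                Some((id, name))
            })
            .collect();
        pairs.sort_by_key(|&(id, _)| id);
        Some(pairs.into_iter().map(|(_, name)| name).collect())
    }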

318
src/models/yolo_.rs Normal file
View File

@ -0,0 +1,318 @@
use ndarray::{ArrayBase, ArrayView, Axis, Dim, IxDyn, IxDynImpl, ViewRepr};
#[derive(Debug, Clone, clap::ValueEnum)]
pub enum YOLOTask {
Classify,
Detect,
Pose,
Segment,
Obb,
}
#[derive(Debug, Copy, Clone, clap::ValueEnum)]
pub enum YOLOVersion {
V5,
V6,
V7,
V8,
V9,
V10,
RTDETR,
}
#[derive(Debug, Clone, PartialEq)]
pub enum BoxType {
/// 1: (cx, cy, w, h)
Cxcywh,
/// 2: (cx, cy, xmax, ymax), a.k.a. Cxcybr
Cxcyxy,
/// 3: (xmin, ymin, xmax, ymax), a.k.a. Tlbr
Xyxy,
/// 4: (xmin, ymin, w, h), a.k.a. Tlwh
Xywh,
/// 5: (xmin, ymin, cx, cy), a.k.a. Tlcxcy
XyCxcy,
}
#[derive(Debug, Clone, PartialEq)]
pub enum ClssType {
Clss,
ConfCls,
ClsConf,
ConfClss,
ClssConf,
}
#[derive(Debug, Clone, PartialEq)]
pub enum KptsType {
Xys,
Xycs,
}
#[derive(Debug, Clone, PartialEq)]
pub enum AnchorsPosition {
Before,
After,
}
#[derive(Debug, Clone, PartialEq)]
pub struct YOLOPreds {
pub clss: ClssType,
pub bbox: Option<BoxType>,
pub kpts: Option<KptsType>,
pub coefs: Option<bool>,
pub obb: Option<bool>,
pub anchors: Option<AnchorsPosition>,
pub is_bbox_normalized: bool,
pub apply_nms: bool,
pub apply_softmax: bool,
}
impl Default for YOLOPreds {
fn default() -> Self {
Self {
clss: ClssType::Clss,
bbox: None,
kpts: None,
coefs: None,
obb: None,
anchors: None,
is_bbox_normalized: false,
apply_nms: true,
apply_softmax: false,
}
}
}
impl YOLOPreds {
pub fn apply_nms(mut self, x: bool) -> Self {
self.apply_nms = x;
self
}
pub fn apply_softmax(mut self, x: bool) -> Self {
self.apply_softmax = x;
self
}
pub fn n_clss() -> Self {
// Classification: NClss
Self {
clss: ClssType::Clss,
..Default::default()
}
}
pub fn n_a_cxcywh_confclss() -> Self {
// YOLOv5 | YOLOv6 | YOLOv7 | YOLOX : NACxcywhConfClss
Self {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::ConfClss,
anchors: Some(AnchorsPosition::Before),
..Default::default()
}
}
pub fn n_a_cxcywh_confclss_coefs() -> Self {
// YOLOv5 Segment : NACxcywhConfClssCoefs
Self {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::ConfClss,
coefs: Some(true),
anchors: Some(AnchorsPosition::Before),
..Default::default()
}
}
pub fn n_cxcywh_clss_a() -> Self {
// YOLOv8 | YOLOv9 : NCxcywhClssA
Self {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::Clss,
anchors: Some(AnchorsPosition::After),
..Default::default()
}
}
pub fn n_a_xyxy_confcls() -> Self {
// YOLOv10 : NAXyxyConfCls
Self {
bbox: Some(BoxType::Xyxy),
clss: ClssType::ConfCls,
anchors: Some(AnchorsPosition::Before),
..Default::default()
}
}
pub fn n_a_cxcywh_clss_n() -> Self {
// RTDETR : NACxcywhClssN
Self {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::Clss,
anchors: Some(AnchorsPosition::Before),
is_bbox_normalized: true,
..Default::default()
}
}
pub fn n_cxcywh_clss_xycs_a() -> Self {
// YOLOv8 Pose : NCxcywhClssXycsA
Self {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::Clss,
kpts: Some(KptsType::Xycs),
anchors: Some(AnchorsPosition::After),
..Default::default()
}
}
pub fn n_cxcywh_clss_coefs_a() -> Self {
// YOLOv8 Segment : NCxcywhClssCoefsA
Self {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::Clss,
coefs: Some(true),
anchors: Some(AnchorsPosition::After),
..Default::default()
}
}
pub fn n_cxcywh_clss_r_a() -> Self {
// YOLOv8 Obb : NCxcywhClssRA
Self {
bbox: Some(BoxType::Cxcywh),
clss: ClssType::Clss,
obb: Some(true),
anchors: Some(AnchorsPosition::After),
..Default::default()
}
}
pub fn task(&self) -> YOLOTask {
match self.obb {
Some(_) => YOLOTask::Obb,
None => match self.coefs {
Some(_) => YOLOTask::Segment,
None => match self.kpts {
Some(_) => YOLOTask::Pose,
None => match self.bbox {
Some(_) => YOLOTask::Detect,
None => YOLOTask::Classify,
},
},
},
}
}
pub fn box_type(&self) -> Option<&BoxType> {
self.bbox.as_ref()
}
pub fn is_anchors_first(&self) -> bool {
matches!(self.anchors, Some(AnchorsPosition::Before))
}
pub fn is_cls_type(&self) -> bool {
matches!(self.clss, ClssType::ClsConf | ClssType::ConfCls)
}
pub fn is_clss_type(&self) -> bool {
matches!(
self.clss,
ClssType::ClssConf | ClssType::ConfClss | ClssType::Clss
)
}
pub fn is_conf_at_end(&self) -> bool {
matches!(self.clss, ClssType::ClssConf | ClssType::ClsConf)
}
pub fn is_conf_independent(&self) -> bool {
!matches!(self.clss, ClssType::Clss)
}
pub fn kpt_step(&self) -> Option<usize> {
match &self.kpts {
Some(x) => match x {
KptsType::Xycs => Some(3),
KptsType::Xys => Some(2),
},
None => None,
}
}
#[allow(clippy::type_complexity)]
pub fn parse_preds<'a>(
&'a self,
x: ArrayBase<ViewRepr<&'a f32>, Dim<IxDynImpl>>,
nc: usize,
) -> (
Option<ArrayView<f32, IxDyn>>,
Option<ArrayView<f32, IxDyn>>,
ArrayView<f32, IxDyn>,
Option<ArrayView<f32, IxDyn>>,
Option<ArrayView<f32, IxDyn>>,
Option<ArrayView<f32, IxDyn>>,
Option<ArrayView<f32, IxDyn>>,
) {
match self.task() {
YOLOTask::Classify => (None, None, x, None, None, None, None),
_ => {
let x = if self.is_anchors_first() {
x
} else {
x.reversed_axes()
};
// Get each task's slices
let (slice_bboxes, xs) = x.split_at(Axis(1), 4);
let (slice_id, slice_clss, slice_confs, xs) = match self.clss {
ClssType::ConfClss => {
let (confs, xs) = xs.split_at(Axis(1), 1);
let (clss, xs) = xs.split_at(Axis(1), nc);
(None, clss, Some(confs), xs)
}
ClssType::ClssConf => {
let (clss, xs) = xs.split_at(Axis(1), nc);
let (confs, xs) = xs.split_at(Axis(1), 1);
(None, clss, Some(confs), xs)
}
ClssType::ConfCls => {
let (clss, xs) = xs.split_at(Axis(1), 1);
let (ids, xs) = xs.split_at(Axis(1), 1);
(Some(ids), clss, None, xs)
}
ClssType::ClsConf => {
let (ids, xs) = xs.split_at(Axis(1), 1);
let (clss, xs) = xs.split_at(Axis(1), 1);
(Some(ids), clss, None, xs)
}
ClssType::Clss => {
let (clss, xs) = xs.split_at(Axis(1), nc);
(None, clss, None, xs)
}
};
let (slice_kpts, slice_coefs, slice_radians) = match self.task() {
YOLOTask::Pose => (Some(xs), None, None),
YOLOTask::Segment => (None, Some(xs), None),
YOLOTask::Obb => (None, None, Some(xs)),
_ => (None, None, None),
};
(
Some(slice_bboxes),
slice_id,
slice_clss,
slice_confs,
slice_kpts,
slice_coefs,
slice_radians,
)
}
}
}
}
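Taken together, each preset above pins down both the tensor layout and the post-processing switches, and the accessors derive everything else. A short sketch, hand-checked against the definitions in this file:

    let seg = YOLOPreds::n_cxcywh_clss_coefs_a(); // YOLOv8-Segment preset
    assert!(matches!(seg.task(), YOLOTask::Segment));
    assert!(!seg.is_anchors_first()); // anchors sit after the attribute axis

    let v5 = YOLOPreds::n_a_cxcywh_confclss(); // YOLOv5/v6/v7 detection preset
    assert!(v5.is_conf_independent()); // objectness is separate, hence `conf * clss`
    assert_eq!(v5.kpt_step(), None); // no keypoints in a plain detection head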

View File

@ -22,8 +22,8 @@ impl YOLOPv2 {
engine.height().to_owned(),
engine.width().to_owned(),
);
let nc = 80;
let confs = DynConf::new(&options.kconfs, nc);
let confs = DynConf::new(&options.kconfs, 80);
let iou = options.iou.unwrap_or(0.45f32);
engine.dry_run()?;
Ok(Self {
@ -32,7 +32,7 @@ impl YOLOPv2 {
height,
width,
batch,
iou: options.iou,
iou,
})
}
@ -162,7 +162,7 @@ impl YOLOPv2 {
Y::default()
.with_bboxes(&y_bboxes)
.with_polygons(&y_polygons)
.apply_bboxes_nms(self.iou),
.apply_nms(self.iou),
);
}
Ok(ys)

View File

@ -1,4 +1,9 @@
use crate::Nms;
/// Bounding Box 2D.
///
/// This struct represents a 2D bounding box with properties such as position, size,
/// class ID, confidence score, optional name, and an ID representing the born state.
#[derive(Clone, PartialEq, PartialOrd)]
pub struct Bbox {
x: f32,
@ -10,6 +15,17 @@ pub struct Bbox {
name: Option<String>,
id_born: isize,
}
impl Nms for Bbox {
/// Returns the confidence score of the bounding box.
fn confidence(&self) -> f32 {
self.confidence
}
/// Computes the intersection over union (IoU) between this bounding box and another.
fn iou(&self, other: &Self) -> f32 {
self.intersect(other) / self.union(other)
}
}
impl Default for Bbox {
fn default() -> Self {
@ -30,8 +46,7 @@ impl std::fmt::Debug for Bbox {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("Bbox")
.field("xyxy", &[self.x, self.y, self.xmax(), self.ymax()])
.field("id", &self.id)
// .field("id_born", &self.id_born)
.field("class_id", &self.id)
.field("name", &self.name)
.field("confidence", &self.confidence)
.finish()
@ -39,6 +54,15 @@ impl std::fmt::Debug for Bbox {
}
impl From<(f32, f32, f32, f32)> for Bbox {
/// Creates a `Bbox` from a tuple of `(x, y, w, h)`.
///
/// # Arguments
///
/// * `(x, y, w, h)` - A tuple representing the bounding box's position and size.
///
/// # Returns
///
/// A `Bbox` with the specified position and size.
fn from((x, y, w, h): (f32, f32, f32, f32)) -> Self {
Self {
x,
@ -51,6 +75,15 @@ impl From<(f32, f32, f32, f32)> for Bbox {
}
impl From<[f32; 4]> for Bbox {
/// Creates a `Bbox` from an array of `[x, y, w, h]`.
///
/// # Arguments
///
/// * `[x, y, w, h]` - An array representing the bounding box's position and size.
///
/// # Returns
///
/// A `Bbox` with the specified position and size.
fn from([x, y, w, h]: [f32; 4]) -> Self {
Self {
x,
@ -63,6 +96,15 @@ impl From<[f32; 4]> for Bbox {
}
impl From<(f32, f32, f32, f32, isize, f32)> for Bbox {
/// Creates a `Bbox` from a tuple of `(x, y, w, h, id, confidence)`.
///
/// # Arguments
///
/// * `(x, y, w, h, id, confidence)` - A tuple representing the bounding box's position, size, class ID, and confidence score.
///
/// # Returns
///
/// A `Bbox` with the specified position, size, class ID, and confidence score.
fn from((x, y, w, h, id, confidence): (f32, f32, f32, f32, isize, f32)) -> Self {
Self {
x,
@ -77,6 +119,18 @@ impl From<(f32, f32, f32, f32, isize, f32)> for Bbox {
}
impl Bbox {
/// Sets the bounding box's coordinates using `(x1, y1, x2, y2)` and calculates width and height.
///
/// # Arguments
///
/// * `x1` - The x-coordinate of the top-left corner.
/// * `y1` - The y-coordinate of the top-left corner.
/// * `x2` - The x-coordinate of the bottom-right corner.
/// * `y2` - The y-coordinate of the bottom-right corner.
///
/// # Returns
///
/// A `Bbox` instance with updated coordinates and dimensions.
pub fn with_xyxy(mut self, x1: f32, y1: f32, x2: f32, y2: f32) -> Self {
self.x = x1;
self.y = y1;
@ -85,6 +139,18 @@ impl Bbox {
self
}
/// Sets the bounding box's coordinates and dimensions using `(x, y, w, h)`.
///
/// # Arguments
///
/// * `x` - The x-coordinate of the top-left corner.
/// * `y` - The y-coordinate of the top-left corner.
/// * `w` - The width of the bounding box.
/// * `h` - The height of the bounding box.
///
/// # Returns
///
/// A `Bbox` instance with updated coordinates and dimensions.
pub fn with_xywh(mut self, x: f32, y: f32, w: f32, h: f32) -> Self {
self.x = x;
self.y = y;
@ -93,74 +159,138 @@ impl Bbox {
self
}
/// Sets the class ID of the bounding box.
///
/// # Arguments
///
/// * `x` - The class ID to be set.
///
/// # Returns
///
/// A `Bbox` instance with updated class ID.
pub fn with_id(mut self, x: isize) -> Self {
self.id = x;
self
}
/// Sets the ID representing the born state of the bounding box.
///
/// # Arguments
///
/// * `x` - The ID to be set.
///
/// # Returns
///
/// A `Bbox` instance with updated born state ID.
pub fn with_id_born(mut self, x: isize) -> Self {
self.id_born = x;
self
}
/// Sets the confidence score of the bounding box.
///
/// # Arguments
///
/// * `x` - The confidence score to be set.
///
/// # Returns
///
/// A `Bbox` instance with updated confidence score.
pub fn with_confidence(mut self, x: f32) -> Self {
self.confidence = x;
self
}
/// Sets the optional name of the bounding box.
///
/// # Arguments
///
/// * `x` - The optional name to be set.
///
/// # Returns
///
/// A `Bbox` instance with updated name.
pub fn with_name(mut self, x: Option<String>) -> Self {
self.name = x;
self
}
/// Returns the width of the bounding box.
pub fn width(&self) -> f32 {
self.w
}
/// Returns the height of the bounding box.
pub fn height(&self) -> f32 {
self.h
}
/// Returns the minimum x-coordinate of the bounding box.
pub fn xmin(&self) -> f32 {
self.x
}
/// Returns the minimum y-coordinate of the bounding box.
pub fn ymin(&self) -> f32 {
self.y
}
/// Returns the maximum x-coordinate of the bounding box.
pub fn xmax(&self) -> f32 {
self.x + self.w
}
/// Returns the maximum y-coordinate of the bounding box.
pub fn ymax(&self) -> f32 {
self.y + self.h
}
/// Returns the center x-coordinate of the bounding box.
pub fn cx(&self) -> f32 {
self.x + self.w / 2.
}
/// Returns the center y-coordinate of the bounding box.
pub fn cy(&self) -> f32 {
self.y + self.h / 2.
}
/// Returns the bounding box coordinates as `(x1, y1, x2, y2)`.
pub fn xyxy(&self) -> (f32, f32, f32, f32) {
(self.x, self.y, self.x + self.w, self.y + self.h)
}
/// Returns the bounding box coordinates and size as `(x, y, w, h)`.
pub fn xywh(&self) -> (f32, f32, f32, f32) {
(self.x, self.y, self.w, self.h)
}
/// Returns the center coordinates and size of the bounding box as `(cx, cy, w, h)`.
pub fn cxywh(&self) -> (f32, f32, f32, f32) {
(self.cx(), self.cy(), self.w, self.h)
}
/// Returns the class ID of the bounding box.
pub fn id(&self) -> isize {
self.id
}
/// Returns the born state ID of the bounding box.
pub fn id_born(&self) -> isize {
self.id_born
}
/// Returns the optional name associated with the bounding box, if any.
pub fn name(&self) -> Option<&String> {
self.name.as_ref()
}
pub fn confidence(&self) -> f32 {
self.confidence
}
// /// Returns the confidence score of the bounding box.
// pub fn confidence(&self) -> f32 {
// self.confidence
// }
/// Returns a label string for the bounding box, optionally including its name and confidence score.
pub fn label(&self, with_name: bool, with_conf: bool, decimal_places: usize) -> String {
let mut label = String::new();
if with_name {
@ -182,18 +312,22 @@ impl Bbox {
label
}
/// Computes the area of the bounding box.
pub fn area(&self) -> f32 {
self.h * self.w
}
/// Computes the perimeter of the bounding box.
pub fn perimeter(&self) -> f32 {
(self.h + self.w) * 2.0
}
/// Checks if the bounding box is square (i.e., width equals height).
pub fn is_squre(&self) -> bool {
self.w == self.h
}
/// Computes the intersection area between this bounding box and another.
pub fn intersect(&self, other: &Bbox) -> f32 {
let l = self.xmin().max(other.xmin());
let r = (self.xmin() + self.width()).min(other.xmin() + other.width());
@ -202,14 +336,17 @@ impl Bbox {
(r - l).max(0.) * (b - t).max(0.)
}
/// Computes the union area between this bounding box and another.
pub fn union(&self, other: &Bbox) -> f32 {
self.area() + other.area() - self.intersect(other)
}
pub fn iou(&self, other: &Bbox) -> f32 {
self.intersect(other) / self.union(other)
}
// /// Computes the intersection over union (IoU) between this bounding box and another.
// pub fn iou(&self, other: &Bbox) -> f32 {
// self.intersect(other) / self.union(other)
// }
/// Checks if this bounding box completely contains another bounding box `other`.
pub fn contains(&self, other: &Bbox) -> bool {
self.xmin() <= other.xmin()
&& self.xmax() >= other.xmax()
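A hand-checked example of the overlap helpers above (assuming `Bbox` and the `Nms` trait are re-exported at the crate root): two 10x10 boxes offset by (5, 5) intersect in a 5x5 patch, so IoU = 25 / (100 + 100 - 25) ≈ 0.143.

    use usls::{Bbox, Nms};

    let a = Bbox::from((0.0, 0.0, 10.0, 10.0));
    let b = Bbox::from((5.0, 5.0, 10.0, 10.0));
    assert_eq!(a.intersect(&b), 25.0);
    assert_eq!(a.union(&b), 175.0);
    assert!((a.iou(&b) - 25.0 / 175.0).abs() < 1e-6);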

View File

@ -1,20 +1,18 @@
use image::DynamicImage;
use image::GrayImage;
/// Gray-Scale Mask.
/// Mask: a grayscale image.
#[derive(Clone, PartialEq)]
pub struct Mask {
mask: DynamicImage,
mask_vec: Vec<u8>,
mask: GrayImage,
id: isize,
name: Option<String>,
confidence: f32, // placeholder
confidence: f32,
}
impl Default for Mask {
fn default() -> Self {
Self {
mask: DynamicImage::default(),
mask_vec: vec![],
mask: GrayImage::default(),
id: -1,
name: None,
confidence: 0.,
@ -25,25 +23,19 @@ impl Default for Mask {
impl std::fmt::Debug for Mask {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("Mask")
// .field("mask", &self.mask)
.field("dimensions", &self.dimensions())
.field("id", &self.id)
.field("name", &self.name)
// .field("confidence", &self.confidence)
.finish()
}
}
impl Mask {
pub fn with_mask(mut self, x: DynamicImage) -> Self {
pub fn with_mask(mut self, x: GrayImage) -> Self {
self.mask = x;
self
}
pub fn with_vec(mut self, vec: &[u8]) -> Self {
self.mask_vec = vec.to_vec();
self
}
pub fn with_id(mut self, x: isize) -> Self {
self.id = x;
self
@ -54,13 +46,12 @@ impl Mask {
self
}
pub fn mask(&self) -> &DynamicImage {
pub fn mask(&self) -> &GrayImage {
&self.mask
}
pub fn vec(&self) -> Vec<u8> {
// self.mask.to_luma8().into_raw()
self.mask_vec.clone()
pub fn to_vec(&self) -> Vec<u8> {
self.mask.to_vec()
}
pub fn id(&self) -> isize {
@ -74,4 +65,16 @@ impl Mask {
pub fn confidence(&self) -> f32 {
self.confidence
}
pub fn height(&self) -> u32 {
self.mask.height()
}
pub fn width(&self) -> u32 {
self.mask.width()
}
pub fn dimensions(&self) -> (u32, u32) {
self.mask.dimensions()
}
}
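A minimal construction sketch for the new `GrayImage`-backed mask (`GrayImage` comes from the `image` crate; `from_raw` returns `None` when the buffer length does not match the dimensions):

    use image::GrayImage;

    let gray = GrayImage::from_raw(4, 2, vec![255u8; 8]).expect("4 x 2 x 1 byte");
    let mask = Mask::default()
        .with_mask(gray)
        .with_id(0)
        .with_name(Some("person".to_string()));
    assert_eq!(mask.dimensions(), (4, 2));
    assert_eq!(mask.to_vec(), vec![255u8; 8]);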

View File

@ -1,5 +1,7 @@
use geo::{coord, line_string, Area, BooleanOps, Coord, EuclideanDistance, LineString, Polygon};
use crate::Nms;
/// Minimum Bounding Rectangle.
#[derive(Clone, PartialEq)]
pub struct Mbr {
@ -9,6 +11,18 @@ pub struct Mbr {
name: Option<String>,
}
impl Nms for Mbr {
/// Returns the confidence score of the bounding box.
fn confidence(&self) -> f32 {
self.confidence
}
/// Computes the intersection over union (IoU) between this bounding box and another.
fn iou(&self, other: &Self) -> f32 {
self.intersect(other) / self.union(other)
}
}
impl Default for Mbr {
fn default() -> Self {
Self {
@ -100,10 +114,6 @@ impl Mbr {
self.name.as_ref()
}
pub fn confidence(&self) -> f32 {
self.confidence
}
pub fn label(&self, with_name: bool, with_conf: bool, decimal_places: usize) -> String {
let mut label = String::new();
if with_name {
@ -195,15 +205,12 @@ impl Mbr {
let p2 = Polygon::new(other.ls.clone(), vec![]);
p1.union(&p2).unsigned_area() as f32
}
pub fn iou(&self, other: &Mbr) -> f32 {
self.intersect(other) / self.union(other)
}
}
#[cfg(test)]
mod tests_mbr {
use super::Mbr;
use crate::Nms;
use geo::{coord, line_string};
#[test]

View File

@ -15,3 +15,8 @@ pub use mbr::Mbr;
pub use polygon::Polygon;
pub use prob::Prob;
pub use y::Y;
pub trait Nms {
fn iou(&self, other: &Self) -> f32;
fn confidence(&self) -> f32;
}
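Any type exposing these two methods can reuse the shared suppression routine in `Y::nms`. A hypothetical sketch for a circular detection (the overlap metric below is a crude stand-in, not part of the crate):

    use usls::{Nms, Y};

    struct Circle { x: f32, y: f32, r: f32, conf: f32 }

    impl Nms for Circle {
        fn confidence(&self) -> f32 { self.conf }
        fn iou(&self, other: &Self) -> f32 {
            // Crude proxy: 1.0 when centers coincide, 0.0 once they are r1 + r2 apart.
            let d = ((self.x - other.x).powi(2) + (self.y - other.y).powi(2)).sqrt();
            (1.0 - d / (self.r + other.r)).max(0.0)
        }
    }

    let mut circles = vec![
        Circle { x: 0.0, y: 0.0, r: 5.0, conf: 0.9 },
        Circle { x: 1.0, y: 0.0, r: 5.0, conf: 0.5 }, // overlap 0.9 > threshold, dropped
    ];
    Y::nms(&mut circles, 0.45);
    assert_eq!(circles.len(), 1);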

View File

@ -28,7 +28,9 @@ impl Default for Polygon {
impl std::fmt::Debug for Polygon {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("Polygon")
// .field("polygons", &self.polygon)
.field("perimeter", &self.perimeter())
.field("area", &self.area())
.field("count", &self.count())
.field("id", &self.id)
.field("name", &self.name)
.field("confidence", &self.confidence)

View File

@ -1,6 +1,21 @@
use crate::{Bbox, Embedding, Keypoint, Mask, Mbr, Polygon, Prob};
use crate::{Bbox, Embedding, Keypoint, Mask, Mbr, Nms, Polygon, Prob};
/// Inference results container for each image.
/// Container for the inference results of a single image.
///
/// This struct holds various possible outputs from an image inference process,
/// including probabilities, bounding boxes, keypoints, minimum bounding rectangles,
/// polygons, masks, text annotations, and embeddings.
///
/// # Fields
///
/// * `probs` - Optionally contains the probability scores for the detected objects.
/// * `bboxes` - Optionally contains a vector of bounding boxes.
/// * `keypoints` - Optionally contains a nested vector of keypoints.
/// * `mbrs` - Optionally contains a vector of minimum bounding rectangles.
/// * `polygons` - Optionally contains a vector of polygons.
/// * `texts` - Optionally contains a vector of text annotations.
/// * `masks` - Optionally contains a vector of masks.
/// * `embedding` - Optionally contains the embedding representation.
#[derive(Clone, PartialEq, Default)]
pub struct Y {
probs: Option<Prob>,
@ -15,7 +30,7 @@ pub struct Y {
impl std::fmt::Debug for Y {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let mut f = f.debug_struct("Result");
let mut f = f.debug_struct("Y");
if let Some(x) = &self.texts {
if !x.is_empty() {
f.field("Texts", &x);
@ -57,139 +72,234 @@ impl std::fmt::Debug for Y {
}
impl Y {
/// Sets the `masks` field with the provided vector of masks.
///
/// # Arguments
///
/// * `masks` - A slice of `Mask` to be set.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new masks set.
pub fn with_masks(mut self, masks: &[Mask]) -> Self {
self.masks = Some(masks.to_vec());
self
}
pub fn with_probs(mut self, probs: Prob) -> Self {
self.probs = Some(probs);
/// Sets the `probs` field with the provided probability scores.
///
/// # Arguments
///
/// * `probs` - A reference to a `Prob` instance to be cloned and set in the struct.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new probabilities set.
///
/// # Examples
///
/// ```
/// use usls::{Prob, Y};
///
/// let probs = Prob::default();
/// let y = Y::default().with_probs(&probs);
/// ```
pub fn with_probs(mut self, probs: &Prob) -> Self {
self.probs = Some(probs.clone());
self
}
/// Sets the `texts` field with the provided vector of text annotations.
///
/// # Arguments
///
/// * `texts` - A slice of `String` to be set.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new texts set.
pub fn with_texts(mut self, texts: &[String]) -> Self {
self.texts = Some(texts.to_vec());
self
}
/// Sets the `mbrs` field with the provided vector of minimum bounding rectangles.
///
/// # Arguments
///
/// * `mbrs` - A slice of `Mbr` to be set.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new minimum bounding rectangles set.
pub fn with_mbrs(mut self, mbrs: &[Mbr]) -> Self {
self.mbrs = Some(mbrs.to_vec());
self
}
/// Sets the `bboxes` field with the provided vector of bounding boxes.
///
/// # Arguments
///
/// * `bboxes` - A slice of `Bbox` to be set.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new bounding boxes set.
pub fn with_bboxes(mut self, bboxes: &[Bbox]) -> Self {
self.bboxes = Some(bboxes.to_vec());
self
}
pub fn with_embedding(mut self, embedding: Embedding) -> Self {
self.embedding = Some(embedding);
/// Sets the `embedding` field with the provided embedding.
///
/// # Arguments
///
/// * `embedding` - A reference to an `Embedding` instance to be cloned and set in the struct.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new embedding set.
pub fn with_embedding(mut self, embedding: &Embedding) -> Self {
self.embedding = Some(embedding.clone());
self
}
/// Sets the `keypoints` field with the provided nested vector of keypoints.
///
/// # Arguments
///
/// * `keypoints` - A slice of vectors of `Keypoint` to be set.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new keypoints set.
pub fn with_keypoints(mut self, keypoints: &[Vec<Keypoint>]) -> Self {
self.keypoints = Some(keypoints.to_vec());
self
}
/// Sets the `polygons` field with the provided vector of polygons.
///
/// # Arguments
///
/// * `polygons` - A slice of `Polygon` to be set.
///
/// # Returns
///
/// * `Self` - The updated struct instance with the new polygons set.
pub fn with_polygons(mut self, polygons: &[Polygon]) -> Self {
self.polygons = Some(polygons.to_vec());
self
}
/// Returns a reference to the `masks` field, if it exists.
///
/// # Returns
///
/// * `Option<&Vec<Mask>>` - A reference to the vector of masks, or `None` if it is not set.
pub fn masks(&self) -> Option<&Vec<Mask>> {
self.masks.as_ref()
}
/// Returns a reference to the `probs` field, if it exists.
///
/// # Returns
///
/// * `Option<&Prob>` - A reference to the probabilities, or `None` if it is not set.
pub fn probs(&self) -> Option<&Prob> {
self.probs.as_ref()
}
/// Returns a reference to the `keypoints` field, if it exists.
///
/// # Returns
///
/// * `Option<&Vec<Vec<Keypoint>>>` - A reference to the nested vector of keypoints, or `None` if it is not set.
pub fn keypoints(&self) -> Option<&Vec<Vec<Keypoint>>> {
self.keypoints.as_ref()
}
/// Returns a reference to the `polygons` field, if it exists.
///
/// # Returns
///
/// * `Option<&Vec<Polygon>>` - A reference to the vector of polygons, or `None` if it is not set.
pub fn polygons(&self) -> Option<&Vec<Polygon>> {
self.polygons.as_ref()
}
/// Returns a reference to the `bboxes` field, if it exists.
///
/// # Returns
///
/// * `Option<&Vec<Bbox>>` - A reference to the vector of bounding boxes, or `None` if it is not set.
pub fn bboxes(&self) -> Option<&Vec<Bbox>> {
self.bboxes.as_ref()
}
/// Returns a reference to the `mbrs` field, if it exists.
///
/// # Returns
///
/// * `Option<&Vec<Mbr>>` - A reference to the vector of minimum bounding rectangles, or `None` if it is not set.
pub fn mbrs(&self) -> Option<&Vec<Mbr>> {
self.mbrs.as_ref()
}
/// Returns a reference to the `texts` field, if it exists.
///
/// # Returns
///
/// * `Option<&Vec<String>>` - A reference to the vector of texts, or `None` if it is not set.
pub fn texts(&self) -> Option<&Vec<String>> {
self.texts.as_ref()
}
/// Returns a reference to the `embedding` field, if it exists.
///
/// # Returns
///
/// * `Option<&Embedding>` - A reference to the embedding, or `None` if it is not set.
pub fn embedding(&self) -> Option<&Embedding> {
self.embedding.as_ref()
}
pub fn apply_bboxes_nms(mut self, iou_threshold: f32) -> Self {
pub fn apply_nms(mut self, iou_threshold: f32) -> Self {
match &mut self.bboxes {
None => self,
Some(ref mut bboxes) => {
Self::nms_bboxes(bboxes, iou_threshold);
self
}
}
}
pub fn apply_mbrs_nms(mut self, iou_threshold: f32) -> Self {
match &mut self.mbrs {
None => self,
Some(ref mut mbrs) => {
mbrs.sort_by(|b1, b2| {
b2.confidence()
.partial_cmp(&b1.confidence())
.unwrap_or(std::cmp::Ordering::Equal)
});
let mut current_index = 0;
for index in 0..mbrs.len() {
let mut drop = false;
for prev_index in 0..current_index {
let iou = mbrs[prev_index].iou(&mbrs[index]);
if iou > iou_threshold {
drop = true;
break;
}
}
if !drop {
mbrs.swap(current_index, index);
current_index += 1;
}
None => match &mut self.mbrs {
None => self,
Some(ref mut mbrs) => {
Self::nms(mbrs, iou_threshold);
self
}
mbrs.truncate(current_index);
},
Some(ref mut bboxes) => {
Self::nms(bboxes, iou_threshold);
self
}
}
}
pub fn nms_bboxes(bboxes: &mut Vec<Bbox>, iou_threshold: f32) {
bboxes.sort_by(|b1, b2| {
pub fn nms<T: Nms>(items: &mut Vec<T>, iou_threshold: f32) {
items.sort_by(|b1, b2| {
b2.confidence()
.partial_cmp(&b1.confidence())
.unwrap_or(std::cmp::Ordering::Equal)
});
let mut current_index = 0;
for index in 0..bboxes.len() {
for index in 0..items.len() {
let mut drop = false;
for prev_index in 0..current_index {
let iou = bboxes[prev_index].iou(&bboxes[index]);
let iou = items[prev_index].iou(&items[index]);
if iou > iou_threshold {
drop = true;
break;
}
}
if !drop {
bboxes.swap(current_index, index);
items.swap(current_index, index);
current_index += 1;
}
}
bboxes.truncate(current_index);
items.truncate(current_index);
}
}
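The routine sorts by confidence, then greedily keeps each item whose IoU with every already-kept item stays at or below the threshold, compacting survivors in place. A usage sketch with the `Bbox` tuple constructor from this commit (0.45 mirrors the IoU default used by the models):

    let bboxes = vec![
        Bbox::from((0.0, 0.0, 10.0, 10.0, 0, 0.9)),
        Bbox::from((1.0, 1.0, 10.0, 10.0, 0, 0.8)), // IoU with the first ≈ 0.68, suppressed
        Bbox::from((50.0, 50.0, 10.0, 10.0, 0, 0.7)),
    ];
    let y = Y::default().with_bboxes(&bboxes).apply_nms(0.45);
    assert_eq!(y.bboxes().map(|b| b.len()), Some(2));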