Add YOLOv8-OBB and some bug fixes (#9)
* Add YOLOv8-Obb & Refactor outputs
* Update README.md
@ -40,3 +40,4 @@ indicatif = "0.17.8"
image = "0.25.1"
imageproc = { version = "0.24" }
ab_glyph = "0.2.23"
geo = "0.28.0"
|
85
README.md
@ -1,42 +1,65 @@
|
||||
# usls
|
||||
|
||||
A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others.
|
||||
A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others.
|
||||
|
||||
## Recently Updated
|
||||
|
||||
| YOLOP-v2 | Face-Parsing | Text-Detection |
| :----------------------------: | :------------------------------: | :------------------------------: |
|<img src='examples/yolop/demo.png' height="240px">| <img src='examples/face-parsing/demo.png' height="240px"> | <img src='examples/db/demo.png' height="240px"> |

| YOLOv8-Obb |
| :----------------------------: |
|<img src='examples/yolov8/demo-obb-2.png' width="800px">|
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## Supported Models
|
||||
|
||||
| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
|
||||
| :---------------------------------------------------------------: | :------------------------------------------------------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
|
||||
| [YOLOv8-detection](https://github.com/ultralytics/ultralytics) | Object Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv8-pose](https://github.com/ultralytics/ultralytics) | Keypoint Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv8-classification](https://github.com/ultralytics/ultralytics) | Classification | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv8-segmentation](https://github.com/ultralytics/ultralytics) | Instance Segmentation | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolov9) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [RT-DETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/fastsam) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo-world) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
|
||||
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
|
||||
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | ❌ | ❌ |
|
||||
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ |
|
||||
| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
|
||||
| :---------------------------------------------------------------: | :-------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
|
||||
| [YOLOv8-obb](https://github.com/ultralytics/ultralytics) | Oriented Object Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv8-detection](https://github.com/ultralytics/ultralytics) | Object Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv8-pose](https://github.com/ultralytics/ultralytics) | Keypoint Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv8-classification](https://github.com/ultralytics/ultralytics) | Classification | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv8-segmentation](https://github.com/ultralytics/ultralytics) | Instance Segmentation | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolov9) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [RT-DETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/fastsam) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo-world) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
|
||||
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
|
||||
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | ❌ | ❌ |
|
||||
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv5-classification](https://github.com/ultralytics/yolov5) | Classification | [demo](examples/yolov5) | ✅ | ✅ | ✅ | ✅ |
|
||||
| [YOLOv5-segmentation](https://github.com/ultralytics/yolov5) | Instance Segmentation | [demo](examples/yolov5) | ✅ | ✅ | ✅ | ✅ |
|
||||
|
||||
## Solution Models
|
||||
|
||||
Additionally, this repo also provides some solution models.
|
||||
|
||||
| Model | Example | Result |
|
||||
| :------------------------------------------------------------: | :------------------------------: | :------------------------------: |
|
||||
| Lane Line Segmentation<br /> Drivable Area Segmentation<br />Car Detection<br />车道线-可行驶区域-车辆检测 | [demo](examples/yolov8-plastic-bag) |<img src='examples/yolop/demo.png' width="220px" height="140px">|
|
||||
| Face Parsing<br /> 人脸解析 | [demo](examples/face-parsing) |<img src='examples/face-parsing/demo.png' width="220px" height="200px"> |
|
||||
| Text Detection<br />(PPOCR-det v3, v4)<br />通用文本检测 | [demo](examples/db) |<img src='examples/db/demo.jpg' width="250px" height="200px">|
|
||||
| Text Recognition<br />(PPOCR-rec v3, v4)<br />中英文-文本识别 | [demo](examples/svtr) ||
|
||||
| Face-Landmark Detection<br />人脸 & 关键点检测 | [demo](examples/yolov8-face) |<img src='examples/yolov8-face/demo.jpg' width="220px" height="180px">|
|
||||
| Head Detection<br /> 人头检测 | [demo](examples/yolov8-head) |<img src='examples/yolov8-head/demo.jpg' width="220px" height="180px">|
|
||||
| Fall Detection<br /> 摔倒检测 | [demo](examples/yolov8-falldown) | <img src='examples/yolov8-falldown/demo.jpg' width="220px" height="180px">|
|
||||
| Trash Detection<br /> 垃圾检测 | [demo](examples/yolov8-plastic-bag) |<img src='examples/yolov8-trash/demo.jpg' width="250px" height="180px">|
|
||||
<details close>
|
||||
<summary>Additionally, this repo also provides some solution models.</summary>
|
||||
|
||||
| Model | Example | Result |
|
||||
| :---------------------------------------------------------------------------------------------------------: | :------------------------------: | :-----------------------------------------------------------------------------: |
|
||||
| Lane Line Segmentation<br /> Drivable Area Segmentation<br />Car Detection<br />车道线-可行驶区域-车辆检测 | [demo](examples/yolop) | <img src='examples/yolop/demo.png' width="220px" height="140px"> |
|
||||
| Face Parsing<br /> 人脸解析 | [demo](examples/face-parsing) | <img src='examples/face-parsing/demo.png' width="220px" height="200px"> |
|
||||
| Text Detection<br />(PPOCR-det v3, v4)<br />通用文本检测 | [demo](examples/db) | <img src='examples/db/demo.png' width="250px" height="200px"> |
|
||||
| Text Recognition<br />(PPOCR-rec v3, v4)<br />中英文-文本识别 | [demo](examples/svtr) | |
|
||||
| Face-Landmark Detection<br />人脸 & 关键点检测 | [demo](examples/yolov8-face) | <img src='examples/yolov8-face/demo.png' width="220px" height="180px"> |
|
||||
| Head Detection<br /> 人头检测 | [demo](examples/yolov8-head) | <img src='examples/yolov8-head/demo.png' width="220px" height="180px"> |
|
||||
| Fall Detection<br /> 摔倒检测 | [demo](examples/yolov8-falldown) | <img src='examples/yolov8-falldown/demo.png' width="220px" height="180px"> |
|
||||
| Trash Detection<br /> 垃圾检测 | [demo](examples/yolov8-trash) | <img src='examples/yolov8-trash/demo.png' width="250px" height="180px"> |
|
||||
|
||||
</details>
|
||||
|
||||
## Demo
|
||||
|
||||
@ -59,8 +82,9 @@ check **[ort guide](https://ort.pyke.io/setup/linking)**
|
||||
|
||||
</details>
|
||||
|
||||
|
||||
## Integrate into your own project
|
||||
<details close>
|
||||
<summary>Check Here</summary>
|
||||
|
||||
#### 1. Add `usls` as a dependency to your project's `Cargo.toml`
|
||||
|
||||
@ -126,3 +150,4 @@ let y = model.run(&x)?;
|
||||
let annotator = Annotator::default().with_saveout("YOLOv8");
|
||||
annotator.annotate(&x, &y);
|
||||
```
|
||||
</details>
|
||||
|
BIN
assets/2.jpg
Normal file
After Width: | Height: | Size: 176 KiB |
BIN
assets/dota.png
Normal file
After Width: | Height: | Size: 680 KiB |
@ -17,10 +17,12 @@ cargo run -r --example blip
```shell
[Unconditional image captioning]: a group of people walking around a bus
[Conditional image captioning]: three man walking in front of a bus
Some(["three man walking in front of a bus"])
```

## TODO

* [ ] Multi-batch inference for image caption
* [ ] VQA
* [ ] Retrieval
* [ ] TensorRT support for textual model
|
@ -1,4 +1,4 @@
use usls::{models::Blip, Options};
use usls::{models::Blip, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // visual
@ -22,9 +22,11 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
    // build model
    let mut model = Blip::new(options_visual, options_textual)?;

    // image caption
    model.caption("./assets/bus.jpg", None)?; // unconditional
    model.caption("./assets/bus.jpg", Some("three man"))?; // conditional
    // image caption (this demo uses batch_size=1)
    let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
    let _y = model.caption(&x, None, true)?; // unconditional
    let y = model.caption(&x, Some("three man"), true)?; // conditional
    println!("{:?}", y[0].texts());

    Ok(())
}
|
@ -1,4 +1,4 @@
use usls::{models::Clip, ops, DataLoader, Options};
use usls::{models::Clip, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // visual
@ -39,7 +39,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
    let feats_image = model.encode_images(&images).unwrap();

    // use image to query texts
    let matrix = ops::dot2(&feats_image, &feats_text)?; // [m, n]
    let matrix = feats_image.dot2(&feats_text)?;

    // summary
    for i in 0..paths.len() {
|
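The CLIP example above moves from `ops::dot2(&feats_image, &feats_text)` to a `dot2` method, and the old comment notes that the result has shape `[m, n]`. As a rough illustration of what such an image-to-text similarity matrix contains, here is a minimal sketch on plain `Vec<f32>` rows. The free functions `dot2` and `argmax` below are hypothetical stand-ins written for this note; they are not the `usls` implementation, and the embeddings are made-up values.

```rust
/// Dot product of every row of `a` with every row of `b`.
/// `a` is [m, d] and `b` is [n, d]; the result is [m, n].
fn dot2(a: &[Vec<f32>], b: &[Vec<f32>]) -> Vec<Vec<f32>> {
    a.iter()
        .map(|ra| {
            b.iter()
                .map(|rb| ra.iter().zip(rb).map(|(x, y)| x * y).sum::<f32>())
                .collect()
        })
        .collect()
}

/// Index and value of the best-scoring column in one row.
fn argmax(row: &[f32]) -> Option<(usize, f32)> {
    row.iter()
        .copied()
        .enumerate()
        .max_by(|l, r| l.1.total_cmp(&r.1))
}

fn main() {
    // Two "image" embeddings and three "text" embeddings (illustrative values).
    let feats_image = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let feats_text = vec![vec![0.0, 1.0], vec![1.0, 0.0], vec![0.6, 0.8]];

    let matrix = dot2(&feats_image, &feats_text); // [2, 3]
    for (i, row) in matrix.iter().enumerate() {
        if let Some((j, score)) = argmax(row) {
            println!("image {i} -> text {j} ({score:.2})");
        }
    }
}
```

Each row of the matrix scores one image against every text, so taking the best column per row is what "use image to query texts" amounts to in this sketch.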
@ -20,4 +20,4 @@ cargo run -r --example db
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 165 KiB |
BIN
examples/db/demo.png
Normal file
After Width: | Height: | Size: 35 KiB |
@ -15,18 +15,21 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
let mut model = DB::new(&options)?;
|
||||
|
||||
// load image
|
||||
let x = vec![DataLoader::try_read("./assets/db.png")?];
|
||||
let x = vec![
|
||||
DataLoader::try_read("./assets/db.png")?,
|
||||
// DataLoader::try_read("./assets/2.jpg")?,
|
||||
];
|
||||
|
||||
// run
|
||||
let y = model.run(&x)?;
|
||||
|
||||
// annotate
|
||||
let annotator = Annotator::default()
|
||||
.without_name(true)
|
||||
.without_polygons(false)
|
||||
.with_mask_alpha(0)
|
||||
.without_bboxes(false)
|
||||
.with_saveout("DB-Text-Detection");
|
||||
.without_bboxes(true)
|
||||
.with_masks_alpha(60)
|
||||
.with_polygon_color([255, 105, 180, 255])
|
||||
.without_mbrs(true)
|
||||
.with_saveout("DB");
|
||||
annotator.annotate(&x, &y);
|
||||
|
||||
Ok(())
|
||||
|
Before Width: | Height: | Size: 448 KiB After Width: | Height: | Size: 105 KiB |
@ -9,7 +9,6 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
.with_i03((416, 640, 800).into())
|
||||
// .with_trt(0)
|
||||
// .with_fp16(true)
|
||||
// .with_dry_run(10)
|
||||
.with_confs(&[0.5]);
|
||||
let mut model = YOLO::new(&options)?;
|
||||
|
||||
@ -21,10 +20,10 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
|
||||
// annotate
|
||||
let annotator = Annotator::default()
|
||||
.without_conf(true)
|
||||
.without_name(true)
|
||||
.without_polygons(false)
|
||||
.without_bboxes(true)
|
||||
.without_bboxes_conf(true)
|
||||
.without_bboxes_name(true)
|
||||
.without_polygons(false)
|
||||
.with_masks_name(false)
|
||||
.with_saveout("Face-Parsing");
|
||||
annotator.annotate(&x, &y);
|
||||
|
@ -20,4 +20,4 @@ cargo run -r --example fastsam
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 302 KiB |
BIN
examples/fastsam/demo.png
Normal file
After Width: | Height: | Size: 321 KiB |
@ -18,4 +18,4 @@ cargo run -r --example rtdetr
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 258 KiB |
BIN
examples/rtdetr/demo.png
Normal file
After Width: | Height: | Size: 439 KiB |
@ -1,11 +1,11 @@
use usls::{models::RTDETR, Annotator, DataLoader, Options, COCO_NAMES_80};
use usls::{coco, models::RTDETR, Annotator, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // build model
    let options = Options::default()
        .with_model("../models/rtdetr-l-f16.onnx")
        .with_confs(&[0.4, 0.15]) // person: 0.4, others: 0.15
        .with_names(&COCO_NAMES_80);
        .with_names(&coco::NAMES_80);
    let mut model = RTDETR::new(&options)?;

    // load image
|
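The comment on `.with_confs(&[0.4, 0.15])` says "person: 0.4, others: 0.15", which suggests per-class thresholds where the last value covers every remaining class. The diff does not show how `usls` resolves this, so the following is only a sketch of that reading. `conf_for` and `keep` are hypothetical helper names introduced here.

```rust
/// Threshold for `class_id`. Classes past the end of `confs` reuse the
/// last entry (assumed behaviour, inferred from the "others: 0.15" comment).
fn conf_for(confs: &[f32], class_id: usize) -> Option<f32> {
    confs.get(class_id).or_else(|| confs.last()).copied()
}

/// Keep a detection only if its score reaches the threshold of its class.
fn keep(confs: &[f32], class_id: usize, score: f32) -> bool {
    conf_for(confs, class_id).map_or(false, |t| score >= t)
}

fn main() {
    let confs = [0.4, 0.15];
    // (class id, score) pairs; class 0 would be "person" in COCO ordering.
    let detections = [(0usize, 0.3f32), (0, 0.6), (5, 0.3), (5, 0.1)];
    for (class_id, score) in detections {
        println!("class {class_id} score {score}: keep = {}", keep(&confs, class_id, score));
    }
}
```

With this rule a person at 0.3 is dropped while any other class at 0.3 is kept.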
@ -15,4 +15,4 @@ cargo run -r --example rtmo
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 242 KiB |
BIN
examples/rtmo/demo.png
Normal file
After Width: | Height: | Size: 455 KiB |
@ -1,10 +1,10 @@
use usls::{models::RTMO, Annotator, DataLoader, Options, COCO_SKELETON_17};
use usls::{coco, models::RTMO, Annotator, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // build model
    let options = Options::default()
        .with_model("../rtmo-l-dyn-f16.onnx")
        .with_i00((1, 2, 8).into())
        .with_model("../rtmo-s-dyn.onnx")
        .with_i00((1, 1, 8).into())
        .with_nk(17)
        .with_confs(&[0.3])
        .with_kconfs(&[0.5]);
@ -19,7 +19,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
    // // annotate
    let annotator = Annotator::default()
        .with_saveout("RTMO")
        .with_skeletons(&COCO_SKELETON_17);
        .with_skeletons(&coco::SKELETONS_16);
    annotator.annotate(&x, &y);

    Ok(())
|
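The annotator receives a skeleton through `with_skeletons`, and the `Annotator` struct in this commit stores it as `Option<Vec<(usize, usize)>>`, that is, pairs of keypoint indices. How the annotator draws them is not part of this hunk. The sketch below shows one plausible way to turn such pairs into line segments; the `Kpt` struct, `skeleton_segments`, and the confidence cut-off are assumptions made for illustration.

```rust
/// Minimal keypoint type for this sketch (not the `usls::Keypoint` type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Kpt {
    x: f32,
    y: f32,
    conf: f32,
}

type Segment = ((f32, f32), (f32, f32));

/// One segment per skeleton pair whose two endpoints both exist and both
/// reach `min_conf`. Pairs with an out-of-range index are skipped.
fn skeleton_segments(kpts: &[Kpt], skeleton: &[(usize, usize)], min_conf: f32) -> Vec<Segment> {
    skeleton
        .iter()
        .filter_map(|&(i, j)| {
            let (a, b) = (kpts.get(i)?, kpts.get(j)?);
            (a.conf >= min_conf && b.conf >= min_conf).then(|| ((a.x, a.y), (b.x, b.y)))
        })
        .collect()
}

fn main() {
    let kpts = [
        Kpt { x: 0.0, y: 0.0, conf: 0.9 },
        Kpt { x: 1.0, y: 2.0, conf: 0.8 },
        Kpt { x: 5.0, y: 5.0, conf: 0.2 }, // too uncertain to connect
    ];
    let skeleton = [(0, 1), (1, 2)];
    for (from, to) in skeleton_segments(&kpts, &skeleton, 0.5) {
        println!("draw line {from:?} -> {to:?}");
    }
}
```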
@ -24,9 +24,13 @@ cargo run -r --example svtr
|
||||
## Results
|
||||
|
||||
```shell
|
||||
[Texts] from the background, but also separate text instances which
|
||||
[Texts] are closely jointed. Some examples are illustrated in Fig.7.
|
||||
[Texts] 你有这么高速运转的机械进入中国,记住我给出的原理
|
||||
[Texts] 110022345
|
||||
[Texts] 冀B6G000
|
||||
```
|
||||
["./examples/svtr/images/5.png"]: Some(["are closely jointed. Some examples are illustrated in Fig.7."])
|
||||
["./examples/svtr/images/6.png"]: Some(["小菊儿胡同71号"])
|
||||
["./examples/svtr/images/4.png"]: Some(["我在南锣鼓捣猫呢"])
|
||||
["./examples/svtr/images/1.png"]: Some(["你有这么高速运转的机械进入中国,记住我给出的原理"])
|
||||
["./examples/svtr/images/2.png"]: Some(["冀B6G000"])
|
||||
["./examples/svtr/images/9.png"]: Some(["from the background, but also separate text instances which"])
|
||||
["./examples/svtr/images/8.png"]: Some(["110022345"])
|
||||
["./examples/svtr/images/3.png"]: Some(["粤A·68688"])
|
||||
["./examples/svtr/images/7.png"]: Some(["Please lower your volume"])
|
||||
```
|
Before Width: | Height: | Size: 14 KiB After Width: | Height: | Size: 14 KiB |
BIN
examples/svtr/images/2.png
Normal file
After Width: | Height: | Size: 13 KiB |
BIN
examples/svtr/images/3.png
Normal file
After Width: | Height: | Size: 59 KiB |
BIN
examples/svtr/images/4.png
Normal file
After Width: | Height: | Size: 15 KiB |
Before Width: | Height: | Size: 17 KiB After Width: | Height: | Size: 17 KiB |
BIN
examples/svtr/images/6.png
Normal file
After Width: | Height: | Size: 10 KiB |
BIN
examples/svtr/images/7.png
Normal file
After Width: | Height: | Size: 13 KiB |
Before Width: | Height: | Size: 24 KiB After Width: | Height: | Size: 24 KiB |
Before Width: | Height: | Size: 9.0 KiB After Width: | Height: | Size: 9.0 KiB |
@ -5,23 +5,20 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
let options = Options::default()
|
||||
.with_i00((1, 2, 8).into())
|
||||
.with_i03((320, 960, 1600).into())
|
||||
.with_confs(&[0.4])
|
||||
.with_confs(&[0.2])
|
||||
.with_vocab("../ppocr_rec_vocab.txt")
|
||||
.with_model("../models/ppocr-v4-svtr-ch-dyn.onnx");
|
||||
let mut model = SVTR::new(&options)?;
|
||||
|
||||
// load image
|
||||
let xs = vec![
|
||||
DataLoader::try_read("./examples/svtr/text1.png")?,
|
||||
DataLoader::try_read("./examples/svtr/text2.png")?,
|
||||
DataLoader::try_read("./examples/svtr/text3.png")?,
|
||||
DataLoader::try_read("./examples/svtr/text4.png")?,
|
||||
DataLoader::try_read("./examples/svtr/text5.png")?,
|
||||
];
|
||||
// load images
|
||||
let dl = DataLoader::default()
|
||||
.with_batch(1)
|
||||
.load("./examples/svtr/images")?;
|
||||
|
||||
// run
|
||||
for text in model.run(&xs)?.into_iter() {
|
||||
println!("[Texts] {text}")
|
||||
for (xs, paths) in dl {
|
||||
let ys = model.run(&xs)?;
|
||||
println!("{paths:?}: {:?}", ys[0].texts())
|
||||
}
|
||||
|
||||
Ok(())
|
||||
|
Before Width: | Height: | Size: 14 KiB |
@ -40,4 +40,4 @@ cargo run -r --example yolo-world
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 216 KiB |
BIN
examples/yolo-world/demo.png
Normal file
After Width: | Height: | Size: 453 KiB |
Before Width: | Height: | Size: 922 KiB After Width: | Height: | Size: 296 KiB |
@ -5,8 +5,6 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
let options = Options::default()
|
||||
.with_model("../models/yolopv2-dyn-480x800.onnx")
|
||||
.with_i00((1, 1, 8).into())
|
||||
// .with_trt(0)
|
||||
// .with_fp16(true)
|
||||
.with_confs(&[0.3]);
|
||||
let mut model = YOLOPv2::new(&options)?;
|
||||
|
||||
@ -18,7 +16,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
|
||||
// annotate
|
||||
let annotator = Annotator::default()
|
||||
.with_masks_name(false)
|
||||
.with_masks_name(true)
|
||||
.with_saveout("YOLOPv2");
|
||||
annotator.annotate(&x, &y);
|
||||
|
||||
|
BIN
examples/yolov5/demo.png
Normal file
After Width: | Height: | Size: 395 KiB |
32
examples/yolov5/main.rs
Normal file
@ -0,0 +1,32 @@
use usls::{
    models::{YOLOTask, YOLO},
    Annotator, DataLoader, Options,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // build model
    let options = Options::default()
        .with_conf_independent(true)
        .with_anchors_first(true)
        .with_yolo_task(YOLOTask::Segment)
        .with_model("../models/yolov5s-seg.onnx")
        .with_trt(0)
        .with_fp16(true)
        .with_i00((1, 1, 4).into())
        .with_i02((224, 640, 800).into())
        .with_i03((224, 640, 800).into())
        .with_dry_run(3);
    let mut model = YOLO::new(&options)?;

    // load image
    let x = vec![DataLoader::try_read("./assets/bus.jpg")?];

    // run
    let y = model.run(&x)?;

    // annotate
    let annotator = Annotator::default().with_saveout("YOLOv5");
    annotator.annotate(&x, &y);

    Ok(())
}
|
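The YOLOv5 example sets `with_conf_independent(true)` and `with_anchors_first(true)`. The diff does not document these options. One plausible reading is that they describe the YOLOv5 output layout: one row per anchor, each row holding `[cx, cy, w, h, objectness, class scores...]`, with the final confidence being objectness times the best class score. The sketch below decodes a single row under that assumption. `decode_row` is a hypothetical helper, not a `usls` function.

```rust
/// Decode one row laid out as [cx, cy, w, h, objectness, class scores...].
/// Returns (class_id, confidence) with confidence = objectness * best class score,
/// or `None` if the row is too short or carries no class scores.
fn decode_row(row: &[f32]) -> Option<(usize, f32)> {
    let (head, classes) = (row.get(..5)?, row.get(5..)?);
    let objectness = head[4];
    classes
        .iter()
        .copied()
        .enumerate()
        .max_by(|l, r| l.1.total_cmp(&r.1))
        .map(|(id, score)| (id, objectness * score))
}

fn main() {
    // Box fields, objectness 0.5, then three class scores (illustrative values).
    let row = [320.0, 240.0, 100.0, 50.0, 0.5, 0.1, 0.8, 0.3];
    match decode_row(&row) {
        Some((id, conf)) => println!("class {id}, confidence {conf:.2}"),
        None => println!("malformed row"),
    }
}
```

YOLOv8 heads drop the separate objectness column, which may be why the YOLOv5 example needs the extra option while the YOLOv8 examples do not.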
@ -10,4 +10,4 @@ cargo run -r --example yolov8-face
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 129 KiB |
BIN
examples/yolov8-face/demo.png
Normal file
After Width: | Height: | Size: 285 KiB |
@ -11,4 +11,4 @@ cargo run -r --example yolov8-falldown
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 37 KiB |
BIN
examples/yolov8-falldown/demo.png
Normal file
After Width: | Height: | Size: 57 KiB |
@ -2,9 +2,7 @@ use usls::{models::YOLO, Annotator, DataLoader, Options};
|
||||
|
||||
fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
// build model
|
||||
let options = Options::default()
|
||||
.with_model("../models/yolov8-falldown-f16.onnx")
|
||||
.with_confs(&[0.3]);
|
||||
let options = Options::default().with_model("../models/yolov8-falldown-f16.onnx");
|
||||
let mut model = YOLO::new(&options)?;
|
||||
|
||||
// load image
|
||||
|
@ -11,4 +11,4 @@ cargo run -r --example yolov8-head
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 134 KiB |
BIN
examples/yolov8-head/demo.png
Normal file
After Width: | Height: | Size: 291 KiB |
@ -2,9 +2,7 @@ use usls::{models::YOLO, Annotator, DataLoader, Options};
|
||||
|
||||
fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
// build model
|
||||
let options = Options::default()
|
||||
.with_model("../models/yolov8-head-f16.onnx")
|
||||
.with_confs(&[0.3]);
|
||||
let options = Options::default().with_model("../models/yolov8-head-f16.onnx");
|
||||
let mut model = YOLO::new(&options)?;
|
||||
|
||||
// load image
|
||||
|
@ -13,4 +13,4 @@ cargo run -r --example yolov8-trash
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 214 KiB |
BIN
examples/yolov8-trash/demo.png
Normal file
After Width: | Height: | Size: 367 KiB |
@ -4,7 +4,6 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
// 1.build model
|
||||
let options = Options::default()
|
||||
.with_model("../models/yolov8-plastic-bag-f16.onnx")
|
||||
.with_confs(&[0.3])
|
||||
.with_names(&["trash"]);
|
||||
let mut model = YOLO::new(&options)?;
|
||||
|
||||
|
@ -14,19 +14,22 @@ yolo export model=yolov8m.pt format=onnx simplify dynamic
yolo export model=yolov8m-cls.pt format=onnx simplify dynamic
yolo export model=yolov8m-pose.pt format=onnx simplify dynamic
yolo export model=yolov8m-seg.pt format=onnx simplify dynamic
yolo export model=yolov8m-obb.pt format=onnx simplify dynamic

# export onnx model with fixed shapes
yolo export model=yolov8m.pt format=onnx simplify
yolo export model=yolov8m-cls.pt format=onnx simplify
yolo export model=yolov8m-pose.pt format=onnx simplify
yolo export model=yolov8m-seg.pt format=onnx simplify
yolo export model=yolov8m-obb.pt format=onnx simplify
```

## Result

| Task | Annotated image |
| :-------------------: | --------------------- |
| Obb |  |
| Instance Segmentation |  |
| Classification |  |
| Classification |  |
| Detection |  |
| Pose |  |
|
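The new Obb row covers oriented bounding boxes, where each detection is a rotated rectangle and not an axis-aligned one. To draw such a box, its four corners are needed. The sketch below assumes a `(cx, cy, w, h, angle)` parameterisation with the angle in radians, which this diff does not spell out. `obb_corners` is a hypothetical helper and says nothing about how the `Mbr` type added in this commit is implemented.

```rust
/// Corners of a rotated box given its centre, size, and rotation in radians.
/// The rotation follows the usual 2D rotation matrix; in image coordinates
/// (y pointing down) a positive angle therefore appears clockwise.
fn obb_corners(cx: f32, cy: f32, w: f32, h: f32, angle: f32) -> [(f32, f32); 4] {
    let (sin, cos) = angle.sin_cos();
    let (hw, hh) = (w / 2.0, h / 2.0);
    // Corner offsets of the unrotated box, walked around the perimeter.
    [(-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh)]
        .map(|(dx, dy)| (cx + dx * cos - dy * sin, cy + dx * sin + dy * cos))
}

fn main() {
    let corners = obb_corners(10.0, 20.0, 4.0, 2.0, std::f32::consts::FRAC_PI_4);
    for (x, y) in corners {
        println!("({x:.2}, {y:.2})");
    }
}
```

With an angle of zero the corners reduce to the ordinary axis-aligned box, which is a quick sanity check for the formula.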
Before Width: | Height: | Size: 221 KiB |
BIN
examples/yolov8/demo-cls.png
Normal file
After Width: | Height: | Size: 453 KiB |
Before Width: | Height: | Size: 1.8 MiB After Width: | Height: | Size: 451 KiB |
BIN
examples/yolov8/demo-obb-2.png
Normal file
After Width: | Height: | Size: 546 KiB |
BIN
examples/yolov8/demo-obb.png
Normal file
After Width: | Height: | Size: 552 KiB |
Before Width: | Height: | Size: 1.8 MiB After Width: | Height: | Size: 457 KiB |
Before Width: | Height: | Size: 1.6 MiB After Width: | Height: | Size: 387 KiB |
@ -1,38 +1,70 @@
|
||||
use usls::{
|
||||
models::YOLO, Annotator, DataLoader, Options, COCO_KEYPOINT_NAMES_17, COCO_SKELETON_17,
|
||||
};
|
||||
use usls::{coco, models::YOLO, Annotator, DataLoader, Options};
|
||||
|
||||
fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
// build model
|
||||
let options = Options::default()
|
||||
.with_model("../models/yolov8m-dyn-f16.onnx")
|
||||
// .with_trt(0) // cuda by default
|
||||
// .with_model("../models/yolov8m.onnx")
|
||||
// .with_model("../models/yolov8m-dyn-f16.onnx")
|
||||
// .with_model("../models/yolov8m-pose-dyn-f16.onnx")
|
||||
// .with_model("../models/yolov8m-seg-dyn-f16.onnx")
|
||||
.with_model("../models/yolov8s-cls.onnx")
|
||||
// .with_model("../models/yolov8s-obb.onnx")
|
||||
// .with_trt(0)
|
||||
// .with_fp16(true)
|
||||
.with_i00((1, 1, 4).into())
|
||||
.with_i02((224, 640, 800).into())
|
||||
.with_i03((224, 640, 800).into())
|
||||
.with_i02((224, 1024, 1024).into())
|
||||
.with_i03((224, 1024, 1024).into())
|
||||
// .with_i02((224, 640, 800).into())
|
||||
// .with_i03((224, 640, 800).into())
|
||||
.with_confs(&[0.4, 0.15]) // person: 0.4, others: 0.15
|
||||
.with_names2(&COCO_KEYPOINT_NAMES_17)
|
||||
.with_profile(false)
|
||||
.with_dry_run(3);
|
||||
.with_names2(&coco::KEYPOINTS_NAMES_17)
|
||||
.with_profile(true)
|
||||
.with_dry_run(10);
|
||||
let mut model = YOLO::new(&options)?;
|
||||
|
||||
// build dataloader
|
||||
let dl = DataLoader::default()
|
||||
.with_batch(1)
|
||||
.load("./assets/bus.jpg")?;
|
||||
// .load("./assets/dota.png")?;
|
||||
|
||||
// build annotator
|
||||
let annotator = Annotator::default()
|
||||
.with_skeletons(&COCO_SKELETON_17)
|
||||
.without_conf(false)
|
||||
.without_name(false)
|
||||
.with_keypoints_name(false)
|
||||
.with_keypoints_conf(false)
|
||||
.with_masks_name(false)
|
||||
.without_masks(false)
|
||||
.without_polygons(false)
|
||||
.without_bboxes(false)
|
||||
// .with_probs_topk(10)
|
||||
// // bboxes
|
||||
// .without_bboxes(false)
|
||||
// .without_bboxes_conf(false)
|
||||
// .without_bboxes_name(false)
|
||||
// .without_bboxes_text_bg(false)
|
||||
// .with_bboxes_text_color([255, 255, 255, 255])
|
||||
// .with_bboxes_text_bg_alpha(255)
|
||||
// // keypoints
|
||||
// .without_keypoints(false)
|
||||
// .with_keypoints_palette(&COCO_KEYPOINT_COLORS_17)
|
||||
.with_skeletons(&coco::SKELETONS_16)
|
||||
// .with_keypoints_name(false)
|
||||
// .with_keypoints_conf(false)
|
||||
// .without_keypoints_text_bg(false)
|
||||
// .with_keypoints_text_color([255, 255, 255, 255])
|
||||
// .with_keypoints_text_bg_alpha(255)
|
||||
// .with_keypoints_radius(4)
|
||||
// // masks
|
||||
// .without_masks(false)
|
||||
// .with_masks_alpha(190)
|
||||
// .without_polygons(false)
|
||||
// // .with_polygon_color([0, 255, 255, 255])
|
||||
// .with_masks_conf(false)
|
||||
// .with_masks_name(true)
|
||||
// .with_masks_text_bg(true)
|
||||
// .with_masks_text_color([255, 255, 255, 255])
|
||||
// .with_masks_text_bg_alpha(10)
|
||||
// // mbrs
|
||||
// .without_mbrs(false)
|
||||
// .without_mbrs_conf(false)
|
||||
// .without_mbrs_name(false)
|
||||
// .without_mbrs_text_bg(false)
|
||||
// .with_mbrs_text_color([255, 255, 255, 255])
|
||||
// .with_mbrs_text_bg_alpha(70)
|
||||
.with_saveout("YOLOv8");
|
||||
|
||||
// run & annotate
|
||||
|
@ -26,4 +26,4 @@ cargo run -r --example yolov9
|
||||
|
||||
## Results
|
||||
|
||||

|
||||

|
||||
|
Before Width: | Height: | Size: 232 KiB |
BIN
examples/yolov9/demo.png
Normal file
After Width: | Height: | Size: 450 KiB |
@ -7,8 +7,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
.with_i00((1, 1, 4).into())
|
||||
.with_i02((416, 640, 800).into())
|
||||
.with_i03((416, 640, 800).into())
|
||||
.with_confs(&[0.4, 0.15]) // person: 0.4, others: 0.15
|
||||
.with_profile(false);
|
||||
.with_confs(&[0.4, 0.15]); // person: 0.4, others: 0.15
|
||||
let mut model = YOLO::new(&options)?;
|
||||
|
||||
// load image
|
||||
|
@ -1,64 +1,135 @@
|
||||
use crate::{auto_load, string_now, Bbox, Embedding, Keypoint, Mask, Ys, CHECK_MARK, CROSS_MARK};
|
||||
use crate::{auto_load, string_now, Bbox, Keypoint, Mask, Mbr, Prob, CHECK_MARK, CROSS_MARK, Y};
|
||||
use ab_glyph::{FontVec, PxScale};
|
||||
use anyhow::Result;
|
||||
use image::{DynamicImage, Rgba, RgbaImage};
|
||||
|
||||
/// Annotator for struct `Y`
|
||||
#[derive(Debug)]
|
||||
pub struct Annotator {
|
||||
font: ab_glyph::FontVec,
|
||||
scale_: f32, // Cope with ab_glyph & imageproc=0.24.0
|
||||
skeletons: Option<Vec<(usize, usize)>>,
|
||||
font: FontVec,
|
||||
_scale: f32, // Cope with ab_glyph & imageproc=0.24.0
|
||||
scale_dy: f32,
|
||||
saveout: Option<String>,
|
||||
mask_alpha: u8,
|
||||
polygon_color: Rgba<u8>,
|
||||
without_conf: bool,
|
||||
without_name: bool,
|
||||
// About mbrs
|
||||
without_mbrs: bool,
|
||||
without_mbrs_conf: bool,
|
||||
without_mbrs_name: bool,
|
||||
without_mbrs_text_bg: bool,
|
||||
mbrs_text_color: Rgba<u8>,
|
||||
// About bboxes
|
||||
without_bboxes: bool,
|
||||
without_bboxes_conf: bool,
|
||||
without_bboxes_name: bool,
|
||||
without_bboxes_text_bg: bool,
|
||||
bboxes_text_color: Rgba<u8>,
|
||||
// About keypoints
|
||||
without_keypoints: bool,
|
||||
with_keypoints_conf: bool,
|
||||
with_keypoints_name: bool,
|
||||
with_masks_name: bool,
|
||||
without_bboxes: bool,
|
||||
without_keypoints_text_bg: bool,
|
||||
keypoints_text_color: Rgba<u8>,
|
||||
skeletons: Option<Vec<(usize, usize)>>,
|
||||
keypoints_radius: usize,
|
||||
keypoints_palette: Option<Vec<(u8, u8, u8, u8)>>,
|
||||
// About masks
|
||||
without_masks: bool,
|
||||
without_polygons: bool,
|
||||
without_keypoints: bool,
|
||||
keypoint_radius: usize,
|
||||
with_masks_conf: bool,
|
||||
with_masks_name: bool,
|
||||
with_masks_text_bg: bool,
|
||||
masks_text_color: Rgba<u8>,
|
||||
masks_alpha: u8,
|
||||
polygon_color: Rgba<u8>,
|
||||
// About probs
|
||||
probs_topk: usize,
|
||||
}
|
||||
|
||||
impl Default for Annotator {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
font: Self::load_font(None).unwrap(),
|
||||
scale_: 6.666667,
|
||||
mask_alpha: 179,
|
||||
polygon_color: Rgba([255, 255, 255, 255]),
|
||||
skeletons: None,
|
||||
_scale: 6.666667,
|
||||
scale_dy: 28.,
|
||||
masks_alpha: 179,
|
||||
saveout: None,
|
||||
without_conf: false,
|
||||
without_name: false,
|
||||
without_bboxes: false,
|
||||
without_bboxes_conf: false,
|
||||
without_bboxes_name: false,
|
||||
bboxes_text_color: Rgba([0, 0, 0, 255]),
|
||||
without_bboxes_text_bg: false,
|
||||
without_mbrs: false,
|
||||
without_mbrs_conf: false,
|
||||
without_mbrs_name: false,
|
||||
without_mbrs_text_bg: false,
|
||||
mbrs_text_color: Rgba([0, 0, 0, 255]),
|
||||
without_keypoints: false,
|
||||
with_keypoints_conf: false,
|
||||
with_keypoints_name: false,
|
||||
with_masks_name: false,
|
||||
without_bboxes: false,
|
||||
keypoints_radius: 3,
|
||||
skeletons: None,
|
||||
keypoints_palette: None,
|
||||
without_keypoints_text_bg: false,
|
||||
keypoints_text_color: Rgba([0, 0, 0, 255]),
|
||||
without_masks: false,
|
||||
without_polygons: false,
|
||||
without_keypoints: false,
|
||||
keypoint_radius: 3,
|
||||
polygon_color: Rgba([255, 255, 255, 255]),
|
||||
with_masks_name: false,
|
||||
with_masks_conf: false,
|
||||
with_masks_text_bg: false,
|
||||
masks_text_color: Rgba([255, 255, 255, 255]),
|
||||
probs_topk: 5usize,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Annotator {
|
||||
pub fn with_keypoint_radius(mut self, x: usize) -> Self {
|
||||
self.keypoint_radius = x;
|
||||
pub fn without_bboxes(mut self, x: bool) -> Self {
|
||||
self.without_bboxes = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_conf(mut self, x: bool) -> Self {
|
||||
self.without_conf = x;
|
||||
pub fn without_bboxes_conf(mut self, x: bool) -> Self {
|
||||
self.without_bboxes_conf = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_name(mut self, x: bool) -> Self {
|
||||
self.without_name = x;
|
||||
pub fn without_bboxes_name(mut self, x: bool) -> Self {
|
||||
self.without_bboxes_name = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_bboxes_text_bg(mut self, x: bool) -> Self {
|
||||
self.without_bboxes_text_bg = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_bboxes_text_bg_alpha(mut self, x: u8) -> Self {
|
||||
self.bboxes_text_color.0[3] = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_bboxes_text_color(mut self, rgba: [u8; 4]) -> Self {
|
||||
self.bboxes_text_color = Rgba(rgba);
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_keypoints(mut self, x: bool) -> Self {
|
||||
self.without_keypoints = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_skeletons(mut self, x: &[(usize, usize)]) -> Self {
|
||||
self.skeletons = Some(x.to_vec());
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_keypoints_palette(mut self, x: &[(u8, u8, u8, u8)]) -> Self {
|
||||
self.keypoints_palette = Some(x.to_vec());
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_keypoints_radius(mut self, x: usize) -> Self {
|
||||
self.keypoints_radius = x;
|
||||
self
|
||||
}
|
||||
|
||||
@ -72,13 +143,48 @@ impl Annotator {
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_masks_name(mut self, x: bool) -> Self {
|
||||
self.with_masks_name = x;
|
||||
pub fn with_keypoints_text_color(mut self, rgba: [u8; 4]) -> Self {
|
||||
self.keypoints_text_color = Rgba(rgba);
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_bboxes(mut self, x: bool) -> Self {
|
||||
self.without_bboxes = x;
|
||||
pub fn without_keypoints_text_bg(mut self, x: bool) -> Self {
|
||||
self.without_keypoints_text_bg = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_keypoints_text_bg_alpha(mut self, x: u8) -> Self {
|
||||
self.keypoints_text_color.0[3] = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_mbrs(mut self, x: bool) -> Self {
|
||||
self.without_mbrs = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_mbrs_conf(mut self, x: bool) -> Self {
|
||||
self.without_mbrs_conf = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_mbrs_name(mut self, x: bool) -> Self {
|
||||
self.without_mbrs_name = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_mbrs_text_bg(mut self, x: bool) -> Self {
|
||||
self.without_mbrs_text_bg = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_mbrs_text_color(mut self, rgba: [u8; 4]) -> Self {
|
||||
self.mbrs_text_color = Rgba(rgba);
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_mbrs_text_bg_alpha(mut self, x: u8) -> Self {
|
||||
self.mbrs_text_color.0[3] = x;
|
||||
self
|
||||
}
|
||||
|
||||
@ -92,8 +198,33 @@ impl Annotator {
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_mask_alpha(mut self, x: u8) -> Self {
|
||||
self.mask_alpha = x;
|
||||
pub fn with_masks_conf(mut self, x: bool) -> Self {
|
||||
self.with_masks_conf = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_masks_name(mut self, x: bool) -> Self {
|
||||
self.with_masks_name = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_masks_text_bg(mut self, x: bool) -> Self {
|
||||
self.with_masks_text_bg = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_masks_text_color(mut self, rgba: [u8; 4]) -> Self {
|
||||
self.masks_text_color = Rgba(rgba);
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_masks_alpha(mut self, x: u8) -> Self {
|
||||
self.masks_alpha = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_masks_text_bg_alpha(mut self, x: u8) -> Self {
|
||||
self.masks_text_color.0[3] = x;
|
||||
self
|
||||
}
|
||||
|
||||
@ -102,8 +233,8 @@ impl Annotator {
|
||||
self
|
||||
}
|
||||
|
||||
pub fn without_keypoints(mut self, x: bool) -> Self {
|
||||
self.without_keypoints = x;
|
||||
pub fn with_probs_topk(mut self, x: usize) -> Self {
|
||||
self.probs_topk = x;
|
||||
self
|
||||
}
|
||||
|
||||
@ -112,11 +243,6 @@ impl Annotator {
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_skeletons(mut self, skeletons: &[(usize, usize)]) -> Self {
|
||||
self.skeletons = Some(skeletons.to_vec());
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_font(mut self, path: &str) -> Self {
|
||||
self.font = Self::load_font(Some(path)).unwrap();
|
||||
self
|
||||
@ -135,36 +261,44 @@ impl Annotator {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn annotate(&self, imgs: &[DynamicImage], ys: &[Ys]) {
|
||||
pub fn annotate(&self, imgs: &[DynamicImage], ys: &[Y]) {
|
||||
for (img, y) in imgs.iter().zip(ys.iter()) {
|
||||
let mut img_rgb = img.to_rgba8();
|
||||
|
||||
// masks
|
||||
if !self.without_polygons {
|
||||
if let Some(xs) = &y.masks {
|
||||
self.plot_polygons(&mut img_rgb, xs)
|
||||
if !self.without_masks {
|
||||
if let Some(xs) = &y.masks() {
|
||||
self.plot_masks_and_polygons(&mut img_rgb, xs)
|
||||
}
|
||||
}
|
||||
|
||||
// bboxes
|
||||
if !self.without_bboxes {
|
||||
if let Some(xs) = &y.bboxes {
|
||||
if let Some(xs) = &y.bboxes() {
|
||||
self.plot_bboxes(&mut img_rgb, xs)
|
||||
}
|
||||
}
|
||||
|
||||
// mbrs
|
||||
if !self.without_mbrs {
|
||||
if let Some(xs) = &y.mbrs() {
|
||||
self.plot_mbrs(&mut img_rgb, xs)
|
||||
}
|
||||
}
|
||||
|
||||
// keypoints
|
||||
if !self.without_keypoints {
|
||||
if let Some(xs) = &y.keypoints {
|
||||
if let Some(xs) = &y.keypoints() {
|
||||
self.plot_keypoints(&mut img_rgb, xs)
|
||||
}
|
||||
}
|
||||
|
||||
// probs
|
||||
if let Some(xs) = &y.probs {
|
||||
if let Some(xs) = &y.probs() {
|
||||
self.plot_probs(&mut img_rgb, xs)
|
||||
}
|
||||
|
||||
// save
|
||||
if let Some(saveout) = &self.saveout {
|
||||
self.save(&img_rgb, saveout);
|
||||
}
|
||||
@ -173,127 +307,149 @@ impl Annotator {
|
||||
|
||||
pub fn plot_bboxes(&self, img: &mut RgbaImage, bboxes: &[Bbox]) {
|
||||
for bbox in bboxes.iter() {
|
||||
// bboxes
|
||||
imageproc::drawing::draw_hollow_rect_mut(
|
||||
img,
|
||||
imageproc::rect::Rect::at(bbox.xmin().round() as i32, bbox.ymin().round() as i32)
|
||||
.of_size(bbox.width().round() as u32, bbox.height().round() as u32),
|
||||
image::Rgba(self.get_color(bbox.id()).into()),
|
||||
image::Rgba(self.get_color(bbox.id() as usize).into()),
|
||||
);
|
||||
|
||||
// texts
|
||||
let mut legend = String::new();
|
||||
if !self.without_name {
|
||||
if !self.without_bboxes_name {
|
||||
legend.push_str(&bbox.name().unwrap_or(&bbox.id().to_string()).to_string());
|
||||
}
|
||||
if !self.without_conf {
|
||||
if !self.without_name {
|
||||
if !self.without_bboxes_conf {
|
||||
if !self.without_bboxes_name {
|
||||
legend.push_str(&format!(": {:.4}", bbox.confidence()));
|
||||
} else {
|
||||
legend.push_str(&format!("{:.4}", bbox.confidence()));
|
||||
}
|
||||
}
|
||||
if !legend.is_empty() {
|
||||
let scale_dy = img.width().max(img.height()) as f32 / 40.0;
|
||||
let scale = PxScale::from(scale_dy);
|
||||
let (text_w, text_h) = imageproc::drawing::text_size(scale, &self.font, &legend); // u32
|
||||
let text_h = text_h + text_h / 3;
|
||||
let top = if bbox.ymin() > text_h as f32 {
|
||||
(bbox.ymin().round() as u32 - text_h) as i32
|
||||
} else {
|
||||
(text_h - bbox.ymin().round() as u32) as i32
|
||||
};
|
||||
let mut left = bbox.xmin() as i32;
|
||||
if left + text_w as i32 > img.width() as i32 {
|
||||
left = img.width() as i32 - text_w as i32;
|
||||
}
|
||||
imageproc::drawing::draw_filled_rect_mut(
|
||||
img,
|
||||
imageproc::rect::Rect::at(left, top).of_size(text_w, text_h),
|
||||
image::Rgba(self.get_color(bbox.id()).into()),
|
||||
);
|
||||
imageproc::drawing::draw_text_mut(
|
||||
img,
|
||||
image::Rgba([0, 0, 0, 255]),
|
||||
left,
|
||||
top - (scale_dy / self.scale_).floor() as i32 + 2,
|
||||
scale,
|
||||
&self.font,
|
||||
&legend,
|
||||
);
|
||||
}
|
||||
self.put_text(
|
||||
img,
|
||||
legend.as_str(),
|
||||
bbox.xmin(),
|
||||
bbox.ymin(),
|
||||
image::Rgba(self.get_color(bbox.id() as usize).into()),
|
||||
self.bboxes_text_color,
|
||||
self.without_bboxes_text_bg,
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
pub fn plot_polygons(&self, img: &mut RgbaImage, masks: &[Mask]) {
|
||||
pub fn plot_mbrs(&self, img: &mut RgbaImage, mbrs: &[Mbr]) {
|
||||
for mbr in mbrs.iter() {
|
||||
// mbrs
|
||||
for i in 0..mbr.vertices().len() {
|
||||
let p1 = mbr.vertices()[i];
|
||||
let p2 = mbr.vertices()[(i + 1) % mbr.vertices().len()];
|
||||
imageproc::drawing::draw_line_segment_mut(
|
||||
img,
|
||||
(p1.x.round() as f32, p1.y.round() as f32),
|
||||
(p2.x.round() as f32, p2.y.round() as f32),
|
||||
image::Rgba(self.get_color(mbr.id() as usize).into()),
|
||||
);
|
||||
}
|
||||
|
||||
// text
|
||||
let mut legend = String::new();
|
||||
if !self.without_mbrs_name {
|
||||
legend.push_str(&mbr.name().unwrap_or(&mbr.id().to_string()).to_string());
|
||||
}
|
||||
if !self.without_mbrs_conf {
|
||||
if !self.without_mbrs_name {
|
||||
legend.push_str(&format!(": {:.4}", mbr.confidence()));
|
||||
} else {
|
||||
legend.push_str(&format!("{:.4}", mbr.confidence()));
|
||||
}
|
||||
}
|
||||
self.put_text(
|
||||
img,
|
||||
legend.as_str(),
|
||||
mbr.top().x as f32,
|
||||
mbr.top().y as f32,
|
||||
image::Rgba(self.get_color(mbr.id() as usize).into()),
|
||||
self.mbrs_text_color,
|
||||
self.without_mbrs_text_bg,
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
pub fn plot_masks_and_polygons(&self, img: &mut RgbaImage, masks: &[Mask]) {
|
||||
let mut convas = img.clone();
|
||||
for mask in masks.iter() {
|
||||
// mask
|
||||
let mut polygon_i32 = mask
|
||||
.polygon
|
||||
.points
|
||||
.iter()
|
||||
.map(|p| imageproc::point::Point::new(p.x as i32, p.y as i32))
|
||||
// masks
|
||||
let polygon_i32 = mask
|
||||
.polygon()
|
||||
.exterior()
|
||||
.points()
|
||||
.take(if mask.is_closed() {
|
||||
mask.count() - 1
|
||||
} else {
|
||||
mask.count()
|
||||
})
|
||||
.map(|p| imageproc::point::Point::new(p.x() as i32, p.y() as i32))
|
||||
.collect::<Vec<_>>();
|
||||
if polygon_i32.first() == polygon_i32.last() {
|
||||
polygon_i32.pop();
|
||||
}
|
||||
let mut mask_color = self.get_color(mask.id);
|
||||
mask_color.3 = self.mask_alpha;
|
||||
let mut mask_color = self.get_color(mask.id() as usize);
|
||||
mask_color.3 = self.masks_alpha;
|
||||
imageproc::drawing::draw_polygon_mut(
|
||||
&mut convas,
|
||||
&polygon_i32,
|
||||
Rgba(mask_color.into()),
|
||||
);
|
||||
|
||||
// contour
|
||||
let polygon_f32 = mask
|
||||
.polygon
|
||||
.points
|
||||
.iter()
|
||||
.map(|p| imageproc::point::Point::new(p.x, p.y))
|
||||
.collect::<Vec<_>>();
|
||||
imageproc::drawing::draw_hollow_polygon_mut(img, &polygon_f32, self.polygon_color);
|
||||
|
||||
// text
|
||||
let mut legend = String::new();
|
||||
if self.with_masks_name {
|
||||
legend.push_str(&mask.name().unwrap_or(&mask.id().to_string()).to_string());
|
||||
}
|
||||
if !legend.is_empty() {
|
||||
let scale_dy = img.width().max(img.height()) as f32 / 60.0;
|
||||
let scale = PxScale::from(scale_dy);
|
||||
let (text_w, text_h) = imageproc::drawing::text_size(scale, &self.font, &legend); // u32
|
||||
let text_h = text_h + text_h / 3;
|
||||
let bbox = mask.polygon.find_min_rect();
|
||||
let top = (bbox.cy().round() as u32 - text_h) as i32;
|
||||
let mut left = (bbox.cx() as i32 - text_w as i32 / 2).max(0);
|
||||
if left + text_w as i32 > img.width() as i32 {
|
||||
left = img.width() as i32 - text_w as i32;
|
||||
}
|
||||
imageproc::drawing::draw_filled_rect_mut(
|
||||
&mut convas,
|
||||
imageproc::rect::Rect::at(left, top).of_size(text_w, text_h),
|
||||
image::Rgba(self.get_color(mask.id()).into()),
|
||||
);
|
||||
imageproc::drawing::draw_text_mut(
|
||||
&mut convas,
|
||||
image::Rgba([0, 0, 0, 255]),
|
||||
left,
|
||||
top - (scale_dy / self.scale_).floor() as i32 + 2,
|
||||
scale,
|
||||
&self.font,
|
||||
&legend,
|
||||
);
|
||||
// contours(polygons)
|
||||
if !self.without_polygons {
|
||||
let polygon_f32 = mask
|
||||
.polygon()
|
||||
.exterior()
|
||||
.points()
|
||||
.take(if mask.is_closed() {
|
||||
mask.count() - 1
|
||||
} else {
|
||||
mask.count()
|
||||
})
|
||||
.map(|p| imageproc::point::Point::new(p.x() as f32, p.y() as f32))
|
||||
.collect::<Vec<_>>();
|
||||
imageproc::drawing::draw_hollow_polygon_mut(img, &polygon_f32, self.polygon_color);
|
||||
}
|
||||
}
|
||||
image::imageops::overlay(img, &convas, 0, 0);
|
||||
|
||||
// text on top
|
||||
for mask in masks.iter() {
|
||||
if let Some((x, y)) = mask.centroid() {
|
||||
let mut legend = String::new();
|
||||
if self.with_masks_name {
|
||||
legend.push_str(&mask.name().unwrap_or(&mask.id().to_string()).to_string());
|
||||
}
|
||||
if self.with_masks_conf {
|
||||
if self.with_masks_name {
|
||||
legend.push_str(&format!(": {:.4}", mask.confidence()));
|
||||
} else {
|
||||
legend.push_str(&format!("{:.4}", mask.confidence()));
|
||||
}
|
||||
}
|
||||
self.put_text(
|
||||
img,
|
||||
legend.as_str(),
|
||||
x,
|
||||
y,
|
||||
image::Rgba(self.get_color(mask.id() as usize).into()),
|
||||
self.masks_text_color,
|
||||
!self.with_masks_text_bg,
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
pub fn plot_probs(&self, img: &mut RgbaImage, probs: &Embedding) {
|
||||
let topk = 5usize;
|
||||
pub fn plot_probs(&self, img: &mut RgbaImage, probs: &Prob) {
|
||||
let (x, mut y) = (img.width() as i32 / 20, img.height() as i32 / 20);
|
||||
for k in probs.topk(topk).iter() {
|
||||
for k in probs.topk(self.probs_topk).iter() {
|
||||
let legend = format!("{}: {:.4}", k.2.as_ref().unwrap_or(&k.0.to_string()), k.1);
|
||||
let scale_dy = img.width().max(img.height()) as f32 / 30.0;
|
||||
let scale = PxScale::from(scale_dy);
|
||||
let scale = PxScale::from(self.scale_dy);
|
||||
let (text_w, text_h) = imageproc::drawing::text_size(scale, &self.font, &legend);
|
||||
let text_h = text_h + text_h / 3;
|
||||
y += text_h as i32;
|
||||
@ -306,7 +462,7 @@ impl Annotator {
|
||||
img,
|
||||
image::Rgba([0, 0, 0, 255]),
|
||||
x,
|
||||
y - (scale_dy / self.scale_).floor() as i32 + 2,
|
||||
y - (self.scale_dy / self._scale).floor() as i32 + 2,
|
||||
scale,
|
||||
&self.font,
|
||||
&legend,
|
||||
@ -320,12 +476,20 @@ impl Annotator {
|
||||
if kpt.confidence() == 0.0 {
|
||||
continue;
|
||||
}
|
||||
|
||||
// keypoints
|
||||
let color = match &self.keypoints_palette {
|
||||
None => self.get_color(i + 10),
|
||||
Some(keypoints_palette) => keypoints_palette[i],
|
||||
};
|
||||
imageproc::drawing::draw_filled_circle_mut(
|
||||
img,
|
||||
(kpt.x() as i32, kpt.y() as i32),
|
||||
self.keypoint_radius as i32,
|
||||
image::Rgba(self.get_color(i + 10).into()),
|
||||
self.keypoints_radius as i32,
|
||||
image::Rgba(color.into()),
|
||||
);
|
||||
|
||||
// text
|
||||
let mut legend = String::new();
|
||||
if self.with_keypoints_name {
|
||||
legend.push_str(&kpt.name().unwrap_or(&kpt.id().to_string()).to_string());
|
||||
@ -337,37 +501,15 @@ impl Annotator {
|
||||
legend.push_str(&format!("{:.4}", kpt.confidence()));
|
||||
}
|
||||
}
|
||||
if !legend.is_empty() {
|
||||
let scale_dy = img.width().max(img.height()) as f32 / 80.0;
|
||||
let scale = PxScale::from(scale_dy);
|
||||
let (text_w, text_h) =
|
||||
imageproc::drawing::text_size(scale, &self.font, &legend); // u32
|
||||
let text_h = text_h + text_h / 3;
|
||||
let top = if kpt.y() > text_h as f32 {
|
||||
(kpt.y().round() as u32 - text_h - self.keypoint_radius as u32) as i32
|
||||
} else {
|
||||
(text_h - self.keypoint_radius as u32 - kpt.y().round() as u32) as i32
|
||||
};
|
||||
let mut left =
|
||||
(kpt.x() as i32 - self.keypoint_radius as i32 - text_w as i32 / 2).max(0);
|
||||
if left + text_w as i32 > img.width() as i32 {
|
||||
left = img.width() as i32 - text_w as i32;
|
||||
}
|
||||
imageproc::drawing::draw_filled_rect_mut(
|
||||
img,
|
||||
imageproc::rect::Rect::at(left, top).of_size(text_w, text_h),
|
||||
image::Rgba(self.get_color(kpt.id() as usize).into()),
|
||||
);
|
||||
imageproc::drawing::draw_text_mut(
|
||||
img,
|
||||
image::Rgba([0, 0, 0, 255]),
|
||||
left,
|
||||
top - (scale_dy / self.scale_).floor() as i32 + 2,
|
||||
scale,
|
||||
&self.font,
|
||||
&legend,
|
||||
);
|
||||
}
|
||||
self.put_text(
|
||||
img,
|
||||
legend.as_str(),
|
||||
kpt.x(),
|
||||
kpt.y(),
|
||||
image::Rgba(self.get_color(kpt.id() as usize).into()),
|
||||
self.keypoints_text_color,
|
||||
self.without_keypoints_text_bg,
|
||||
);
|
||||
}
|
||||
|
||||
// draw skeleton
|
||||
@ -389,6 +531,53 @@ impl Annotator {
|
||||
}
|
||||
}
|
||||
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
fn put_text(
|
||||
&self,
|
||||
img: &mut RgbaImage,
|
||||
legend: &str,
|
||||
x: f32,
|
||||
y: f32,
|
||||
color: Rgba<u8>,
|
||||
text_color: Rgba<u8>,
|
||||
without_text_bg: bool,
|
||||
) {
|
||||
if !legend.is_empty() {
|
||||
let scale = PxScale::from(self.scale_dy);
|
||||
let (text_w, text_h) = imageproc::drawing::text_size(scale, &self.font, legend);
|
||||
let text_h = text_h + text_h / 3;
|
||||
let top = if y > text_h as f32 {
|
||||
(y.round() as u32 - text_h) as i32
|
||||
} else {
|
||||
(text_h - y.round() as u32) as i32
|
||||
};
|
||||
let mut left = x as i32;
|
||||
if left + text_w as i32 > img.width() as i32 {
|
||||
left = img.width() as i32 - text_w as i32;
|
||||
}
|
||||
|
||||
// text bbox
|
||||
if !without_text_bg {
|
||||
imageproc::drawing::draw_filled_rect_mut(
|
||||
img,
|
||||
imageproc::rect::Rect::at(left, top).of_size(text_w, text_h),
|
||||
color,
|
||||
);
|
||||
}
|
||||
|
||||
// text
|
||||
imageproc::drawing::draw_text_mut(
|
||||
img,
|
||||
text_color,
|
||||
left,
|
||||
top - (self.scale_dy / self._scale).floor() as i32 + 2,
|
||||
scale,
|
||||
&self.font,
|
||||
legend,
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
fn load_font(path: Option<&str>) -> Result<FontVec> {
|
||||
let path_font = match path {
|
||||
None => auto_load("Arial.ttf")?,
|
||||
@ -402,28 +591,28 @@ impl Annotator {
|
||||
Self::color_palette()[n % Self::color_palette().len()]
|
||||
}
|
||||
|
||||
fn color_palette() -> Vec<(u8, u8, u8, u8)> {
|
||||
vec![
|
||||
(0, 255, 0, 255),
|
||||
(255, 128, 0, 255),
|
||||
(0, 0, 255, 255),
|
||||
(255, 153, 51, 255),
|
||||
(255, 0, 0, 255),
|
||||
(255, 51, 255, 255),
|
||||
(102, 178, 255, 255),
|
||||
(51, 153, 255, 255),
|
||||
(255, 51, 51, 255),
|
||||
(153, 255, 153, 255),
|
||||
(102, 255, 102, 255),
|
||||
(153, 204, 255, 255),
|
||||
(255, 153, 153, 255),
|
||||
(255, 178, 102, 255),
|
||||
(230, 230, 0, 255),
|
||||
(255, 153, 255, 255),
|
||||
(255, 102, 255, 255),
|
||||
(255, 102, 102, 255),
|
||||
(51, 255, 51, 255),
|
||||
(255, 255, 255, 255),
|
||||
fn color_palette() -> [(u8, u8, u8, u8); 20] {
|
||||
[
|
||||
(0, 255, 127, 255), // spring green
|
||||
(255, 105, 180, 255), // hot pink
|
||||
(255, 99, 71, 255), // tomato
|
||||
(255, 215, 0, 255), // gold
|
||||
(188, 143, 143, 255), // rosy brown
|
||||
(0, 191, 255, 255), // deep sky blue
|
||||
(143, 188, 143, 255), // dark sea green
|
||||
(238, 130, 238, 255), // violet
|
||||
(154, 205, 50, 255), // yellow green
|
||||
(205, 133, 63, 255), // peru
|
||||
(30, 144, 255, 255), // dodger blue
|
||||
(112, 128, 144, 255), // slate gray
|
||||
(127, 255, 212, 255), // aqua marine
|
||||
(51, 153, 255, 255), // blue
|
||||
(0, 255, 255, 255), // cyan
|
||||
(138, 43, 226, 255), // blue violet
|
||||
(165, 42, 42, 255), // brown
|
||||
(216, 191, 216, 255), // thistle
|
||||
(240, 255, 255, 255), // azure
|
||||
(95, 158, 160, 255), // cadet blue
|
||||
]
|
||||
}
|
||||
}
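The label placement that `put_text` performs can be looked at in isolation. The following is a minimal, std-only sketch of that branching, not the crate's code: `label_origin` is a hypothetical helper, it takes the measured text size as plain arguments instead of calling `imageproc`, and it uses signed arithmetic with the anchor's `y` clamped to zero (an assumption chosen to match the saturating `as u32` cast in the diff).

```rust
/// Hypothetical helper: returns the (left, top) corner of a text label
/// anchored at (x, y), following the same branches as `put_text`.
fn label_origin(x: f32, y: f32, text_w: u32, text_h: u32, img_w: u32) -> (i32, i32) {
    // Pad the text height by one third, as the diff does.
    let text_h = (text_h + text_h / 3) as i32;
    // Negative anchors are clamped to 0 (assumption, see lead-in).
    let y = (y.round() as i32).max(0);
    // Draw the label above the anchor when there is room for it;
    // otherwise mirror the distance so the result stays non-negative.
    let top = if y > text_h { y - text_h } else { text_h - y };
    // Shift the label left when it would run past the right image edge.
    let mut left = x as i32;
    if left + text_w as i32 > img_w as i32 {
        left = img_w as i32 - text_w as i32;
    }
    (left, top)
}
```

With a 640-pixel-wide image and a 50x12 text box, an anchor at (10, 100) keeps its x position and moves up by the padded height, while an anchor at (620, 5) is shifted left so that the label ends at the image edge.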
|
||||
|
@ -1,78 +0,0 @@
|
||||
use crate::Rect;
|
||||
|
||||
#[derive(Clone, PartialEq, Default)]
|
||||
pub struct Bbox {
|
||||
rect: Rect,
|
||||
id: usize,
|
||||
confidence: f32,
|
||||
name: Option<String>,
|
||||
}
|
||||
|
||||
impl std::fmt::Debug for Bbox {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
f.debug_struct("Bbox")
|
||||
.field("xmin", &self.rect.xmin())
|
||||
.field("ymin", &self.rect.ymin())
|
||||
.field("xmax", &self.rect.xmax())
|
||||
.field("ymax", &self.rect.ymax())
|
||||
.field("id", &self.id)
|
||||
.field("name", &self.name)
|
||||
.field("confidence", &self.confidence)
|
||||
.finish()
|
||||
}
|
||||
}
|
||||
|
||||
impl Bbox {
|
||||
pub fn new(rect: Rect, id: usize, confidence: f32, name: Option<String>) -> Self {
|
||||
Self {
|
||||
rect,
|
||||
id,
|
||||
confidence,
|
||||
name,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn width(&self) -> f32 {
|
||||
self.rect.width()
|
||||
}
|
||||
|
||||
pub fn height(&self) -> f32 {
|
||||
self.rect.height()
|
||||
}
|
||||
|
||||
pub fn xmin(&self) -> f32 {
|
||||
self.rect.xmin()
|
||||
}
|
||||
|
||||
pub fn ymin(&self) -> f32 {
|
||||
self.rect.ymin()
|
||||
}
|
||||
|
||||
pub fn xmax(&self) -> f32 {
|
||||
self.rect.xmax()
|
||||
}
|
||||
|
||||
pub fn ymax(&self) -> f32 {
|
||||
self.rect.ymax()
|
||||
}
|
||||
|
||||
pub fn id(&self) -> usize {
|
||||
self.id
|
||||
}
|
||||
|
||||
pub fn name(&self) -> Option<&String> {
|
||||
self.name.as_ref()
|
||||
}
|
||||
|
||||
pub fn confidence(&self) -> f32 {
|
||||
self.confidence
|
||||
}
|
||||
|
||||
pub fn area(&self) -> f32 {
|
||||
self.rect.area()
|
||||
}
|
||||
|
||||
pub fn iou(&self, other: &Bbox) -> f32 {
|
||||
self.rect.intersect(&other.rect) / self.rect.union(&other.rect)
|
||||
}
|
||||
}
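`Bbox::iou` above divides the intersection area by the union area, delegating both to `Rect`. Since `Rect` itself is not shown in this diff, here is a self-contained sketch of how those two quantities are typically computed for axis-aligned boxes. `DemoRect` and its fields are hypothetical stand-ins, and the zero-union guard is an addition that the original one-liner does not have.

```rust
/// Hypothetical axis-aligned rectangle; not the crate's `Rect`.
#[derive(Debug, Clone, Copy)]
struct DemoRect {
    xmin: f32,
    ymin: f32,
    xmax: f32,
    ymax: f32,
}

impl DemoRect {
    fn area(&self) -> f32 {
        (self.xmax - self.xmin).max(0.0) * (self.ymax - self.ymin).max(0.0)
    }

    /// Area of the overlap; zero when the rectangles do not overlap.
    fn intersect(&self, other: &DemoRect) -> f32 {
        let w = (self.xmax.min(other.xmax) - self.xmin.max(other.xmin)).max(0.0);
        let h = (self.ymax.min(other.ymax) - self.ymin.max(other.ymin)).max(0.0);
        w * h
    }

    /// Area covered by either rectangle (inclusion-exclusion).
    fn union(&self, other: &DemoRect) -> f32 {
        self.area() + other.area() - self.intersect(other)
    }

    /// Intersection over union, guarded against a zero union.
    fn iou(&self, other: &DemoRect) -> f32 {
        let u = self.union(other);
        if u > 0.0 {
            self.intersect(other) / u
        } else {
            0.0
        }
    }
}
```

Two 2x2 boxes offset by one pixel in each direction overlap in a 1x1 square, so their IoU is 1 / (4 + 4 - 1).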
|
@ -5,12 +5,12 @@ use std::collections::VecDeque;
|
||||
use std::path::{Path, PathBuf};
|
||||
use walkdir::{DirEntry, WalkDir};
|
||||
|
||||
/// Dataloader for loading images
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct DataLoader {
|
||||
// source could be single image path, folder with images (TODO: video, stream)
|
||||
pub paths: VecDeque<PathBuf>,
|
||||
pub recursive: bool,
|
||||
pub batch: usize,
|
||||
pub paths: VecDeque<PathBuf>,
|
||||
}
|
||||
|
||||
impl Iterator for DataLoader {
|
||||
|
@ -4,7 +4,7 @@ pub enum Device {
|
||||
Cuda(usize),
|
||||
Trt(usize),
|
||||
CoreML(usize),
|
||||
Cann(usize),
|
||||
// Cann(usize),
|
||||
// Acl(usize),
|
||||
// Rocm(usize),
|
||||
// Rknpu(usize),
|
||||
|
@ -1,5 +1,6 @@
|
||||
use std::ops::Index;
|
||||
|
||||
/// Dynamic Confidences
|
||||
#[derive(Clone, PartialEq, PartialOrd)]
|
||||
pub struct DynConf {
|
||||
confs: Vec<f32>,
|
||||
|
@ -8,6 +8,7 @@ use ort::{
|
||||
|
||||
use crate::{config_dir, Device, MinOptMax, Options, CHECK_MARK, CROSS_MARK, SAFE_CROSS_MARK};
|
||||
|
||||
/// ONNXRuntime Backend
|
||||
#[derive(Debug)]
|
||||
pub struct OrtEngine {
|
||||
session: Session,
|
||||
@ -145,8 +146,7 @@ impl OrtEngine {
|
||||
Device::Cpu(_) => {
|
||||
println!("{CHECK_MARK} Using CPU");
|
||||
ort::CPUExecutionProvider::default().build()
|
||||
}
|
||||
_ => todo!(),
|
||||
} // _ => todo!(),
|
||||
};
|
||||
let session = builder
|
||||
.with_optimization_level(ort::GraphOptimizationLevel::Level3)?
|
||||
|
@ -1,63 +0,0 @@
|
||||
use crate::Point;
|
||||
|
||||
#[derive(PartialEq, Clone)]
|
||||
pub struct Keypoint {
|
||||
pub point: Point,
|
||||
confidence: f32,
|
||||
id: isize,
|
||||
name: Option<String>,
|
||||
}
|
||||
|
||||
impl Default for Keypoint {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
id: -1,
|
||||
confidence: 0.0,
|
||||
point: Point::default(),
|
||||
name: None,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl std::fmt::Debug for Keypoint {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
f.debug_struct("Keypoint")
|
||||
.field("x", &self.point.x)
|
||||
.field("y", &self.point.y)
|
||||
.field("confidence", &self.confidence)
|
||||
.field("id", &self.id)
|
||||
.field("name", &self.name)
|
||||
.finish()
|
||||
}
|
||||
}
|
||||
|
||||
impl Keypoint {
|
||||
pub fn new(point: Point, confidence: f32, id: isize, name: Option<String>) -> Self {
|
||||
Self {
|
||||
point,
|
||||
confidence,
|
||||
id,
|
||||
name,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn x(&self) -> f32 {
|
||||
self.point.x
|
||||
}
|
||||
|
||||
pub fn y(&self) -> f32 {
|
||||
self.point.y
|
||||
}
|
||||
|
||||
pub fn confidence(&self) -> f32 {
|
||||
self.confidence
|
||||
}
|
||||
|
||||
pub fn id(&self) -> isize {
|
||||
self.id
|
||||
}
|
||||
|
||||
pub fn name(&self) -> Option<&String> {
|
||||
self.name.as_ref()
|
||||
}
|
||||
}
|
@ -1,6 +1,7 @@
|
||||
use anyhow::Result;
|
||||
use rand::distributions::{Distribution, WeightedIndex};
|
||||
|
||||
/// Logits Sampler
|
||||
#[derive(Debug)]
|
||||
pub struct LogitsSampler {
|
||||
temperature: f32,
|
||||
|
@ -1,28 +0,0 @@
|
||||
use crate::Polygon;
|
||||
|
||||
#[derive(Default, Clone, PartialEq)]
|
||||
pub struct Mask {
|
||||
pub polygon: Polygon,
|
||||
pub id: usize,
|
||||
pub name: Option<String>,
|
||||
}
|
||||
|
||||
impl std::fmt::Debug for Mask {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
f.debug_struct("Mask")
|
||||
.field("polygons(num_points)", &self.polygon.points.len())
|
||||
.field("id", &self.id)
|
||||
.field("name", &self.name)
|
||||
.finish()
|
||||
}
|
||||
}
|
||||
|
||||
impl Mask {
|
||||
pub fn id(&self) -> usize {
|
||||
self.id
|
||||
}
|
||||
|
||||
pub fn name(&self) -> Option<&String> {
|
||||
self.name.as_ref()
|
||||
}
|
||||
}
|
@ -1,3 +1,4 @@
|
||||
/// A value composed of Min-Opt-Max
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct MinOptMax {
|
||||
pub min: isize,
|
||||
|
@ -1,45 +1,22 @@
|
||||
mod annotator;
|
||||
mod bbox;
|
||||
mod dataloader;
|
||||
mod device;
|
||||
mod dynconf;
|
||||
mod embedding;
|
||||
mod engine;
|
||||
mod keypoint;
|
||||
mod logits_sampler;
|
||||
mod mask;
|
||||
mod metric;
|
||||
mod min_opt_max;
|
||||
pub mod ops;
|
||||
mod options;
|
||||
mod point;
|
||||
mod polygon;
|
||||
mod rect;
|
||||
mod rotated_rect;
|
||||
mod tokenizer_stream;
|
||||
mod utils;
|
||||
mod ys;
|
||||
|
||||
pub use annotator::Annotator;
|
||||
pub use bbox::Bbox;
|
||||
pub use dataloader::DataLoader;
|
||||
pub use device::Device;
|
||||
pub use dynconf::DynConf;
|
||||
pub use embedding::Embedding;
|
||||
pub use engine::OrtEngine;
|
||||
pub use keypoint::Keypoint;
|
||||
pub use logits_sampler::LogitsSampler;
|
||||
pub use mask::Mask;
|
||||
pub use metric::Metric;
|
||||
pub use min_opt_max::MinOptMax;
|
||||
pub use options::Options;
|
||||
pub use point::Point;
|
||||
pub use polygon::Polygon;
|
||||
pub use rect::Rect;
|
||||
pub use rotated_rect::RotatedRect;
|
||||
pub use tokenizer_stream::TokenizerStream;
|
||||
pub use utils::{
|
||||
auto_load, config_dir, download, string_now, COCO_KEYPOINT_NAMES_17, COCO_NAMES_80,
|
||||
COCO_SKELETON_17,
|
||||
};
|
||||
pub use ys::Ys;
|
||||
|
@ -1,7 +1,6 @@
|
||||
use crate::{Mask, Polygon};
|
||||
use anyhow::Result;
|
||||
use image::{DynamicImage, GenericImageView, GrayImage, ImageBuffer};
|
||||
use ndarray::{Array, Axis, Ix2, IxDyn};
|
||||
use image::{DynamicImage, GenericImageView, ImageBuffer};
|
||||
use ndarray::{Array, Axis, IxDyn};
|
||||
|
||||
pub fn standardize(xs: Array<f32, IxDyn>, mean: &[f32], std: &[f32]) -> Array<f32, IxDyn> {
|
||||
let mean = Array::from_shape_vec((1, mean.len(), 1, 1), mean.to_vec()).unwrap();
|
||||
@ -22,18 +21,6 @@ pub fn norm2(xs: &Array<f32, IxDyn>) -> Array<f32, IxDyn> {
|
||||
xs / std_
|
||||
}
|
||||
|
||||
pub fn dot2(query: &Array<f32, IxDyn>, gallery: &Array<f32, IxDyn>) -> Result<Vec<Vec<f32>>> {
|
||||
// (m, ndim) * (n, ndim).t => (m, n)
|
||||
let query = query.to_owned().into_dimensionality::<Ix2>()?;
|
||||
let gallery = gallery.to_owned().into_dimensionality::<Ix2>()?;
|
||||
let matrix = query.dot(&gallery.t());
|
||||
let exps = matrix.mapv(|x| x.exp());
|
||||
let stds = exps.sum_axis(Axis(1));
|
||||
let matrix = exps / stds.insert_axis(Axis(1));
|
||||
let matrix: Vec<Vec<f32>> = matrix.axis_iter(Axis(0)).map(|row| row.to_vec()).collect();
|
||||
Ok(matrix)
|
||||
}
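The `dot2` helper above multiplies an `(m, ndim)` query matrix by the transpose of an `(n, ndim)` gallery matrix and then normalizes each row with a softmax. A rough std-only equivalent, using nested `Vec`s instead of `ndarray`, might look like the sketch below. `softmax_similarity` is a hypothetical name, and subtracting the row maximum before exponentiating is an added numerical-stability step that is not in the original.

```rust
/// Hypothetical helper: row-wise softmax over query-gallery dot products.
/// Returns an (m, n) matrix whose rows each sum to 1.
fn softmax_similarity(query: &[Vec<f32>], gallery: &[Vec<f32>]) -> Vec<Vec<f32>> {
    query
        .iter()
        .map(|q| {
            // Dot product of this query row with every gallery row.
            let logits: Vec<f32> = gallery
                .iter()
                .map(|g| q.iter().zip(g).map(|(a, b)| a * b).sum::<f32>())
                .collect();
            // Subtract the row maximum so exp() cannot overflow (assumption:
            // the original exponentiates the raw values).
            let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
            let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
            let total: f32 = exps.iter().sum();
            exps.iter().map(|e| e / total).collect()
        })
        .collect()
}
```

Shifting by the row maximum leaves the softmax result unchanged mathematically, so the output should agree with the unshifted version wherever the latter does not overflow.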
|
||||
|
||||
pub fn scale_wh(w0: f32, h0: f32, w1: f32, h1: f32) -> (f32, f32, f32) {
|
||||
let r = (w1 / w0).min(h1 / h0);
|
||||
(r, (w0 * r).round(), (h0 * r).round())
|
||||
@ -61,6 +48,7 @@ pub fn letterbox(
|
||||
width: u32,
|
||||
bg: f32,
|
||||
) -> Result<Array<f32, IxDyn>> {
|
||||
// TODO: refactor
|
||||
let mut ys = Array::ones((xs.len(), 3, height as usize, width as usize)).into_dyn();
|
||||
ys.fill(bg);
|
||||
for (idx, x) in xs.iter().enumerate() {
|
||||
@ -121,26 +109,3 @@ pub fn descale_mask(mask: DynamicImage, w0: f32, h0: f32, w1: f32, h1: f32) -> D
|
||||
let mask = mask.crop(0, 0, w as u32, h as u32);
|
||||
mask.resize_exact(w1 as u32, h1 as u32, image::imageops::FilterType::Triangle)
|
||||
}
|
||||
|
||||
pub fn get_masks_from_image(
|
||||
mask: GrayImage,
|
||||
thresh: u8,
|
||||
id: usize,
|
||||
name: Option<String>,
|
||||
) -> Vec<Mask> {
|
||||
// let mask = mask.into_luma8();
|
||||
let contours: Vec<imageproc::contours::Contour<i32>> =
|
||||
imageproc::contours::find_contours_with_threshold(&mask, thresh);
|
||||
let mut masks: Vec<Mask> = Vec::new();
|
||||
contours.iter().for_each(|contour| {
|
||||
// contour.border_type == imageproc::contours::BorderType::Outer &&
|
||||
if contour.points.len() > 2 {
|
||||
masks.push(Mask {
|
||||
polygon: Polygon::from_contour(contour),
|
||||
id,
|
||||
name: name.to_owned(),
|
||||
});
|
||||
}
|
||||
});
|
||||
masks
|
||||
}
|
||||
|
@ -1,5 +1,6 @@
|
||||
use crate::{auto_load, Device, MinOptMax};
|
||||
use crate::{auto_load, models::YOLOTask, Device, MinOptMax};
|
||||
|
||||
/// Options for building models
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct Options {
|
||||
pub onnx_path: String,
|
||||
@ -47,11 +48,15 @@ pub struct Options {
|
||||
pub tokenizer: Option<String>,
|
||||
pub vocab: Option<String>,
|
||||
pub names: Option<Vec<String>>, // names
|
||||
pub names2: Option<Vec<String>>, // names2, could be keypoints names
|
||||
pub anchors_first: bool, // output format: [bs, anchors/na, pos+nc+nm]
|
||||
pub names2: Option<Vec<String>>, // names2: could be keypoints names
|
||||
pub names3: Option<Vec<String>>, // names3
|
||||
pub min_width: Option<f32>,
|
||||
pub min_height: Option<f32>,
|
||||
pub unclip_ratio: f32, // DB
|
||||
pub yolo_task: Option<YOLOTask>,
|
||||
pub anchors_first: bool, // yolo model output format like: [batch_size, anchors, xywh_clss_xxx]
|
||||
pub conf_independent: bool, // xywh_conf_clss
|
||||
pub apply_probs_softmax: bool,
|
||||
}
|
||||
|
||||
impl Default for Options {
|
||||
@ -99,10 +104,14 @@ impl Default for Options {
|
||||
vocab: None,
|
||||
names: None,
|
||||
names2: None,
|
||||
anchors_first: false,
|
||||
names3: None,
|
||||
min_width: None,
|
||||
min_height: None,
|
||||
unclip_ratio: 1.5,
|
||||
yolo_task: None,
|
||||
anchors_first: false,
|
||||
conf_independent: false,
|
||||
apply_probs_softmax: false,
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -143,6 +152,21 @@ impl Options {
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_yolo_task(mut self, x: YOLOTask) -> Self {
|
||||
self.yolo_task = Some(x);
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_conf_independent(mut self, x: bool) -> Self {
|
||||
self.conf_independent = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn apply_probs_softmax(mut self, x: bool) -> Self {
|
||||
self.apply_probs_softmax = x;
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_profile(mut self, profile: bool) -> Self {
|
||||
self.profile = profile;
|
||||
self
|
||||
@ -158,6 +182,11 @@ impl Options {
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_names3(mut self, names: &[&str]) -> Self {
|
||||
self.names3 = Some(names.iter().map(|x| x.to_string()).collect::<Vec<String>>());
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_vocab(mut self, vocab: &str) -> Self {
|
||||
self.vocab = Some(auto_load(vocab).unwrap());
|
||||
self
|
||||
@ -183,8 +212,8 @@ impl Options {
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_anchors_first(mut self) -> Self {
|
||||
self.anchors_first = true;
|
||||
pub fn with_anchors_first(mut self, x: bool) -> Self {
|
||||
self.anchors_first = x;
|
||||
self
|
||||
}
|
||||
|
||||
|
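The `with_*` methods in this hunk follow a consuming builder pattern: each takes `self` by value, sets one field, and returns `Self`, so calls can be chained. A minimal sketch of that pattern follows. `MiniOptions`, `Task`, and `obb_options` are hypothetical stand-ins (the real `Options` has many more fields, and the variants of `YOLOTask` are not shown in this diff).

```rust
/// Hypothetical stand-in for `YOLOTask`; the real variants are not shown in this hunk.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Task {
    Detect,
    Obb,
}

/// Hypothetical, trimmed-down stand-in for `Options`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct MiniOptions {
    pub yolo_task: Option<Task>,
    pub anchors_first: bool,
    pub conf_independent: bool,
    pub names3: Option<Vec<String>>,
}

impl MiniOptions {
    /// Each setter consumes `self` and returns it, which enables chaining.
    pub fn with_yolo_task(mut self, x: Task) -> Self {
        self.yolo_task = Some(x);
        self
    }

    /// Takes an explicit bool, mirroring the new `with_anchors_first(x: bool)` signature.
    pub fn with_anchors_first(mut self, x: bool) -> Self {
        self.anchors_first = x;
        self
    }

    pub fn with_conf_independent(mut self, x: bool) -> Self {
        self.conf_independent = x;
        self
    }

    pub fn with_names3(mut self, names: &[&str]) -> Self {
        self.names3 = Some(names.iter().map(|x| x.to_string()).collect());
        self
    }
}

/// Example chain; the class names are made up for illustration.
pub fn obb_options() -> MiniOptions {
    MiniOptions::default()
        .with_yolo_task(Task::Obb)
        .with_anchors_first(true)
        .with_names3(&["plane", "ship"])
}
```

Fields that are never set keep their `Default` values, which is why `conf_independent` stays `false` in the chain above.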
@ -1,194 +0,0 @@
|
||||
use std::ops::{Add, Div, Mul, Sub};
|
||||
|
||||
#[derive(Default, Debug, PartialOrd, PartialEq, Clone, Copy)]
|
||||
pub struct Point {
|
||||
pub x: f32,
|
||||
pub y: f32,
|
||||
}
|
||||
|
||||
impl Add for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn add(self, other: Self) -> Self::Output {
|
||||
Self {
|
||||
x: self.x + other.x,
|
||||
y: self.y + other.y,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Add<f32> for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn add(self, other: f32) -> Self::Output {
|
||||
Self {
|
||||
x: self.x + other,
|
||||
y: self.y + other,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Sub for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn sub(self, other: Self) -> Self::Output {
|
||||
Self {
|
||||
x: self.x - other.x,
|
||||
y: self.y - other.y,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Sub<f32> for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn sub(self, other: f32) -> Self::Output {
|
||||
Self {
|
||||
x: self.x - other,
|
||||
y: self.y - other,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Mul<f32> for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn mul(self, other: f32) -> Self::Output {
|
||||
Self {
|
||||
x: self.x * other,
|
||||
y: self.y * other,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Mul for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn mul(self, other: Self) -> Self::Output {
|
||||
Self {
|
||||
x: self.x * other.x,
|
||||
y: self.y * other.y,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Div for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn div(self, other: Self) -> Self::Output {
|
||||
Self {
|
||||
x: self.x / other.x,
|
||||
y: self.y / other.y,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Div<f32> for Point {
|
||||
type Output = Self;
|
||||
|
||||
fn div(self, other: f32) -> Self::Output {
|
||||
Self {
|
||||
x: self.x / other,
|
||||
y: self.y / other,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl From<(f32, f32)> for Point {
|
||||
fn from((x, y): (f32, f32)) -> Self {
|
||||
Self { x, y }
|
||||
}
|
||||
}
|
||||
|
||||
impl From<Point> for (f32, f32) {
|
||||
fn from(Point { x, y }: Point) -> Self {
|
||||
(x, y)
|
||||
}
|
||||
}
|
||||
|
||||
impl From<[f32; 2]> for Point {
|
||||
fn from([x, y]: [f32; 2]) -> Self {
|
||||
Self { x, y }
|
||||
}
|
||||
}
|
||||
|
||||
impl From<Point> for [f32; 2] {
|
||||
fn from(Point { x, y }: Point) -> Self {
|
||||
[x, y]
|
||||
}
|
||||
}
|
||||
|
||||
impl Point {
|
||||
pub fn new(x: f32, y: f32) -> Self {
|
||||
Self { x, y }
|
||||
}
|
||||
|
||||
pub fn coord(&self) -> [f32; 2] {
|
||||
[self.x, self.y]
|
||||
}
|
||||
|
||||
pub fn is_origin(&self) -> bool {
|
||||
self.x == 0.0_f32 && self.y == 0.0_f32
|
||||
}
|
||||
|
||||
pub fn distance_from(&self, other: &Point) -> f32 {
|
||||
((self.x - other.x).powf(2.0) + (self.y - other.y).powf(2.0)).sqrt()
|
||||
}
|
||||
|
||||
pub fn distance_from_origin(&self) -> f32 {
|
||||
(self.x.powf(2.0) + self.y.powf(2.0)).sqrt()
|
||||
}
|
||||
|
||||
pub fn sum(&self) -> f32 {
|
||||
self.x + self.y
|
||||
}
|
||||
|
||||
pub fn perpendicular_distance(&self, start: &Point, end: &Point) -> f32 {
|
||||
let numerator = ((end.y - start.y) * self.x - (end.x - start.x) * self.y + end.x * start.y
|
||||
- end.y * start.x)
|
||||
.abs();
|
||||
let denominator = ((end.y - start.y).powi(2) + (end.x - start.x).powi(2)).sqrt();
|
||||
numerator / denominator
|
||||
}
|
||||
|
||||
pub fn cross(&self, other: &Point) -> f32 {
|
||||
self.x * other.y - self.y * other.x
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests_points {
|
||||
use super::Point;
|
||||
|
||||
#[test]
|
||||
fn new() {
|
||||
let origin1 = Point::from((0.0f32, 0.0f32));
|
||||
let origin2 = Point::from([0.0f32, 0.0f32]);
|
||||
let origin3 = (0.0f32, 0.0f32).into();
|
||||
let origin4 = [0.0f32, 0.0f32].into();
|
||||
let origin5 = Point::new(1.0f32, 2.0f32);
|
||||
let origin6 = Point {
|
||||
x: 1.0f32,
|
||||
y: 2.0f32,
|
||||
};
|
||||
assert_eq!(origin1, origin2);
|
||||
assert_eq!(origin2, origin3);
|
||||
assert_eq!(origin3, origin4);
|
||||
assert_eq!(origin5, origin6);
|
||||
assert!(origin1.is_origin());
|
||||
assert!(origin2.is_origin());
|
||||
assert!(origin3.is_origin());
|
||||
assert!(origin4.is_origin());
|
||||
assert!(!origin5.is_origin());
|
||||
assert!(!origin6.is_origin());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn into_tuple_array() {
|
||||
let point = Point::from((1.0, 2.0));
|
||||
let tuple: (f32, f32) = point.into();
|
||||
let array: [f32; 2] = point.into();
|
||||
assert_eq!(tuple, (1.0, 2.0));
|
||||
assert_eq!(array, [1.0, 2.0]);
|
||||
}
|
||||
}
|
@ -1,239 +0,0 @@
|
||||
use crate::{Point, Rect};
|
||||
|
||||
#[derive(Default, Debug, Clone, PartialEq)]
|
||||
pub struct Polygon {
|
||||
pub points: Vec<Point>,
|
||||
}
|
||||
|
||||
impl From<Vec<Point>> for Polygon {
|
||||
fn from(points: Vec<Point>) -> Self {
|
||||
Self { points }
|
||||
}
|
||||
}
|
||||
|
||||
impl Polygon {
|
||||
pub fn new() -> Self {
|
||||
Self::default()
|
||||
}
|
||||
|
||||
pub fn from_contour(contour: &imageproc::contours::Contour<i32>) -> Self {
|
||||
let points = contour
|
||||
.points
|
||||
.iter()
|
||||
.map(|p| Point::new(p.x as f32, p.y as f32))
|
||||
.collect::<Vec<_>>();
|
||||
Self { points }
|
||||
}
|
||||
|
||||
pub fn to_imageproc_points(&self) -> Vec<imageproc::point::Point<i32>> {
|
||||
self.points
|
||||
.iter()
|
||||
.map(|p| imageproc::point::Point::new(p.x as i32, p.y as i32))
|
||||
.collect::<Vec<_>>()
|
||||
}
|
||||
|
||||
pub fn from_imageproc_points(points: &[imageproc::point::Point<i32>]) -> Self {
|
||||
let points = points
|
||||
.iter()
|
||||
.map(|p| Point::new(p.x as f32, p.y as f32))
|
||||
.collect::<Vec<_>>();
|
||||
Self { points }
|
||||
}
|
||||
|
||||
pub fn with_points(mut self, points: &[Point]) -> Self {
|
||||
self.points = points.to_vec();
|
||||
self
|
||||
}
|
||||
|
||||
pub fn area(&self) -> f32 {
|
||||
// make sure points are already sorted
|
||||
let mut area = 0.0;
|
||||
let n = self.points.len();
|
||||
for i in 0..n {
|
||||
let j = (i + 1) % n;
|
||||
area += self.points[i].x * self.points[j].y;
|
||||
area -= self.points[j].x * self.points[i].y;
|
||||
}
|
||||
area.abs() / 2.0
|
||||
}
|
||||
|
||||
pub fn center(&self) -> Point {
|
||||
let rect = self.find_min_rect();
|
||||
rect.center()
|
||||
}
|
||||
|
||||
pub fn find_min_rect(&self) -> Rect {
|
||||
let (mut min_x, mut min_y, mut max_x, mut max_y) = (f32::MAX, f32::MAX, f32::MIN, f32::MIN);
|
||||
for point in self.points.iter() {
|
||||
if point.x <= min_x {
|
||||
min_x = point.x
|
||||
}
|
||||
if point.x > max_x {
|
||||
max_x = point.x
|
||||
}
|
||||
if point.y <= min_y {
|
||||
min_y = point.y
|
||||
}
|
||||
if point.y > max_y {
|
||||
max_y = point.y
|
||||
}
|
||||
}
|
||||
((min_x - 1.0, min_y - 1.0), (max_x + 1.0, max_y + 1.0)).into()
|
||||
}
|
||||
|
||||
pub fn perimeter(&self) -> f32 {
|
||||
let mut perimeter = 0.0;
|
||||
let n = self.points.len();
|
||||
for i in 0..n {
|
||||
let j = (i + 1) % n;
|
||||
perimeter += self.points[i].distance_from(&self.points[j]);
|
||||
}
|
||||
perimeter
|
||||
}
|
||||
|
||||
pub fn offset(&self, delta: f32, width: f32, height: f32) -> Self {
|
||||
let num_points = self.points.len();
|
||||
let mut new_points = Vec::with_capacity(self.points.len());
|
||||
for i in 0..num_points {
|
||||
let prev_idx = if i == 0 { num_points - 1 } else { i - 1 };
|
||||
let next_idx = (i + 1) % num_points;
|
||||
|
||||
let edge_vector = Point {
|
||||
x: self.points[next_idx].x - self.points[prev_idx].x,
|
||||
y: self.points[next_idx].y - self.points[prev_idx].y,
|
||||
};
|
||||
let normal_vector = Point {
|
||||
x: -edge_vector.y,
|
||||
y: edge_vector.x,
|
||||
};
|
||||
|
||||
let normal_length = (normal_vector.x.powi(2) + normal_vector.y.powi(2)).sqrt();
|
||||
if normal_length.abs() < 1e-6 {
|
||||
new_points.push(self.points[i]);
|
||||
} else {
|
||||
let normalized_normal = Point {
|
||||
x: normal_vector.x / normal_length,
|
||||
y: normal_vector.y / normal_length,
|
||||
};
|
||||
|
||||
let new_x = self.points[i].x + normalized_normal.x * delta;
|
||||
let new_y = self.points[i].y + normalized_normal.y * delta;
|
||||
let new_x = new_x.max(0.0).min(width);
|
||||
let new_y = new_y.max(0.0).min(height);
|
||||
new_points.push(Point { x: new_x, y: new_y });
|
||||
}
|
||||
}
|
||||
Self { points: new_points }
|
||||
}
|
||||
|
||||
pub fn resample(&self, num_samples: usize) -> Polygon {
|
||||
let mut points = Vec::new();
|
||||
for i in 0..self.points.len() {
|
||||
let start_point = self.points[i];
|
||||
let end_point = self.points[(i + 1) % self.points.len()];
|
||||
points.push(start_point);
|
||||
let dx = end_point.x - start_point.x;
|
||||
let dy = end_point.y - start_point.y;
|
||||
for j in 1..num_samples {
|
||||
let t = (j as f32) / (num_samples as f32);
|
||||
let new_x = start_point.x + t * dx;
|
||||
let new_y = start_point.y + t * dy;
|
||||
points.push(Point { x: new_x, y: new_y });
|
||||
}
|
||||
}
|
||||
Self { points }
|
||||
}
|
||||
|
||||
pub fn simplify(&self, epsilon: f32) -> Self {
|
||||
let mask = self.rdp_iter(epsilon);
|
||||
let points = self
|
||||
.points
|
||||
.iter()
|
||||
.enumerate()
|
||||
.filter_map(|(i, &point)| if mask[i] { Some(point) } else { None })
|
||||
.collect();
|
||||
Self { points }
|
||||
}
|
||||
|
||||
#[allow(clippy::needless_range_loop)]
|
||||
fn rdp_iter(&self, epsilon: f32) -> Vec<bool> {
|
||||
let mut stk = Vec::new();
|
||||
let mut indices = vec![true; self.points.len()];
|
||||
stk.push((0, self.points.len() - 1));
|
||||
while let Some((start_index, last_index)) = stk.pop() {
|
||||
let mut dmax = 0.0;
|
||||
let mut index = start_index;
|
||||
for i in (start_index + 1)..last_index {
|
||||
let d = self.points[i]
|
||||
.perpendicular_distance(&self.points[start_index], &self.points[last_index]);
|
||||
if d > dmax {
|
||||
index = i;
|
||||
dmax = d;
|
||||
}
|
||||
}
|
||||
|
||||
if dmax > epsilon {
|
||||
stk.push((start_index, index));
|
||||
stk.push((index, last_index));
|
||||
} else {
|
||||
for j in (start_index + 1)..last_index {
|
||||
indices[j] = false;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
indices
|
||||
}
|
||||
|
||||
pub fn convex_hull(&self) -> Self {
|
||||
let mut points = self.points.clone();
|
||||
points.sort_by(|a, b| {
|
||||
a.x.partial_cmp(&b.x)
|
||||
.unwrap()
|
||||
.then(a.y.partial_cmp(&b.y).unwrap())
|
||||
});
|
||||
let mut hull: Vec<Point> = Vec::new();
|
||||
|
||||
// Lower hull
|
||||
for &point in &points {
|
||||
while hull.len() >= 2 {
|
||||
let last = hull.len() - 1;
|
||||
let second_last = hull.len() - 2;
|
||||
let vec_a = hull[last] - hull[second_last];
|
||||
let vec_b = point - hull[second_last];
|
||||
|
||||
if vec_a.cross(&vec_b) <= 0.0 {
|
||||
hull.pop();
|
||||
} else {
|
||||
break;
|
||||
}
|
||||
}
|
||||
hull.push(point);
|
||||
}
|
||||
|
||||
// Upper hull
|
||||
let lower_hull_size = hull.len();
|
||||
for &point in points.iter().rev().skip(1) {
|
||||
while hull.len() > lower_hull_size {
|
||||
let last = hull.len() - 1;
|
||||
let second_last = hull.len() - 2;
|
||||
let vec_a: Point = hull[last] - hull[second_last];
|
||||
let vec_b = point - hull[second_last];
|
||||
|
||||
if vec_a.cross(&vec_b) <= 0.0 {
|
||||
hull.pop();
|
||||
} else {
|
||||
break;
|
||||
}
|
||||
}
|
||||
hull.push(point);
|
||||
}
|
||||
|
||||
// Remove duplicate points
|
||||
hull.dedup();
|
||||
if hull.len() > 1 && hull.first() == hull.last() {
|
||||
hull.pop();
|
||||
}
|
||||
|
||||
Self { points: hull }
|
||||
}
|
||||
}
|
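The removed `Polygon::area` and `Polygon::perimeter` both walk the vertex list cyclically, pairing each vertex `i` with `(i + 1) % n`. A minimal standalone sketch of the same two computations, using plain `(f32, f32)` tuples instead of the crate's `Point` type (the function names here are illustrative, not part of the library):

```rust
/// Shoelace area of a closed polygon given as ordered (x, y) vertices.
/// The vertices must be in boundary order (clockwise or counter-clockwise).
pub fn polygon_area(points: &[(f32, f32)]) -> f32 {
    let n = points.len();
    let mut twice_area = 0.0f32;
    for i in 0..n {
        let j = (i + 1) % n; // wrap around to close the polygon
        twice_area += points[i].0 * points[j].1 - points[j].0 * points[i].1;
    }
    twice_area.abs() / 2.0
}

/// Sum of the edge lengths, including the closing edge from last to first vertex.
pub fn polygon_perimeter(points: &[(f32, f32)]) -> f32 {
    let n = points.len();
    (0..n)
        .map(|i| {
            let (x0, y0) = points[i];
            let (x1, y1) = points[(i + 1) % n];
            ((x1 - x0).powi(2) + (y1 - y0).powi(2)).sqrt()
        })
        .sum()
}
```

For a 2 x 2 axis-aligned square this gives an area of 4 and a perimeter of 8; an empty slice yields 0 for both.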
206
src/core/rect.rs
@ -1,206 +0,0 @@
|
||||
use crate::Point;
|
||||
|
||||
#[derive(Default, PartialOrd, PartialEq, Clone, Copy)]
|
||||
pub struct Rect {
|
||||
top_left: Point,
|
||||
bottom_right: Point,
|
||||
}
|
||||
|
||||
impl std::fmt::Debug for Rect {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
f.debug_struct("Rectangle")
|
||||
.field("xmin", &self.xmin())
|
||||
.field("ymin", &self.ymin())
|
||||
.field("xmax", &self.xmax())
|
||||
.field("ymax", &self.ymax())
|
||||
.finish()
|
||||
}
|
||||
}
|
||||
|
||||
impl<P: Into<Point>> From<(P, P)> for Rect {
|
||||
fn from((top_left, bottom_right): (P, P)) -> Self {
|
||||
Self {
|
||||
top_left: top_left.into(),
|
||||
bottom_right: bottom_right.into(),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl<P: Into<Point>> From<[P; 2]> for Rect {
|
||||
fn from([top_left, bottom_right]: [P; 2]) -> Self {
|
||||
Self {
|
||||
top_left: top_left.into(),
|
||||
bottom_right: bottom_right.into(),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Rect {
|
||||
pub fn new(top_left: Point, bottom_right: Point) -> Self {
|
||||
Self {
|
||||
top_left,
|
||||
bottom_right,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn from_xywh(x: f32, y: f32, w: f32, h: f32) -> Self {
|
||||
Self {
|
||||
top_left: Point::new(x, y),
|
||||
bottom_right: Point::new(x + w, y + h),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn from_xyxy(x1: f32, y1: f32, x2: f32, y2: f32) -> Self {
|
||||
Self {
|
||||
top_left: Point::new(x1, y1),
|
||||
bottom_right: Point::new(x2, y2),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn from_cxywh(cx: f32, cy: f32, w: f32, h: f32) -> Self {
|
||||
Self {
|
||||
top_left: Point::new(cx - w / 2.0, cy - h / 2.0),
|
||||
bottom_right: Point::new(cx + w / 2.0, cy + h / 2.0),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn width(&self) -> f32 {
|
||||
(self.bottom_right - self.top_left).x
|
||||
}
|
||||
|
||||
pub fn height(&self) -> f32 {
|
||||
(self.bottom_right - self.top_left).y
|
||||
}
|
||||
|
||||
pub fn xmin(&self) -> f32 {
|
||||
self.top_left.x
|
||||
}
|
||||
|
||||
pub fn ymin(&self) -> f32 {
|
||||
self.top_left.y
|
||||
}
|
||||
|
||||
pub fn xmax(&self) -> f32 {
|
||||
self.bottom_right.x
|
||||
}
|
||||
|
||||
pub fn ymax(&self) -> f32 {
|
||||
self.bottom_right.y
|
||||
}
|
||||
|
||||
pub fn cx(&self) -> f32 {
|
||||
(self.bottom_right.x + self.top_left.x) / 2.0
|
||||
}
|
||||
|
||||
pub fn cy(&self) -> f32 {
|
||||
(self.bottom_right.y + self.top_left.y) / 2.0
|
||||
}
|
||||
|
||||
pub fn tl(&self) -> Point {
|
||||
self.top_left
|
||||
}
|
||||
|
||||
pub fn br(&self) -> Point {
|
||||
self.bottom_right
|
||||
}
|
||||
|
||||
pub fn tr(&self) -> Point {
|
||||
Point::new(self.bottom_right.x, self.top_left.y)
|
||||
}
|
||||
|
||||
pub fn bl(&self) -> Point {
|
||||
Point::new(self.top_left.x, self.bottom_right.y)
|
||||
}
|
||||
|
||||
pub fn center(&self) -> Point {
|
||||
(self.bottom_right + self.top_left) / 2.0
|
||||
}
|
||||
|
||||
pub fn area(&self) -> f32 {
|
||||
self.height() * self.width()
|
||||
}
|
||||
|
||||
pub fn perimeter(&self) -> f32 {
|
||||
(self.height() + self.width()) * 2.0
|
||||
}
|
||||
|
||||
pub fn is_empty(&self) -> bool {
|
||||
self.area() == 0.0
|
||||
}
|
||||
|
||||
pub fn is_squre(&self) -> bool {
|
||||
self.width() == self.height()
|
||||
}
|
||||
|
||||
pub fn intersect(&self, other: &Rect) -> f32 {
|
||||
let l = self.xmin().max(other.xmin());
|
||||
let r = (self.xmin() + self.width()).min(other.xmin() + other.width());
|
||||
let t = self.ymin().max(other.ymin());
|
||||
let b = (self.ymin() + self.height()).min(other.ymin() + other.height());
|
||||
(r - l).max(0.) * (b - t).max(0.)
|
||||
}
|
||||
|
||||
pub fn union(&self, other: &Rect) -> f32 {
|
||||
self.area() + other.area() - self.intersect(other)
|
||||
}
|
||||
|
||||
pub fn iou(&self, other: &Rect) -> f32 {
|
||||
self.intersect(other) / self.union(other)
|
||||
}
|
||||
|
||||
pub fn contains(&self, other: &Rect) -> bool {
|
||||
self.xmin() <= other.xmin()
|
||||
&& self.xmax() >= other.xmax()
|
||||
&& self.ymin() <= other.ymin()
|
||||
&& self.ymax() >= other.ymax()
|
||||
}
|
||||
|
||||
pub fn expand(&mut self, x: f32, y: f32, max_x: f32, max_y: f32) -> Self {
|
||||
Self::from_xyxy(
|
||||
(self.xmin() - x).max(0.0f32).min(max_x),
|
||||
(self.ymin() - y).max(0.0f32).min(max_y),
|
||||
(self.xmax() + x).max(0.0f32).min(max_x),
|
||||
(self.ymax() + y).max(0.0f32).min(max_y),
|
||||
)
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::Rect;
|
||||
use crate::Point;
|
||||
|
||||
#[test]
|
||||
fn new() {
|
||||
let rect1 = Rect {
|
||||
top_left: Point {
|
||||
x: 0.0f32,
|
||||
y: 0.0f32,
|
||||
},
|
||||
bottom_right: Point {
|
||||
x: 5.0f32,
|
||||
y: 5.0f32,
|
||||
},
|
||||
};
|
||||
let rect2 = Rect {
|
||||
top_left: (0.0f32, 0.0f32).into(),
|
||||
bottom_right: [5.0f32, 5.0f32].into(),
|
||||
};
|
||||
let rect3 = Rect::new([0.0, 0.0].into(), [5.0, 5.0].into());
|
||||
let rect4: Rect = ((0.0, 0.0), (5.0, 5.0)).into();
|
||||
let rect5: Rect = [(0.0, 0.0), (5.0, 5.0)].into();
|
||||
let rect6: Rect = ([0.0, 0.0], [5.0, 5.0]).into();
|
||||
let rect7: Rect = Rect::from(([0.0, 0.0], [5.0, 5.0]));
|
||||
let rect8: Rect = Rect::from([[0.0, 0.0], [5.0, 5.0]]);
|
||||
let rect9: Rect = Rect::from([(0.0, 0.0), (5.0, 5.0)]);
|
||||
let rect10: Rect = Rect::from_xyxy(0.0, 0.0, 5.0, 5.0);
|
||||
let rect11: Rect = Rect::from_xywh(0.0, 0.0, 5.0, 5.0);
|
||||
|
||||
assert_eq!(rect1, rect2);
|
||||
assert_eq!(rect3, rect4);
|
||||
assert_eq!(rect5, rect6);
|
||||
assert_eq!(rect7, rect8);
|
||||
assert_eq!(rect9, rect8);
|
||||
assert_eq!(rect10, rect11);
|
||||
}
|
||||
}
|
@ -1,155 +0,0 @@
|
||||
use crate::Point;
|
||||
|
||||
#[derive(Default, PartialOrd, PartialEq, Clone, Copy)]
|
||||
pub struct RotatedRect {
|
||||
center: Point,
|
||||
width: f32,
|
||||
height: f32,
|
||||
rotation: f32, // in radians; expected range 0 to π/2 (0 to 90 degrees)
|
||||
}
|
||||
|
||||
impl std::fmt::Debug for RotatedRect {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
f.debug_struct("RotatedRectangle")
|
||||
.field("height", &self.height)
|
||||
.field("width", &self.width)
|
||||
.field("center", &self.center)
|
||||
.field("rotation", &self.rotation)
|
||||
.field("vertices", &self.vertices())
|
||||
.finish()
|
||||
}
|
||||
}
|
||||
|
||||
impl RotatedRect {
|
||||
pub fn new(center: Point, width: f32, height: f32, rotation: f32) -> Self {
|
||||
Self {
|
||||
center,
|
||||
width,
|
||||
height,
|
||||
rotation,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn vertices(&self) -> [Point; 4] {
|
||||
// [cos -sin]
|
||||
// [sin cos]
|
||||
let m = [
|
||||
[
|
||||
self.rotation.cos() * 0.5 * self.width,
|
||||
-self.rotation.sin() * 0.5 * self.height,
|
||||
],
|
||||
[
|
||||
self.rotation.sin() * 0.5 * self.width,
|
||||
self.rotation.cos() * 0.5 * self.height,
|
||||
],
|
||||
];
|
||||
let v1 = self.center + Point::new(m[0][0] + m[0][1], m[1][0] + m[1][1]);
|
||||
let v2 = self.center + Point::new(m[0][0] - m[0][1], m[1][0] - m[1][1]);
|
||||
let v3 = self.center * 2.0 - v1;
|
||||
let v4 = self.center * 2.0 - v2;
|
||||
[v1, v2, v3, v4]
|
||||
}
|
||||
|
||||
pub fn height(&self) -> f32 {
|
||||
self.height
|
||||
}
|
||||
|
||||
pub fn width(&self) -> f32 {
|
||||
self.width
|
||||
}
|
||||
|
||||
pub fn center(&self) -> Point {
|
||||
self.center
|
||||
}
|
||||
|
||||
pub fn area(&self) -> f32 {
|
||||
self.height * self.width
|
||||
}
|
||||
|
||||
// pub fn contain_point(&self, point: Point) -> bool {
|
||||
// // ray casting
|
||||
// todo!()
|
||||
// }
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test1() {
|
||||
let pi = std::f32::consts::PI;
|
||||
let rt = RotatedRect::new(
|
||||
Point::new(0.0f32, 0.0f32),
|
||||
2.0f32,
|
||||
4.0f32,
|
||||
pi / 180.0 * 90.0,
|
||||
);
|
||||
|
||||
assert_eq!(
|
||||
rt.vertices(),
|
||||
[
|
||||
Point {
|
||||
x: -2.0,
|
||||
y: 0.99999994,
|
||||
},
|
||||
Point {
|
||||
x: 2.0,
|
||||
y: 1.0000001,
|
||||
},
|
||||
Point {
|
||||
x: 2.0,
|
||||
y: -0.99999994,
|
||||
},
|
||||
Point {
|
||||
x: -2.0,
|
||||
y: -1.0000001,
|
||||
},
|
||||
]
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test2() {
|
||||
let pi = std::f32::consts::PI;
|
||||
let rt = RotatedRect::new(
|
||||
Point::new(0.0f32, 0.0f32),
|
||||
2.0f32.sqrt(),
|
||||
2.0f32.sqrt(),
|
||||
pi / 180.0 * 45.0,
|
||||
);
|
||||
|
||||
assert_eq!(
|
||||
rt.vertices(),
|
||||
[
|
||||
Point {
|
||||
x: 0.0,
|
||||
y: 0.99999994
|
||||
},
|
||||
Point {
|
||||
x: 0.99999994,
|
||||
y: 0.0
|
||||
},
|
||||
Point {
|
||||
x: 0.0,
|
||||
y: -0.99999994
|
||||
},
|
||||
Point {
|
||||
x: -0.99999994,
|
||||
y: 0.0
|
||||
},
|
||||
]
|
||||
);
|
||||
}
|
||||
|
||||
// #[test]
|
||||
// fn contain_point() {
|
||||
// let pi = std::f32::consts::PI;
|
||||
// let rt = RotatedRect::new(
|
||||
// Point::new(0.0f32, 0.0f32),
|
||||
// 1.0f32.sqrt(),
|
||||
// 1.0f32.sqrt(),
|
||||
// pi / 180.0 * 45.0,
|
||||
// );
|
||||
|
||||
// assert!(rt.contain_point(Point::new(0.0, 0.0)));
|
||||
// assert!(rt.contain_point(Point::new(0.5, 0.0)));
|
||||
// assert!(rt.contain_point(Point::new(0.0, 0.5)));
|
||||
|
||||
// }
|
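The removed `RotatedRect::vertices` builds the four corners from the rotation matrix `[cos -sin; sin cos]` applied to the half-width and half-height, then mirrors two corners through the center to get the other two. A minimal standalone sketch of that computation, using tuples rather than the crate's `Point` (the free function `rotated_rect_vertices` is an illustrative name, not library API):

```rust
/// Corners of a rectangle centered at (cx, cy), rotated by `rotation` radians.
pub fn rotated_rect_vertices(
    cx: f32,
    cy: f32,
    width: f32,
    height: f32,
    rotation: f32,
) -> [(f32, f32); 4] {
    let (sin, cos) = rotation.sin_cos();
    // Half-extent vectors along the rotated width axis and height axis.
    let (wx, wy) = (cos * 0.5 * width, sin * 0.5 * width);
    let (hx, hy) = (-sin * 0.5 * height, cos * 0.5 * height);
    let v1 = (cx + wx + hx, cy + wy + hy);
    let v2 = (cx + wx - hx, cy + wy - hy);
    // The remaining two corners are v1 and v2 mirrored through the center.
    let v3 = (2.0 * cx - v1.0, 2.0 * cy - v1.1);
    let v4 = (2.0 * cx - v2.0, 2.0 * cy - v2.1);
    [v1, v2, v3, v4]
}
```

Because `sin` and `cos` are evaluated in `f32`, corner coordinates for non-trivial angles are only approximately exact, so comparing them with a tolerance is safer than the exact equality used in the removed tests.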
@ -1,79 +0,0 @@
|
||||
use crate::{Bbox, Embedding, Keypoint, Mask};
|
||||
|
||||
#[derive(Clone, PartialEq, Default)]
|
||||
pub struct Ys {
|
||||
// Results for each frame
|
||||
pub probs: Option<Embedding>,
|
||||
pub bboxes: Option<Vec<Bbox>>,
|
||||
pub keypoints: Option<Vec<Vec<Keypoint>>>,
|
||||
pub masks: Option<Vec<Mask>>,
|
||||
}
|
||||
|
||||
impl std::fmt::Debug for Ys {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
f.debug_struct("Results")
|
||||
.field("Probabilities", &self.probs)
|
||||
.field("BoundingBoxes", &self.bboxes)
|
||||
.field("Keypoints", &self.keypoints)
|
||||
.field("Masks", &self.masks)
|
||||
.finish()
|
||||
}
|
||||
}
|
||||
|
||||
impl Ys {
|
||||
pub fn with_probs(mut self, probs: Embedding) -> Self {
|
||||
self.probs = Some(probs);
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_bboxes(mut self, bboxes: &[Bbox]) -> Self {
|
||||
self.bboxes = Some(bboxes.to_vec());
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_keypoints(mut self, keypoints: &[Vec<Keypoint>]) -> Self {
|
||||
self.keypoints = Some(keypoints.to_vec());
|
||||
self
|
||||
}
|
||||
|
||||
pub fn with_masks(mut self, masks: &[Mask]) -> Self {
|
||||
self.masks = Some(masks.to_vec());
|
||||
self
|
||||
}
|
||||
|
||||
pub fn probs(&self) -> Option<&Embedding> {
|
||||
self.probs.as_ref()
|
||||
}
|
||||
|
||||
pub fn keypoints(&self) -> Option<&Vec<Vec<Keypoint>>> {
|
||||
self.keypoints.as_ref()
|
||||
}
|
||||
|
||||
pub fn masks(&self) -> Option<&Vec<Mask>> {
|
||||
self.masks.as_ref()
|
||||
}
|
||||
|
||||
pub fn bboxes(&self) -> Option<&Vec<Bbox>> {
|
||||
self.bboxes.as_ref()
|
||||
}
|
||||
|
||||
pub fn non_max_suppression(xs: &mut Vec<Bbox>, iou_threshold: f32) {
|
||||
xs.sort_by(|b1, b2| b2.confidence().partial_cmp(&b1.confidence()).unwrap());
|
||||
let mut current_index = 0;
|
||||
for index in 0..xs.len() {
|
||||
let mut drop = false;
|
||||
for prev_index in 0..current_index {
|
||||
let iou = xs[prev_index].iou(&xs[index]);
|
||||
if iou > iou_threshold {
|
||||
drop = true;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if !drop {
|
||||
xs.swap(current_index, index);
|
||||
current_index += 1;
|
||||
}
|
||||
}
|
||||
xs.truncate(current_index);
|
||||
}
|
||||
}
|
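The removed `Ys::non_max_suppression` is a greedy in-place NMS: sort by descending confidence, then keep a box only if its IoU with every already-kept box is at or below the threshold. A minimal self-contained sketch of the same loop follows. `Det` is a hypothetical stand-in for `Bbox`, and `total_cmp` is swapped in for `partial_cmp(..).unwrap()` so that a NaN confidence does not panic.

```rust
/// Hypothetical stand-in for the crate's `Bbox`: an axis-aligned box with a score.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Det {
    pub xmin: f32,
    pub ymin: f32,
    pub xmax: f32,
    pub ymax: f32,
    pub confidence: f32,
}

impl Det {
    fn area(&self) -> f32 {
        (self.xmax - self.xmin).max(0.0) * (self.ymax - self.ymin).max(0.0)
    }

    /// Intersection over union; returns 0 when both boxes are empty.
    pub fn iou(&self, other: &Det) -> f32 {
        let w = (self.xmax.min(other.xmax) - self.xmin.max(other.xmin)).max(0.0);
        let h = (self.ymax.min(other.ymax) - self.ymin.max(other.ymin)).max(0.0);
        let inter = w * h;
        let union = self.area() + other.area() - inter;
        if union > 0.0 {
            inter / union
        } else {
            0.0
        }
    }
}

/// Greedy NMS, in place: survivors are compacted to the front, then the tail is dropped.
pub fn non_max_suppression(xs: &mut Vec<Det>, iou_threshold: f32) {
    // Highest confidence first.
    xs.sort_by(|a, b| b.confidence.total_cmp(&a.confidence));
    let mut kept = 0;
    for index in 0..xs.len() {
        // Compare only against boxes that have already been kept.
        let overlaps_kept = (0..kept).any(|prev| xs[prev].iou(&xs[index]) > iou_threshold);
        if !overlaps_kept {
            xs.swap(kept, index);
            kept += 1;
        }
    }
    xs.truncate(kept);
}
```

The swap-and-truncate approach avoids allocating a second vector, at the cost of reordering the suppressed entries before they are discarded.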
10
src/lib.rs
@ -1,8 +1,8 @@
|
||||
mod core;
|
||||
pub mod models;
|
||||
pub use core::*;
|
||||
mod utils;
|
||||
mod ys;
|
||||
|
||||
const GITHUB_ASSETS: &str = "https://github.com/jamjamjon/assets/releases/download/v0.0.1";
|
||||
const CHECK_MARK: &str = "✅";
|
||||
const CROSS_MARK: &str = "❌";
|
||||
const SAFE_CROSS_MARK: &str = "❎";
|
||||
pub use core::*;
|
||||
pub use utils::*;
|
||||
pub use ys::*;
|
||||
|
@ -4,7 +4,7 @@ use ndarray::{s, Array, Axis, IxDyn};
|
||||
use std::io::Write;
|
||||
use tokenizers::Tokenizer;
|
||||
|
||||
use crate::{ops, LogitsSampler, MinOptMax, Options, OrtEngine, TokenizerStream};
|
||||
use crate::{ops, Embedding, LogitsSampler, MinOptMax, Options, OrtEngine, TokenizerStream, Y};
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct Blip {
|
||||
@ -42,7 +42,7 @@ impl Blip {
|
||||
})
|
||||
}
|
||||
|
||||
pub fn encode_images(&self, xs: &[DynamicImage]) -> Result<Array<f32, IxDyn>> {
|
||||
pub fn encode_images(&self, xs: &[DynamicImage]) -> Result<Embedding> {
|
||||
let xs_ = ops::resize(xs, self.height.opt as u32, self.width.opt as u32)?;
|
||||
let xs_ = ops::normalize(xs_, 0.0, 255.0);
|
||||
let xs_ = ops::standardize(
|
||||
@ -51,24 +51,31 @@ impl Blip {
|
||||
&[0.26862954, 0.2613026, 0.2757771],
|
||||
);
|
||||
let ys: Vec<Array<f32, IxDyn>> = self.visual.run(&[xs_])?;
|
||||
let ys = ys[0].to_owned();
|
||||
Ok(ys)
|
||||
// let ys = ys[0].to_owned();
|
||||
Ok(Embedding::new(ys[0].to_owned()))
|
||||
// Ok(ys)
|
||||
}
|
||||
|
||||
pub fn caption(&mut self, path: &str, prompt: Option<&str>) -> Result<()> {
|
||||
// this demo uses batch_size=1
|
||||
let x = image::io::Reader::open(path)?.decode()?;
|
||||
let image_embeds = self.encode_images(&[x])?;
|
||||
pub fn caption(
|
||||
&mut self,
|
||||
x: &[DynamicImage],
|
||||
prompt: Option<&str>,
|
||||
show: bool,
|
||||
) -> Result<Vec<Y>> {
|
||||
let mut ys: Vec<Y> = Vec::new();
|
||||
let image_embeds = self.encode_images(x)?;
|
||||
let image_embeds_attn_mask: Array<f32, IxDyn> =
|
||||
Array::ones((1, image_embeds.shape()[1])).into_dyn();
|
||||
Array::ones((1, image_embeds.embedding().shape()[1])).into_dyn();
|
||||
let mut y_text = String::new();
|
||||
|
||||
// conditional
|
||||
let mut input_ids = match prompt {
|
||||
None => {
|
||||
print!("[Unconditional]: ");
|
||||
if show {
|
||||
print!("[Unconditional]: ");
|
||||
}
|
||||
vec![0.0f32]
|
||||
}
|
||||
|
||||
Some(prompt) => {
|
||||
let encodings = self.tokenizer.tokenizer().encode(prompt, false);
|
||||
let ids: Vec<f32> = encodings
|
||||
@ -77,7 +84,10 @@ impl Blip {
|
||||
.iter()
|
||||
.map(|x| *x as f32)
|
||||
.collect();
|
||||
print!("[Conditional]: {} ", prompt);
|
||||
if show {
|
||||
print!("[Conditional]: {} ", prompt);
|
||||
}
|
||||
y_text.push_str(&format!("{} ", prompt));
|
||||
ids
|
||||
}
|
||||
};
|
||||
@ -91,7 +101,7 @@ impl Blip {
|
||||
let y = self.textual.run(&[
|
||||
input_ids_nd,
|
||||
input_ids_attn_mask,
|
||||
image_embeds.to_owned(),
|
||||
image_embeds.embedding().to_owned(),
|
||||
image_embeds_attn_mask.to_owned(),
|
||||
])?; // N, length, vocab_size
|
||||
let y = y[0].slice(s!(0, -1.., ..));
|
||||
@ -106,16 +116,20 @@ impl Blip {
|
||||
|
||||
// streaming generation
|
||||
if let Some(t) = self.tokenizer.next_token(token_id as u32)? {
|
||||
print!("{t}");
|
||||
y_text.push_str(&t);
|
||||
if show {
|
||||
print!("{t}");
|
||||
// std::thread::sleep(std::time::Duration::from_millis(5));
|
||||
}
|
||||
std::io::stdout().flush()?;
|
||||
}
|
||||
|
||||
// sleep for test
|
||||
std::thread::sleep(std::time::Duration::from_millis(5));
|
||||
}
|
||||
println!();
|
||||
if show {
|
||||
println!();
|
||||
}
|
||||
self.tokenizer.clear();
|
||||
Ok(())
|
||||
ys.push(Y::default().with_texts(&[y_text]));
|
||||
Ok(ys)
|
||||
}
|
||||
|
||||
pub fn batch_visual(&self) -> usize {
|
||||
|
@ -1,7 +1,7 @@
|
||||
use crate::{ops, MinOptMax, Options, OrtEngine};
|
||||
use crate::{ops, Embedding, MinOptMax, Options, OrtEngine};
|
||||
use anyhow::Result;
|
||||
use image::DynamicImage;
|
||||
use ndarray::{Array, Array2, Axis, IxDyn};
|
||||
use ndarray::{Array, Array2, IxDyn};
|
||||
use tokenizers::{PaddingDirection, PaddingParams, PaddingStrategy, Tokenizer};
|
||||
|
||||
#[derive(Debug)]
|
||||
@ -52,7 +52,7 @@ impl Clip {
|
||||
})
|
||||
}
|
||||
|
||||
pub fn encode_images(&self, xs: &[DynamicImage]) -> Result<Array<f32, IxDyn>> {
|
||||
pub fn encode_images(&self, xs: &[DynamicImage]) -> Result<Embedding> {
|
||||
let xs_ = ops::resize(xs, self.height.opt as u32, self.width.opt as u32)?;
|
||||
let xs_ = ops::normalize(xs_, 0.0, 255.0);
|
||||
let xs_ = ops::standardize(
|
||||
@ -61,11 +61,10 @@ impl Clip {
|
||||
&[0.26862954, 0.2613026, 0.2757771],
|
||||
);
|
||||
let ys: Vec<Array<f32, IxDyn>> = self.visual.run(&[xs_])?;
|
||||
let ys = ys[0].to_owned();
|
||||
Ok(ys)
|
||||
Ok(Embedding::new(ys[0].to_owned()))
|
||||
}
|
||||
|
||||
pub fn encode_texts(&self, texts: &[String]) -> Result<Array<f32, IxDyn>> {
|
||||
pub fn encode_texts(&self, texts: &[String]) -> Result<Embedding> {
|
||||
let encodings = self
|
||||
.tokenizer
|
||||
.encode_batch(texts.to_owned(), false)
|
||||
@ -76,23 +75,7 @@ impl Clip {
|
||||
.collect();
|
||||
let xs = Array2::from_shape_vec((texts.len(), self.context_length), xs)?.into_dyn();
|
||||
let ys = self.textual.run(&[xs])?;
|
||||
let ys = ys[0].to_owned();
|
||||
Ok(ys)
|
||||
}
|
||||
|
||||
pub fn get_similarity(
|
||||
&self,
|
||||
images_feats: &Array<f32, IxDyn>,
|
||||
texts_feats: &Array<f32, IxDyn>,
|
||||
) -> Result<Vec<Vec<f32>>> {
|
||||
let images_feats = images_feats.clone().into_dimensionality::<ndarray::Ix2>()?;
|
||||
let texts_feats = texts_feats.clone().into_dimensionality::<ndarray::Ix2>()?;
|
||||
let matrix = images_feats.dot(&texts_feats.t()); // [M, N]
|
||||
let exps = matrix.mapv(|x| x.exp()); //[M, N]
|
||||
let stds = exps.sum_axis(Axis(1)); //[M, 1]
|
||||
let matrix = exps / stds.insert_axis(Axis(1)); // [M, N]
|
||||
let similarity: Vec<Vec<f32>> = matrix.axis_iter(Axis(0)).map(|row| row.to_vec()).collect();
|
||||
Ok(similarity)
|
||||
Ok(Embedding::new(ys[0].to_owned()))
|
||||
}
|
||||
|
||||
pub fn batch_visual(&self) -> usize {
|
||||
|
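The `get_similarity` method removed from `Clip` in this hunk computed a row-wise softmax over the image-text dot products: an `[M, N]` matrix of logits, exponentiated and divided by each row's sum. A minimal sketch of the same computation on plain vectors, without `ndarray`, is shown below. The free function `similarity` is an illustrative name, and subtracting the row maximum before `exp` is an added numerical-stability step that the removed code did not have.

```rust
/// Row-wise softmax over dot products between M image and N text feature vectors.
/// Returns an M x N matrix in which each row sums to 1.
pub fn similarity(image_feats: &[Vec<f32>], text_feats: &[Vec<f32>]) -> Vec<Vec<f32>> {
    image_feats
        .iter()
        .map(|img| {
            // One logit per text: dot product of the image and text features.
            let logits: Vec<f32> = text_feats
                .iter()
                .map(|txt| img.iter().zip(txt).map(|(a, b)| a * b).sum::<f32>())
                .collect();
            // Subtracting the row maximum leaves the softmax unchanged but avoids overflow.
            let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
            let exps: Vec<f32> = logits.iter().map(|x| (x - max).exp()).collect();
            let total: f32 = exps.iter().sum();
            exps.iter().map(|e| e / total).collect()
        })
        .collect()
}
```

With the method gone from `Clip`, callers that still need these probabilities would have to compute them from the two returned `Embedding` values; where that logic lives after this change is not visible in this part of the diff.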
@ -1,6 +1,6 @@
|
||||
use crate::{ops, Bbox, DynConf, Mask, MinOptMax, Options, OrtEngine, Polygon, Ys};
|
||||
use crate::{ops, DynConf, Mask, Mbr, MinOptMax, Options, OrtEngine, Y};
|
||||
use anyhow::Result;
|
||||
use image::{DynamicImage, ImageBuffer};
|
||||
use image::DynamicImage;
|
||||
use ndarray::{Array, Axis, IxDyn};
|
||||
|
||||
#[derive(Debug)]
|
||||
@ -44,19 +44,20 @@ impl DB {
|
||||
})
|
||||
}
|
||||
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let xs_ = ops::letterbox(xs, self.height.opt as u32, self.width.opt as u32, 144.0)?;
|
||||
let xs_ = ops::normalize(xs_, 0.0, 255.0);
|
||||
let xs_ = ops::standardize(xs_, &[0.485, 0.456, 0.406], &[0.229, 0.224, 0.225]);
|
||||
let ys = self.engine.run(&[xs_])?;
|
||||
let ys = self.postprocess(ys, xs)?;
|
||||
Ok(ys)
|
||||
self.postprocess(ys, xs)
|
||||
}
|
||||
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let mut ys = Vec::new();
|
||||
for (idx, luma) in xs[0].axis_iter(Axis(0)).enumerate() {
|
||||
let mut y_bbox = Vec::new();
|
||||
let mut y_masks: Vec<Mask> = Vec::new();
|
||||
let mut y_mbrs: Vec<Mbr> = Vec::new();
|
||||
|
||||
// reshape
|
||||
let h = luma.dim()[1];
|
||||
@ -64,15 +65,13 @@ impl DB {
|
||||
let luma = luma.into_shape((h, w, 1))?.into_owned();
|
||||
|
||||
// build image from ndarray
|
||||
let raw_vec = luma
|
||||
let v = luma
|
||||
.into_raw_vec()
|
||||
.iter()
|
||||
.map(|x| if x <= &self.binary_thresh { 0.0 } else { *x })
|
||||
.collect::<Vec<_>>();
|
||||
let mask_im: ImageBuffer<image::Luma<_>, Vec<f32>> =
|
||||
ImageBuffer::from_raw(w as u32, h as u32, raw_vec)
|
||||
.expect("Faild to create image from ndarray");
|
||||
let mut mask_im = image::DynamicImage::from(mask_im);
|
||||
let mut mask_im =
|
||||
ops::build_dyn_image_from_raw(v, self.height() as u32, self.width() as u32);
|
||||
|
||||
// input image
|
||||
let image_width = xs0[idx].width() as f32;
|
||||
@ -94,37 +93,45 @@ impl DB {
|
||||
imageproc::contours::find_contours_with_threshold(&mask_im, 1);
|
||||
|
||||
// loop
|
||||
let mut y_masks: Vec<Mask> = Vec::new();
|
||||
for contour in contours.iter() {
|
||||
if contour.points.len() <= 1 {
|
||||
if contour.border_type == imageproc::contours::BorderType::Hole
|
||||
&& contour.points.len() <= 2
|
||||
{
|
||||
continue;
|
||||
}
|
||||
let polygon = Polygon::from_imageproc_points(&contour.points);
|
||||
let perimeter = polygon.perimeter();
|
||||
let delta = polygon.area() * ratio.round() * self.unclip_ratio / perimeter;
|
||||
let polygon = polygon
|
||||
// .simplify(6e-4 * perimeter)
|
||||
.offset(delta, image_width, image_height)
|
||||
let mask = Mask::default().with_points_imageproc(&contour.points);
|
||||
let delta = mask.area() * ratio.round() as f64 * self.unclip_ratio as f64
|
||||
/ mask.perimeter();
|
||||
let mask = mask
|
||||
.unclip(delta, image_width as f64, image_height as f64)
|
||||
.resample(50)
|
||||
// .simplify(6e-4)
|
||||
.convex_hull();
|
||||
let rect = polygon.find_min_rect();
|
||||
if rect.height() < self.min_height || rect.width() < self.min_width {
|
||||
continue;
|
||||
}
|
||||
let confidence = polygon.area() / rect.area();
|
||||
if confidence < self.confs[0] {
|
||||
continue;
|
||||
}
|
||||
y_bbox.push(Bbox::new(rect, 0, confidence, None));
|
||||
y_masks.push(Mask {
|
||||
polygon,
|
||||
id: 0,
|
||||
name: None,
|
||||
});
|
||||
}
|
||||
ys.push(Ys::default().with_bboxes(&y_bbox).with_masks(&y_masks));
|
||||
}
|
||||
if let Some(bbox) = mask.bbox() {
|
||||
if bbox.height() < self.min_height || bbox.width() < self.min_width {
|
||||
continue;
|
||||
}
|
||||
let confidence = mask.area() as f32 / bbox.area();
|
||||
if confidence < self.confs[0] {
|
||||
continue;
|
||||
}
|
||||
y_bbox.push(bbox.with_confidence(confidence).with_id(0));
|
||||
|
||||
if let Some(mbr) = mask.mbr() {
|
||||
y_mbrs.push(mbr.with_confidence(confidence).with_id(0));
|
||||
}
|
||||
y_masks.push(mask.with_id(0));
|
||||
} else {
|
||||
continue;
|
||||
}
|
||||
}
|
||||
ys.push(
|
||||
Y::default()
|
||||
.with_bboxes(&y_bbox)
|
||||
.with_masks(&y_masks)
|
||||
.with_mbrs(&y_mbrs),
|
||||
);
|
||||
}
|
||||
Ok(ys)
|
||||
}
|
||||
|
||||
|
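In the DB post-processing above, each contour is expanded by a distance `delta` computed from the polygon's area, its perimeter, and `unclip_ratio`. The sketch below shows that computation with std only. The function names are hypothetical, and it omits the extra `ratio.round()` factor that the diff multiplies in.

```rust
/// Shoelace area of a closed polygon given as (x, y) vertices.
fn polygon_area(pts: &[(f64, f64)]) -> f64 {
    let n = pts.len();
    let mut twice_area = 0.0;
    for i in 0..n {
        let (x0, y0) = pts[i];
        let (x1, y1) = pts[(i + 1) % n];
        twice_area += x0 * y1 - x1 * y0;
    }
    twice_area.abs() / 2.0
}

/// Sum of edge lengths, including the closing edge back to the first vertex.
fn polygon_perimeter(pts: &[(f64, f64)]) -> f64 {
    let n = pts.len();
    (0..n)
        .map(|i| {
            let (x0, y0) = pts[i];
            let (x1, y1) = pts[(i + 1) % n];
            ((x1 - x0).powi(2) + (y1 - y0).powi(2)).sqrt()
        })
        .sum()
}

/// Offset distance used to "unclip" a shrunken text region:
/// delta = area * unclip_ratio / perimeter.
/// Returns None for degenerate polygons with zero perimeter.
/// Hypothetical helper; the diff additionally multiplies by a rounded scale ratio.
pub fn unclip_delta(pts: &[(f64, f64)], unclip_ratio: f64) -> Option<f64> {
    let perimeter = polygon_perimeter(pts);
    if perimeter == 0.0 {
        return None;
    }
    Some(polygon_area(pts) * unclip_ratio / perimeter)
}
```

For a 10 x 10 square with `unclip_ratio = 1.5`, the area is 100 and the perimeter 40, so the offset is 3.75 pixels.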
@ -15,5 +15,5 @@ pub use dinov2::Dinov2;
 pub use rtdetr::RTDETR;
 pub use rtmo::RTMO;
 pub use svtr::SVTR;
-pub use yolo::YOLO;
+pub use yolo::{YOLOTask, YOLO};
 pub use yolop::YOLOPv2;
@ -3,7 +3,7 @@ use image::DynamicImage;
|
||||
use ndarray::{s, Array, Axis, IxDyn};
|
||||
use regex::Regex;
|
||||
|
||||
use crate::{ops, Bbox, DynConf, MinOptMax, Options, OrtEngine, Rect, Ys};
|
||||
use crate::{ops, Bbox, DynConf, MinOptMax, Options, OrtEngine, Y};
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct RTDETR {
|
||||
@ -55,15 +55,14 @@ impl RTDETR {
|
||||
})
|
||||
}
|
||||
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let xs_ = ops::letterbox(xs, self.height() as u32, self.width() as u32, 144.0)?;
|
||||
let xs_ = ops::normalize(xs_, 0.0, 255.0);
|
||||
let ys = self.engine.run(&[xs_])?;
|
||||
let ys = self.postprocess(ys, xs)?;
|
||||
Ok(ys)
|
||||
self.postprocess(ys, xs)
|
||||
}
|
||||
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
const CXYWH_OFFSET: usize = 4; // cxcywh
|
||||
let preds = &xs[0];
|
||||
|
||||
@ -98,20 +97,20 @@ impl RTDETR {
|
||||
let y = (bbox[1] - bbox[3] / 2.) * self.height() as f32 / ratio;
|
||||
let w = bbox[2] * self.width() as f32 / ratio;
|
||||
let h = bbox[3] * self.height() as f32 / ratio;
|
||||
let y_bbox = Bbox::new(
|
||||
Rect::from_xywh(
|
||||
x.max(0.0f32).min(width_original),
|
||||
y.max(0.0f32).min(height_original),
|
||||
w,
|
||||
h,
|
||||
),
|
||||
id,
|
||||
confidence,
|
||||
self.names.as_ref().map(|names| names[id].clone()),
|
||||
);
|
||||
y_bboxes.push(y_bbox)
|
||||
y_bboxes.push(
|
||||
Bbox::default()
|
||||
.with_xywh(
|
||||
x.max(0.0f32).min(width_original),
|
||||
y.max(0.0f32).min(height_original),
|
||||
w,
|
||||
h,
|
||||
)
|
||||
.with_confidence(confidence)
|
||||
.with_id(id as isize)
|
||||
.with_name(self.names.as_ref().map(|names| names[id].to_owned())),
|
||||
)
|
||||
}
|
||||
ys.push(Ys::default().with_bboxes(&y_bboxes));
|
||||
ys.push(Y::default().with_bboxes(&y_bboxes));
|
||||
}
|
||||
Ok(ys)
|
||||
}
|
||||
|
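The RTDETR hunk above maps a normalized `cxcywh` prediction back to original-image pixels and clamps the top-left corner to the image. A rough sketch of that mapping is below. The function name is hypothetical, and two things are assumptions because the hunk does not show them: that `ratio` is the letterbox scale `min(model_w / image_w, model_h / image_h)`, as in the YOLO code later in this diff, and that `x` is computed symmetrically to the `y` line that is visible.

```rust
/// Map a normalized (cx, cy, w, h) box to (x, y, w, h) in original-image pixels.
/// Hypothetical helper mirroring the arithmetic visible in the diff.
pub fn rtdetr_box_to_image(
    bbox: [f32; 4],
    model_w: f32,
    model_h: f32,
    image_w: f32,
    image_h: f32,
) -> (f32, f32, f32, f32) {
    // Assumed letterbox scale factor (not shown in this hunk).
    let ratio = (model_w / image_w).min(model_h / image_h);
    // Top-left corner: center minus half extent, scaled to model pixels,
    // then divided by the letterbox ratio to reach original-image pixels.
    let x = (bbox[0] - bbox[2] / 2.0) * model_w / ratio;
    let y = (bbox[1] - bbox[3] / 2.0) * model_h / ratio;
    let w = bbox[2] * model_w / ratio;
    let h = bbox[3] * model_h / ratio;
    // Only the corner is clamped, matching the diff; w and h are left as-is.
    (x.max(0.0).min(image_w), y.max(0.0).min(image_h), w, h)
}
```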
@ -2,7 +2,7 @@ use anyhow::Result;
|
||||
use image::DynamicImage;
|
||||
use ndarray::{Array, Axis, IxDyn};
|
||||
|
||||
use crate::{ops, Bbox, DynConf, Keypoint, MinOptMax, Options, OrtEngine, Ys};
|
||||
use crate::{ops, Bbox, DynConf, Keypoint, MinOptMax, Options, OrtEngine, Y};
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct RTMO {
|
||||
@ -38,15 +38,14 @@ impl RTMO {
|
||||
})
|
||||
}
|
||||
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let xs_ = ops::letterbox(xs, self.height() as u32, self.width() as u32, 114.0)?;
|
||||
let ys = self.engine.run(&[xs_])?;
|
||||
let ys = self.postprocess(ys, xs)?;
|
||||
Ok(ys)
|
||||
self.postprocess(ys, xs)
|
||||
}
|
||||
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
let mut ys: Vec<Ys> = Vec::new();
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let mut ys: Vec<Y> = Vec::new();
|
||||
let (preds_bboxes, preds_kpts) = if xs[0].ndim() == 3 {
|
||||
(&xs[0], &xs[1])
|
||||
} else {
|
||||
@ -78,20 +77,18 @@ impl RTMO {
|
||||
if confidence < self.confs[0] {
|
||||
continue;
|
||||
}
|
||||
let y_bbox = Bbox::new(
|
||||
(
|
||||
(
|
||||
y_bboxes.push(
|
||||
Bbox::default()
|
||||
.with_xyxy(
|
||||
x1.max(0.0f32).min(width_original),
|
||||
y1.max(0.0f32).min(height_original),
|
||||
),
|
||||
(x2, y2),
|
||||
)
|
||||
.into(),
|
||||
0,
|
||||
confidence,
|
||||
Some(String::from("Person")),
|
||||
x2,
|
||||
y2,
|
||||
)
|
||||
.with_confidence(confidence)
|
||||
.with_id(0isize)
|
||||
.with_name(Some(String::from("Person"))),
|
||||
);
|
||||
y_bboxes.push(y_bbox);
|
||||
|
||||
// keypoints
|
||||
let mut kpts_ = Vec::new();
|
||||
@ -102,21 +99,20 @@ impl RTMO {
|
||||
if c < self.kconfs[i] {
|
||||
kpts_.push(Keypoint::default());
|
||||
} else {
|
||||
kpts_.push(Keypoint::new(
|
||||
(
|
||||
x.max(0.0f32).min(width_original),
|
||||
y.max(0.0f32).min(height_original),
|
||||
)
|
||||
.into(),
|
||||
c,
|
||||
i as isize,
|
||||
None, // Name
|
||||
));
|
||||
kpts_.push(
|
||||
Keypoint::default()
|
||||
.with_id(i as isize)
|
||||
.with_confidence(c)
|
||||
.with_xy(
|
||||
x.max(0.0f32).min(width_original),
|
||||
y.max(0.0f32).min(height_original),
|
||||
),
|
||||
);
|
||||
}
|
||||
}
|
||||
y_kpts.push(kpts_);
|
||||
}
|
||||
ys.push(Ys::default().with_bboxes(&y_bboxes).with_keypoints(&y_kpts));
|
||||
ys.push(Y::default().with_bboxes(&y_bboxes).with_keypoints(&y_kpts));
|
||||
}
|
||||
Ok(ys)
|
||||
}
|
||||
|
@ -1,8 +1,9 @@
|
||||
use crate::{ops, DynConf, MinOptMax, Options, OrtEngine};
|
||||
use anyhow::Result;
|
||||
use image::DynamicImage;
|
||||
use ndarray::{Array, Axis, IxDyn};
|
||||
|
||||
use crate::{ops, DynConf, MinOptMax, Options, OrtEngine, Y};
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct SVTR {
|
||||
engine: OrtEngine,
|
||||
@ -41,18 +42,17 @@ impl SVTR {
|
||||
})
|
||||
}
|
||||
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<String>> {
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let xs_ =
|
||||
ops::resize_with_fixed_height(xs, self.height.opt as u32, self.width.opt as u32, 0.0)?;
|
||||
let xs_ = ops::normalize(xs_, 0.0, 255.0);
|
||||
let ys: Vec<Array<f32, IxDyn>> = self.engine.run(&[xs_])?;
|
||||
let ys = ys[0].to_owned();
|
||||
|
||||
self.postprocess(&ys)
|
||||
}
|
||||
|
||||
pub fn postprocess(&self, output: &Array<f32, IxDyn>) -> Result<Vec<String>> {
|
||||
let mut texts: Vec<String> = Vec::new();
|
||||
pub fn postprocess(&self, output: &Array<f32, IxDyn>) -> Result<Vec<Y>> {
|
||||
let mut ys: Vec<Y> = Vec::new();
|
||||
for batch in output.axis_iter(Axis(0)) {
|
||||
let preds = batch
|
||||
.axis_iter(Axis(0))
|
||||
@ -72,7 +72,6 @@ impl SVTR {
|
||||
}
|
||||
|
||||
if idx == 0 || idx == self.vocab.len() - 1 {
|
||||
text_ids.push(*text_id);
|
||||
return text_ids;
|
||||
}
|
||||
|
||||
@ -85,9 +84,9 @@ impl SVTR {
|
||||
.map(|idx| self.vocab[idx].to_owned())
|
||||
.collect::<String>();
|
||||
|
||||
texts.push(text);
|
||||
ys.push(Y::default().with_texts(&[text]))
|
||||
}
|
||||
|
||||
Ok(texts)
|
||||
Ok(ys)
|
||||
}
|
||||
}
|
||||
|
@ -4,20 +4,18 @@ use image::DynamicImage;
|
||||
use ndarray::{s, Array, Axis, IxDyn};
|
||||
use regex::Regex;
|
||||
|
||||
use crate::{
|
||||
ops, Bbox, DynConf, Embedding, Keypoint, Mask, MinOptMax, Options, OrtEngine, Point, Rect, Ys,
|
||||
};
|
||||
use crate::{ops, Bbox, DynConf, Keypoint, Mask, Mbr, MinOptMax, Options, OrtEngine, Prob, Y};
|
||||
|
||||
const CXYWH_OFFSET: usize = 4;
|
||||
const KPT_STEP: usize = 3;
|
||||
|
||||
#[derive(Debug, Clone, ValueEnum)]
|
||||
enum YOLOTask {
|
||||
pub enum YOLOTask {
|
||||
Classify,
|
||||
Detect,
|
||||
Pose,
|
||||
Segment,
|
||||
Obb, // TODO
|
||||
Obb,
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
@ -37,6 +35,8 @@ pub struct YOLO {
|
||||
names_kpt: Option<Vec<String>>,
|
||||
apply_nms: bool,
|
||||
anchors_first: bool,
|
||||
conf_independent: bool,
|
||||
apply_probs_softmax: bool,
|
||||
}
|
||||
|
||||
impl YOLO {
|
||||
@ -47,16 +47,21 @@ impl YOLO {
|
||||
engine.height().to_owned(),
|
||||
engine.width().to_owned(),
|
||||
);
|
||||
let task = match engine
|
||||
.try_fetch("task")
|
||||
.unwrap_or("detect".to_string())
|
||||
.as_str()
|
||||
{
|
||||
"classify" => YOLOTask::Classify,
|
||||
"detect" => YOLOTask::Detect,
|
||||
"pose" => YOLOTask::Pose,
|
||||
"segment" => YOLOTask::Segment,
|
||||
x => todo!("{:?} is not supported for now!", x),
|
||||
|
||||
let task = match &options.yolo_task {
|
||||
Some(task) => task.to_owned(),
|
||||
None => match engine
|
||||
.try_fetch("task")
|
||||
.unwrap_or("detect".to_string())
|
||||
.as_str()
|
||||
{
|
||||
"classify" => YOLOTask::Classify,
|
||||
"detect" => YOLOTask::Detect,
|
||||
"pose" => YOLOTask::Pose,
|
||||
"segment" => YOLOTask::Segment,
|
||||
"obb" => YOLOTask::Obb,
|
||||
x => todo!("{:?} is not supported for now!", x),
|
||||
},
|
||||
};
|
||||
|
||||
// try from custom class names, and then model metadata
|
||||
@ -119,219 +124,275 @@ impl YOLO {
|
||||
names,
|
||||
names_kpt,
|
||||
anchors_first: options.anchors_first,
|
||||
conf_independent: options.conf_independent,
|
||||
apply_probs_softmax: options.apply_probs_softmax,
|
||||
})
|
||||
}
|
||||
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
let xs_ = ops::letterbox(xs, self.height() as u32, self.width() as u32, 144.0)?;
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let xs_ = match self.task {
|
||||
YOLOTask::Classify => ops::resize(xs, self.height() as u32, self.width() as u32)?,
|
||||
_ => ops::letterbox(xs, self.height() as u32, self.width() as u32, 114.0)?,
|
||||
};
|
||||
let xs_ = ops::normalize(xs_, 0.0, 255.0);
|
||||
let ys = self.engine.run(&[xs_])?;
|
||||
let ys = self.postprocess(ys, xs)?;
|
||||
Ok(ys)
|
||||
self.postprocess(ys, xs)
|
||||
}
|
||||
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
if let YOLOTask::Classify = self.task {
|
||||
let mut ys = Vec::new();
|
||||
for batch in xs[0].axis_iter(Axis(0)) {
|
||||
ys.push(
|
||||
Ys::default()
|
||||
.with_probs(Embedding::new(batch.into_owned(), self.names.to_owned())),
|
||||
);
|
||||
}
|
||||
Ok(ys)
|
||||
} else {
|
||||
let (preds, protos) = if xs.len() == 2 {
|
||||
if xs[0].ndim() == 3 {
|
||||
(&xs[0], Some(&xs[1]))
|
||||
} else {
|
||||
(&xs[1], Some(&xs[0]))
|
||||
}
|
||||
} else {
|
||||
(&xs[0], None)
|
||||
};
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let mut ys = Vec::new();
|
||||
let protos = if xs.len() == 2 { Some(&xs[1]) } else { None };
|
||||
for (idx, preds) in xs[0].axis_iter(Axis(0)).enumerate() {
|
||||
let image_width = xs0[idx].width() as f32;
|
||||
let image_height = xs0[idx].height() as f32;
|
||||
|
||||
let mut ys = Vec::new();
|
||||
for (idx, anchor) in preds.axis_iter(Axis(0)).enumerate() {
|
||||
let width_original = xs0[idx].width() as f32;
|
||||
let height_original = xs0[idx].height() as f32;
|
||||
let ratio = (self.width() as f32 / width_original)
|
||||
.min(self.height() as f32 / height_original);
|
||||
|
||||
#[allow(clippy::type_complexity)]
|
||||
let mut data: Vec<(Bbox, Option<Vec<Keypoint>>, Option<Vec<f32>>)> = Vec::new();
|
||||
for pred in anchor.axis_iter(if self.anchors_first { Axis(0) } else { Axis(1) }) {
|
||||
// split preds for different tasks
|
||||
let bbox = pred.slice(s![0..CXYWH_OFFSET]);
|
||||
let clss = pred.slice(s![CXYWH_OFFSET..CXYWH_OFFSET + self.nc]);
|
||||
let kpts = {
|
||||
if let YOLOTask::Pose = self.task {
|
||||
Some(pred.slice(s![pred.len() - KPT_STEP * self.nk..]))
|
||||
} else {
|
||||
None
|
||||
}
|
||||
};
|
||||
let coefs = {
|
||||
if let YOLOTask::Segment = self.task {
|
||||
Some(pred.slice(s![pred.len() - self.nm..]).to_vec())
|
||||
} else {
|
||||
None
|
||||
}
|
||||
// decode
|
||||
match self.task {
|
||||
YOLOTask::Classify => {
|
||||
let y = if self.apply_probs_softmax {
|
||||
let exps = preds.mapv(|x| x.exp());
|
||||
let stds = exps.sum_axis(Axis(0));
|
||||
exps / stds
|
||||
} else {
|
||||
preds.into_owned()
|
||||
};
|
||||
|
||||
// confidence and index
|
||||
let (id, &confidence) = clss
|
||||
.into_iter()
|
||||
.enumerate()
|
||||
.reduce(|max, x| if x.1 > max.1 { x } else { max })
|
||||
.unwrap();
|
||||
|
||||
// confidence filter
|
||||
if confidence < self.confs[id] {
|
||||
continue;
|
||||
}
|
||||
|
||||
// bbox re-scale
|
||||
let cx = bbox[0] / ratio;
|
||||
let cy = bbox[1] / ratio;
|
||||
let w = bbox[2] / ratio;
|
||||
let h = bbox[3] / ratio;
|
||||
let x = cx - w / 2.;
|
||||
let y = cy - h / 2.;
|
||||
let y_bbox = Bbox::new(
|
||||
Rect::from_xywh(
|
||||
x.max(0.0f32).min(width_original),
|
||||
y.max(0.0f32).min(height_original),
|
||||
w,
|
||||
h,
|
||||
ys.push(
|
||||
Y::default().with_probs(
|
||||
Prob::default()
|
||||
.with_probs(&y.into_raw_vec())
|
||||
.with_names(self.names.to_owned()),
|
||||
),
|
||||
id,
|
||||
confidence,
|
||||
self.names.as_ref().map(|names| names[id].to_owned()),
|
||||
);
|
||||
}
|
||||
YOLOTask::Obb => {
|
||||
let mut y_mbrs: Vec<Mbr> = Vec::new();
|
||||
let ratio = (self.width() as f32 / image_width)
|
||||
.min(self.height() as f32 / image_height);
|
||||
for pred in preds.axis_iter(if self.anchors_first { Axis(0) } else { Axis(1) })
|
||||
{
|
||||
// xywhclsr
|
||||
let xywh = pred.slice(s![0..CXYWH_OFFSET]);
|
||||
let clss = pred.slice(s![CXYWH_OFFSET..CXYWH_OFFSET + self.nc]);
|
||||
let radians = pred[pred.len() - 1];
|
||||
let (id, &confidence) = clss
|
||||
.into_iter()
|
||||
.enumerate()
|
||||
.max_by(|a, b| a.1.total_cmp(b.1))
|
||||
.unwrap();
|
||||
if confidence < self.confs[id] {
|
||||
continue;
|
||||
}
|
||||
|
||||
// kpts
|
||||
let y_kpts = {
|
||||
if let Some(kpts) = kpts {
|
||||
let mut kpts_ = Vec::new();
|
||||
for i in 0..self.nk {
|
||||
let kx = kpts[KPT_STEP * i] / ratio;
|
||||
let ky = kpts[KPT_STEP * i + 1] / ratio;
|
||||
let kconf = kpts[KPT_STEP * i + 2];
|
||||
if kconf < self.kconfs[i] {
|
||||
kpts_.push(Keypoint::default());
|
||||
} else {
|
||||
kpts_.push(Keypoint::new(
|
||||
Point::new(
|
||||
kx.max(0.0f32).min(width_original),
|
||||
ky.max(0.0f32).min(height_original),
|
||||
),
|
||||
kconf,
|
||||
i as isize,
|
||||
self.names_kpt.as_ref().map(|names| names[i].to_owned()),
|
||||
));
|
||||
}
|
||||
}
|
||||
Some(kpts_)
|
||||
// re-scale
|
||||
let cx = xywh[0] / ratio;
|
||||
let cy = xywh[1] / ratio;
|
||||
let w = xywh[2] / ratio;
|
||||
let h = xywh[3] / ratio;
|
||||
let (w, h, radians) = if w > h {
|
||||
(w, h, radians)
|
||||
} else {
|
||||
None
|
||||
(h, w, radians + std::f32::consts::PI / 2.)
|
||||
};
|
||||
let radians = radians % std::f32::consts::PI;
|
||||
y_mbrs.push(
|
||||
Mbr::from_cxcywhr(
|
||||
cx as f64,
|
||||
cy as f64,
|
||||
w as f64,
|
||||
h as f64,
|
||||
radians as f64,
|
||||
)
|
||||
.with_confidence(confidence)
|
||||
.with_id(id as isize)
|
||||
.with_name(self.names.as_ref().map(|names| names[id].to_owned())),
|
||||
);
|
||||
}
|
||||
ys.push(Y::default().with_mbrs(&y_mbrs).apply_mbrs_nms(self.iou));
|
||||
}
|
||||
_ => {
|
||||
let mut y_bboxes: Vec<Bbox> = Vec::new();
|
||||
let ratio = (self.width() as f32 / image_width)
|
||||
.min(self.height() as f32 / image_height);
|
||||
|
||||
// bboxes
|
||||
for (i, pred) in preds
|
||||
.axis_iter(if self.anchors_first { Axis(0) } else { Axis(1) })
|
||||
.enumerate()
|
||||
{
|
||||
let bbox = pred.slice(s![0..CXYWH_OFFSET]);
|
||||
let (conf_, clss) = if self.conf_independent {
|
||||
(
|
||||
pred[CXYWH_OFFSET],
|
||||
pred.slice(s![CXYWH_OFFSET + 1..CXYWH_OFFSET + self.nc + 1]),
|
||||
)
|
||||
} else {
|
||||
(1.0, pred.slice(s![CXYWH_OFFSET..CXYWH_OFFSET + self.nc]))
|
||||
};
|
||||
let (id, &confidence) = clss
|
||||
.into_iter()
|
||||
.enumerate()
|
||||
.max_by(|a, b| a.1.total_cmp(b.1))
|
||||
.unwrap();
|
||||
let confidence = confidence * conf_;
|
||||
if confidence < self.confs[id] {
|
||||
continue;
|
||||
}
|
||||
};
|
||||
|
||||
// merged
|
||||
data.push((y_bbox, y_kpts, coefs));
|
||||
}
|
||||
|
||||
// nms
|
||||
if self.apply_nms {
|
||||
Self::non_max_suppression(&mut data, self.iou);
|
||||
}
|
||||
|
||||
// decode
|
||||
let mut y_bboxes: Vec<Bbox> = Vec::new();
|
||||
let mut y_kpts: Vec<Vec<Keypoint>> = Vec::new();
|
||||
let mut y_masks: Vec<Mask> = Vec::new();
|
||||
for elem in data.into_iter() {
|
||||
if let Some(kpts) = elem.1 {
|
||||
y_kpts.push(kpts)
|
||||
// re-scale
|
||||
let cx = bbox[0] / ratio;
|
||||
let cy = bbox[1] / ratio;
|
||||
let w = bbox[2] / ratio;
|
||||
let h = bbox[3] / ratio;
|
||||
let x = cx - w / 2.;
|
||||
let y = cy - h / 2.;
|
||||
let x = x.max(0.0).min(image_width);
|
||||
let y = y.max(0.0).min(image_height);
|
||||
let y_bbox = Bbox::default()
|
||||
.with_xywh(x, y, w, h)
|
||||
.with_confidence(confidence)
|
||||
.with_id(id as isize)
|
||||
.with_id_born(i as isize)
|
||||
.with_name(self.names.as_ref().map(|names| names[id].to_owned()));
|
||||
y_bboxes.push(y_bbox);
|
||||
}
|
||||
|
||||
// decode masks
|
||||
if let Some(coefs) = elem.2 {
|
||||
let proto = protos.unwrap().slice(s![idx, .., .., ..]);
|
||||
let (nm, nh, nw) = proto.dim();
|
||||
// nms
|
||||
let mut y = Y::default().with_bboxes(&y_bboxes);
|
||||
if self.apply_nms {
|
||||
y = y.apply_bboxes_nms(self.iou);
|
||||
}
|
||||
|
||||
// coefs * proto -> mask
|
||||
let coefs = Array::from_shape_vec((1, nm), coefs)?; // (n, nm)
|
||||
let proto = proto.to_owned().into_shape((nm, nh * nw))?; // (nm, nh*nw)
|
||||
let mask = coefs.dot(&proto).into_shape((nh, nw, 1))?; // (nh, nw, n)
|
||||
// keypoints
|
||||
if let YOLOTask::Pose = self.task {
|
||||
if let Some(bboxes) = y.bboxes() {
|
||||
let mut y_kpts: Vec<Vec<Keypoint>> = Vec::new();
|
||||
for bbox in bboxes.iter() {
|
||||
let pred = if self.anchors_first {
|
||||
preds.slice(s![
|
||||
bbox.id_born(),
|
||||
preds.shape()[1] - KPT_STEP * self.nk..,
|
||||
])
|
||||
} else {
|
||||
preds.slice(s![
|
||||
preds.shape()[0] - KPT_STEP * self.nk..,
|
||||
bbox.id_born(),
|
||||
])
|
||||
};
|
||||
|
||||
// build image from ndarray
|
||||
let mask_im = ops::build_dyn_image_from_raw(
|
||||
mask.into_raw_vec(),
|
||||
nw as u32,
|
||||
nh as u32,
|
||||
);
|
||||
|
||||
// rescale masks
|
||||
let mask_original = ops::descale_mask(
|
||||
mask_im,
|
||||
nw as f32,
|
||||
nh as f32,
|
||||
width_original,
|
||||
height_original,
|
||||
);
|
||||
|
||||
// crop mask with bbox
|
||||
let mut mask_original = mask_original.into_luma8();
|
||||
for y in 0..height_original as usize {
|
||||
for x in 0..width_original as usize {
|
||||
if x < elem.0.xmin() as usize
|
||||
|| x > elem.0.xmax() as usize
|
||||
|| y < elem.0.ymin() as usize
|
||||
|| y > elem.0.ymax() as usize
|
||||
{
|
||||
mask_original.put_pixel(x as u32, y as u32, image::Luma([0u8]));
|
||||
let mut kpts_: Vec<Keypoint> = Vec::new();
|
||||
for i in 0..self.nk {
|
||||
let kx = pred[KPT_STEP * i] / ratio;
|
||||
let ky = pred[KPT_STEP * i + 1] / ratio;
|
||||
let kconf = pred[KPT_STEP * i + 2];
|
||||
if kconf < self.kconfs[i] {
|
||||
kpts_.push(Keypoint::default());
|
||||
} else {
|
||||
kpts_.push(
|
||||
Keypoint::default()
|
||||
.with_id(i as isize)
|
||||
.with_confidence(kconf)
|
||||
.with_name(
|
||||
self.names_kpt
|
||||
.as_ref()
|
||||
.map(|names| names[i].to_owned()),
|
||||
)
|
||||
.with_xy(
|
||||
kx.max(0.0f32).min(image_width),
|
||||
ky.max(0.0f32).min(image_height),
|
||||
),
|
||||
);
|
||||
}
|
||||
}
|
||||
y_kpts.push(kpts_);
|
||||
}
|
||||
y = y.with_keypoints(&y_kpts);
|
||||
}
|
||||
|
||||
// get masks from image
|
||||
let masks = ops::get_masks_from_image(
|
||||
mask_original,
|
||||
1,
|
||||
elem.0.id(),
|
||||
elem.0.name().cloned(),
|
||||
);
|
||||
y_masks.extend(masks);
|
||||
}
|
||||
y_bboxes.push(elem.0);
|
||||
|
||||
// masks
|
||||
if let YOLOTask::Segment = self.task {
|
||||
if let Some(bboxes) = y.bboxes() {
|
||||
let mut y_masks: Vec<Mask> = Vec::new();
|
||||
for bbox in bboxes.iter() {
|
||||
let coefs = if self.anchors_first {
|
||||
preds
|
||||
.slice(s![bbox.id_born(), preds.shape()[1] - self.nm..])
|
||||
.to_vec()
|
||||
} else {
|
||||
preds
|
||||
.slice(s![preds.shape()[0] - self.nm.., bbox.id_born()])
|
||||
.to_vec()
|
||||
};
|
||||
let proto = protos.unwrap().slice(s![idx, .., .., ..]);
|
||||
|
||||
// coefs * proto -> mask
|
||||
let (nm, nh, nw) = proto.dim();
|
||||
let coefs = Array::from_shape_vec((1, nm), coefs)?; // (n, nm)
|
||||
let proto = proto.to_owned().into_shape((nm, nh * nw))?; // (nm, nh*nw)
|
||||
let mask = coefs.dot(&proto).into_shape((nh, nw, 1))?; // (nh, nw, n)
|
||||
|
||||
// build image from ndarray
|
||||
let mask_im = ops::build_dyn_image_from_raw(
|
||||
mask.into_raw_vec(),
|
||||
nw as u32,
|
||||
nh as u32,
|
||||
);
|
||||
|
||||
// rescale masks
|
||||
let mask_original = ops::descale_mask(
|
||||
mask_im,
|
||||
nw as f32,
|
||||
nh as f32,
|
||||
image_width,
|
||||
image_height,
|
||||
);
|
||||
|
||||
// crop mask with bbox
|
||||
let mut mask_original = mask_original.into_luma8();
|
||||
for y in 0..image_height as usize {
|
||||
for x in 0..image_width as usize {
|
||||
if x < bbox.xmin() as usize
|
||||
|| x > bbox.xmax() as usize
|
||||
|| y < bbox.ymin() as usize
|
||||
|| y > bbox.ymax() as usize
|
||||
{
|
||||
mask_original.put_pixel(
|
||||
x as u32,
|
||||
y as u32,
|
||||
image::Luma([0u8]),
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// get masks from image
|
||||
let mut masks: Vec<Mask> = Vec::new();
|
||||
let contours: Vec<imageproc::contours::Contour<i32>> =
|
||||
imageproc::contours::find_contours_with_threshold(
|
||||
&mask_original,
|
||||
1,
|
||||
);
|
||||
contours.iter().for_each(|contour| {
|
||||
if contour.points.len() > 2 {
|
||||
masks.push(
|
||||
Mask::default()
|
||||
.with_id(bbox.id())
|
||||
.with_points_imageproc(&contour.points)
|
||||
.with_name(bbox.name().cloned()),
|
||||
);
|
||||
}
|
||||
});
|
||||
y_masks.extend(masks);
|
||||
}
|
||||
y = y.with_masks(&y_masks);
|
||||
}
|
||||
}
|
||||
ys.push(y);
|
||||
}
|
||||
|
||||
// save result
|
||||
ys.push(
|
||||
Ys::default()
|
||||
.with_bboxes(&y_bboxes)
|
||||
.with_keypoints(&y_kpts)
|
||||
.with_masks(&y_masks),
|
||||
);
|
||||
}
|
||||
|
||||
Ok(ys)
|
||||
}
|
||||
}
|
||||
|
||||
fn fetch_names(engine: &OrtEngine) -> Option<Vec<String>> {
|
||||
// fetch class names from onnx metadata
|
||||
// String format: `{0: 'person', 1: 'bicycle', 2: 'sports ball', ..., 27: "yellow_lady's_slipper"}`
|
||||
engine.try_fetch("names").map(|names| {
|
||||
let re = Regex::new(r#"(['"])([-()\w '"]+)(['"])"#).unwrap();
|
||||
let mut names_ = vec![];
|
||||
for (_, [_, name, _]) in re.captures_iter(&names).map(|x| x.extract()) {
|
||||
names_.push(name.to_string());
|
||||
}
|
||||
names_
|
||||
})
|
||||
Ok(ys)
|
||||
}
|
||||
|
||||
pub fn batch(&self) -> isize {
|
||||
@ -346,28 +407,16 @@ impl YOLO {
|
||||
self.height.opt
|
||||
}
|
||||
|
||||
#[allow(clippy::type_complexity)]
|
||||
fn non_max_suppression(
|
||||
xs: &mut Vec<(Bbox, Option<Vec<Keypoint>>, Option<Vec<f32>>)>,
|
||||
iou_threshold: f32,
|
||||
) {
|
||||
xs.sort_by(|b1, b2| b2.0.confidence().partial_cmp(&b1.0.confidence()).unwrap());
|
||||
|
||||
let mut current_index = 0;
|
||||
for index in 0..xs.len() {
|
||||
let mut drop = false;
|
||||
for prev_index in 0..current_index {
|
||||
let iou = xs[prev_index].0.iou(&xs[index].0);
|
||||
if iou > iou_threshold {
|
||||
drop = true;
|
||||
break;
|
||||
}
|
||||
fn fetch_names(engine: &OrtEngine) -> Option<Vec<String>> {
|
||||
// fetch class names from onnx metadata
|
||||
// String format: `{0: 'person', 1: 'bicycle', 2: 'sports ball', ..., 27: "yellow_lady's_slipper"}`
|
||||
engine.try_fetch("names").map(|names| {
|
||||
let re = Regex::new(r#"(['"])([-()\w '"]+)(['"])"#).unwrap();
|
||||
let mut names_ = vec![];
|
||||
for (_, [_, name, _]) in re.captures_iter(&names).map(|x| x.extract()) {
|
||||
names_.push(name.to_string());
|
||||
}
|
||||
if !drop {
|
||||
xs.swap(current_index, index);
|
||||
current_index += 1;
|
||||
}
|
||||
}
|
||||
xs.truncate(current_index);
|
||||
names_
|
||||
})
|
||||
}
|
||||
}
|
||||
|
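The new `YOLOTask::Obb` branch above canonicalizes each oriented box before building an `Mbr`: the longer side becomes the width, the angle is shifted by a quarter turn when the sides are swapped, and the angle is then reduced modulo π. A small sketch of just that step is below; the function name `normalize_obb` is hypothetical.

```rust
use std::f32::consts::PI;

/// Canonicalize an oriented box so that the first returned side is the longer one.
/// Hypothetical helper mirroring the arithmetic in the Obb branch of the diff.
pub fn normalize_obb(w: f32, h: f32, radians: f32) -> (f32, f32, f32) {
    let (w, h, radians) = if w > h {
        (w, h, radians)
    } else {
        // Swapping the sides is equivalent to rotating the box by 90 degrees.
        (h, w, radians + PI / 2.0)
    };
    // Rust's `%` on floats keeps the sign of the dividend,
    // so a negative input angle stays negative here.
    (w, h, radians % PI)
}
```

Note that a box with `w == h` also takes the swap branch, exactly as in the diff.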
@ -2,7 +2,7 @@ use anyhow::Result;
|
||||
use image::DynamicImage;
|
||||
use ndarray::{s, Array, Axis, IxDyn};
|
||||
|
||||
use crate::{ops, Bbox, DynConf, MinOptMax, Options, OrtEngine, Rect, Ys};
|
||||
use crate::{ops, Bbox, DynConf, Mask, MinOptMax, Options, OrtEngine, Y};
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct YOLOPv2 {
|
||||
@ -36,17 +36,16 @@ impl YOLOPv2 {
|
||||
})
|
||||
}
|
||||
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
pub fn run(&mut self, xs: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let xs_ = ops::letterbox(xs, self.height() as u32, self.width() as u32, 114.0)?;
|
||||
let xs_ = ops::normalize(xs_, 0.0, 255.0);
|
||||
let ys = self.engine.run(&[xs_])?;
|
||||
let ys = self.postprocess(ys, xs)?;
|
||||
Ok(ys)
|
||||
self.postprocess(ys, xs)
|
||||
}
|
||||
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Ys>> {
|
||||
pub fn postprocess(&self, xs: Vec<Array<f32, IxDyn>>, xs0: &[DynamicImage]) -> Result<Vec<Y>> {
|
||||
let mut ys: Vec<Y> = Vec::new();
|
||||
let (xs_da, xs_ll, xs_det) = (&xs[0], &xs[1], &xs[2]);
|
||||
let mut ys: Vec<Ys> = Vec::new();
|
||||
for (idx, ((x_det, x_ll), x_da)) in xs_det
|
||||
.axis_iter(Axis(0))
|
||||
.zip(xs_ll.axis_iter(Axis(0)))
|
||||
@ -63,7 +62,7 @@ impl YOLOPv2 {
|
||||
);
|
||||
|
||||
// Vehicle
|
||||
let mut ys_bbox = Vec::new();
|
||||
let mut y_bboxes = Vec::new();
|
||||
for x in x_det.axis_iter(Axis(0)) {
|
||||
let bbox = x.slice(s![0..4]);
|
||||
let clss = x.slice(s![5..]).to_owned();
|
||||
@ -83,19 +82,15 @@ impl YOLOPv2 {
|
||||
let h = bbox[3] / ratio;
|
||||
let x = cx - w / 2.;
|
||||
let y = cy - h / 2.;
|
||||
ys_bbox.push(Bbox::new(
|
||||
Rect::from_xywh(
|
||||
x.max(0.0f32).min(image_width),
|
||||
y.max(0.0f32).min(image_height),
|
||||
w,
|
||||
h,
|
||||
),
|
||||
id,
|
||||
conf,
|
||||
None,
|
||||
));
|
||||
let x = x.max(0.0).min(image_width);
|
||||
let y = y.max(0.0).min(image_height);
|
||||
y_bboxes.push(
|
||||
Bbox::default()
|
||||
.with_xywh(x, y, w, h)
|
||||
.with_confidence(conf)
|
||||
.with_id(id as isize),
|
||||
);
|
||||
}
|
||||
Ys::non_max_suppression(&mut ys_bbox, self.iou);
|
||||
|
||||
// Drivable area
|
||||
let x_da_0 = x_da.slice(s![0, .., ..]).to_owned();
|
||||
@ -119,8 +114,21 @@ impl YOLOPv2 {
|
||||
image_height,
|
||||
);
|
||||
let mask_da = mask_da.into_luma8();
|
||||
let mut y_masks =
|
||||
ops::get_masks_from_image(mask_da, 1, 0, Some("Drivable area".to_string()));
|
||||
let mut y_masks: Vec<Mask> = Vec::new();
|
||||
let contours: Vec<imageproc::contours::Contour<i32>> =
|
||||
imageproc::contours::find_contours_with_threshold(&mask_da, 1);
|
||||
contours.iter().for_each(|contour| {
|
||||
if contour.border_type == imageproc::contours::BorderType::Outer
|
||||
&& contour.points.len() > 2
|
||||
{
|
||||
y_masks.push(
|
||||
Mask::default()
|
||||
.with_id(0)
|
||||
.with_points_imageproc(&contour.points)
|
||||
.with_name(Some("Drivable area".to_string())),
|
||||
);
|
||||
}
|
||||
});
|
||||
|
||||
// Lane line
|
||||
let x_ll = x_ll
|
||||
@ -141,9 +149,30 @@ impl YOLOPv2 {
|
||||
image_height,
|
||||
);
|
||||
let mask_ll = mask_ll.into_luma8();
|
||||
let masks = ops::get_masks_from_image(mask_ll, 1, 5, Some("Lane line".to_string()));
|
||||
let contours: Vec<imageproc::contours::Contour<i32>> =
|
||||
imageproc::contours::find_contours_with_threshold(&mask_ll, 1);
|
||||
let mut masks: Vec<Mask> = Vec::new();
|
||||
contours.iter().for_each(|contour| {
|
||||
if contour.border_type == imageproc::contours::BorderType::Outer
|
||||
&& contour.points.len() > 2
|
||||
{
|
||||
masks.push(
|
||||
Mask::default()
|
||||
.with_id(1)
|
||||
.with_points_imageproc(&contour.points)
|
||||
.with_name(Some("Lane line".to_string())),
|
||||
);
|
||||
}
|
||||
});
|
||||
y_masks.extend(masks);
|
||||
ys.push(Ys::default().with_bboxes(&ys_bbox).with_masks(&y_masks));
|
||||
|
||||
// save
|
||||
ys.push(
|
||||
Y::default()
|
||||
.with_bboxes(&y_bboxes)
|
||||
.with_masks(&y_masks)
|
||||
.apply_bboxes_nms(self.iou),
|
||||
);
|
||||
}
|
||||
Ok(ys)
|
||||
}
|
||||
|
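This commit removes the per-model `non_max_suppression` helper from `yolo.rs` and has the models call `apply_bboxes_nms(self.iou)` on `Y` instead. The implementation of `apply_bboxes_nms` is not part of this section, so the following is only a sketch of greedy NMS in the style of the removed helper. The `Det` struct and `nms` function are hypothetical stand-ins for `Bbox` and the library method.

```rust
/// Hypothetical axis-aligned detection, standing in for the library's `Bbox`.
#[derive(Debug, Clone, PartialEq)]
pub struct Det {
    pub xmin: f32,
    pub ymin: f32,
    pub xmax: f32,
    pub ymax: f32,
    pub confidence: f32,
}

impl Det {
    fn area(&self) -> f32 {
        (self.xmax - self.xmin).max(0.0) * (self.ymax - self.ymin).max(0.0)
    }

    /// Intersection over union with another box; 0.0 when both are empty.
    fn iou(&self, other: &Det) -> f32 {
        let iw = (self.xmax.min(other.xmax) - self.xmin.max(other.xmin)).max(0.0);
        let ih = (self.ymax.min(other.ymax) - self.ymin.max(other.ymin)).max(0.0);
        let inter = iw * ih;
        let union = self.area() + other.area() - inter;
        if union > 0.0 {
            inter / union
        } else {
            0.0
        }
    }
}

/// Greedy NMS: sort by descending confidence, then keep a box only if its
/// IoU with every already-kept box is at most `iou_threshold`.
pub fn nms(dets: &mut Vec<Det>, iou_threshold: f32) {
    dets.sort_by(|a, b| b.confidence.total_cmp(&a.confidence));
    let mut kept = 0;
    for i in 0..dets.len() {
        let suppressed = (0..kept).any(|k| dets[k].iou(&dets[i]) > iou_threshold);
        if !suppressed {
            // Compact survivors to the front of the vector, in place.
            dets.swap(kept, i);
            kept += 1;
        }
    }
    dets.truncate(kept);
}
```

The sketch uses `total_cmp` rather than `partial_cmp(..).unwrap()`, so a NaN confidence cannot cause a panic during sorting.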
121
src/utils/coco.rs
Normal file
@ -0,0 +1,121 @@
+pub const SKELETONS_16: [(usize, usize); 16] = [
+    (0, 1),
+    (0, 2),
+    (1, 3),
+    (2, 4),
+    (5, 6),
+    (5, 11),
+    (6, 12),
+    (11, 12),
+    (5, 7),
+    (6, 8),
+    (7, 9),
+    (8, 10),
+    (11, 13),
+    (12, 14),
+    (13, 15),
+    (14, 16),
+];
+
+pub const KEYPOINTS_NAMES_17: [&str; 17] = [
+    "nose",
+    "left_eye",
+    "right_eye",
+    "left_ear",
+    "right_ear",
+    "left_shoulder",
+    "right_shoulder",
+    "left_elbow",
+    "right_elbow",
+    "left_wrist",
+    "right_wrist",
+    "left_hip",
+    "right_hip",
+    "left_knee",
+    "right_knee",
+    "left_ankle",
+    "right_ankle",
+];
+
+pub const NAMES_80: [&str; 80] = [
+    "person",
+    "bicycle",
+    "car",
+    "motorcycle",
+    "airplane",
+    "bus",
+    "train",
+    "truck",
+    "boat",
+    "traffic light",
+    "fire hydrant",
+    "stop sign",
+    "parking meter",
+    "bench",
+    "bird",
+    "cat",
+    "dog",
+    "horse",
+    "sheep",
+    "cow",
+    "elephant",
+    "bear",
+    "zebra",
+    "giraffe",
+    "backpack",
+    "umbrella",
+    "handbag",
+    "tie",
+    "suitcase",
+    "frisbee",
+    "skis",
+    "snowboard",
+    "sports ball",
+    "kite",
+    "baseball bat",
+    "baseball glove",
+    "skateboard",
+    "surfboard",
+    "tennis racket",
+    "bottle",
+    "wine glass",
+    "cup",
+    "fork",
+    "knife",
+    "spoon",
+    "bowl",
+    "banana",
+    "apple",
+    "sandwich",
+    "orange",
+    "broccoli",
+    "carrot",
+    "hot dog",
+    "pizza",
+    "donut",
+    "cake",
+    "chair",
+    "couch",
+    "potted plant",
+    "bed",
+    "dining table",
+    "toilet",
+    "tv",
+    "laptop",
+    "mouse",
+    "remote",
+    "keyboard",
+    "cell phone",
+    "microwave",
+    "oven",
+    "toaster",
+    "sink",
+    "refrigerator",
+    "book",
+    "clock",
+    "vase",
+    "scissors",
+    "teddy bear",
+    "hair drier",
+    "toothbrush",
+];