<p align="center">
  <h2 align="center">usls</h2>
</p>

<p align="center">
  <a href="https://docs.rs/usls"><strong>Documentation</strong></a>
  <br>
  <br>
  <a href='https://github.com/microsoft/onnxruntime/releases'>
    <img src='https://img.shields.io/badge/ONNXRuntime-v1.19.x-239DFF?style=for-the-badge&logo=onnx' alt='ONNXRuntime Release Page'>
  </a>
  <a href='https://developer.nvidia.com/cuda-toolkit-archive'>
    <img src='https://img.shields.io/badge/CUDA-12.x-76B900?style=for-the-badge&logo=nvidia' alt='CUDA Toolkit Page'>
  </a>
  <a href='https://developer.nvidia.com/tensorrt'>
    <img src='https://img.shields.io/badge/TensorRT-10.x.x.x-76B900?style=for-the-badge&logo=nvidia' alt='TensorRT Page'>
  </a>
</p>

<p align="center">
  <a href='https://crates.io/crates/usls'>
    <img src='https://img.shields.io/crates/v/usls.svg?style=for-the-badge&logo=rust' alt='Crates Page'>
  </a>
  <!-- Documentation Badge -->
  <!-- <a href="https://docs.rs/usls">
    <img src='https://img.shields.io/badge/Documents-usls-000000?style=for-the-badge&logo=docs.rs' alt='Documentation'>
  </a> -->
  <!-- Downloads Badge -->
  <a href="">
    <img alt="Crates.io Total Downloads" src="https://img.shields.io/crates/d/usls?style=for-the-badge&color=3ECC5F">
  </a>
</p>

**`usls`** is a Rust library integrated with **ONNXRuntime** that provides a collection of state-of-the-art models for **Computer Vision** and **Vision-Language** tasks, including:

- **YOLO Models**: [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv6](https://github.com/meituan/YOLOv6), [YOLOv7](https://github.com/WongKinYiu/yolov7), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [YOLOv10](https://github.com/THU-MIG/yolov10), [YOLOv11](https://github.com/ultralytics/ultralytics)
- **SAM Models**: [SAM](https://github.com/facebookresearch/segment-anything), [SAM2](https://github.com/facebookresearch/segment-anything-2), [MobileSAM](https://github.com/ChaoningZhang/MobileSAM), [EdgeSAM](https://github.com/chongzhou96/EdgeSAM), [SAM-HQ](https://github.com/SysCV/sam-hq), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM)
- **Vision Models**: [RTDETR](https://arxiv.org/abs/2304.08069), [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo), [DB](https://arxiv.org/abs/1911.08947), [SVTR](https://arxiv.org/abs/2205.00159), [Depth-Anything-v1-v2](https://github.com/LiheYoung/Depth-Anything), [DINOv2](https://github.com/facebookresearch/dinov2), [MODNet](https://github.com/ZHKKKe/MODNet), [Sapiens](https://arxiv.org/abs/2408.12569)
- **Vision-Language Models**: [CLIP](https://github.com/openai/CLIP), [BLIP](https://arxiv.org/abs/2201.12086), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [Florence2](https://arxiv.org/abs/2311.06242)

<details>
<summary>Click to expand Supported Models</summary>

## Supported Models

| Model | Task / Type | Example | CUDA f32 | CUDA f16 | TensorRT f32 | TensorRT f16 |
| ----- | ----------- | ------- | -------- | -------- | ------------ | ------------ |
| [YOLOv5](https://github.com/ultralytics/yolov5) | Classification<br>Object Detection<br>Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv6](https://github.com/meituan/YOLOv6) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv7](https://github.com/WongKinYiu/yolov7) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv8](https://github.com/ultralytics/ultralytics) | Object Detection<br>Instance Segmentation<br>Classification<br>Oriented Object Detection<br>Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv11](https://github.com/ultralytics/ultralytics) | Object Detection<br>Instance Segmentation<br>Classification<br>Oriented Object Detection<br>Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [RTDETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [SAM](https://github.com/facebookresearch/segment-anything) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | | |
| [SAM2](https://github.com/facebookresearch/segment-anything-2) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | | |
| [MobileSAM](https://github.com/ChaoningZhang/MobileSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | | |
| [EdgeSAM](https://github.com/chongzhou96/EdgeSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | | |
| [SAM-HQ](https://github.com/SysCV/sam-hq) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | | |
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ |
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ Visual<br>❌ Textual | ✅ Visual<br>❌ Textual |
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ Visual<br>❌ Textual | ✅ Visual<br>❌ Textual |
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | ✅ | ✅ |
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ |
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | ❌ | ❌ |
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ |
| [Depth-Anything](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | ✅ | ✅ | ❌ | ❌ |
| [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | ✅ | ✅ | ✅ | ✅ |
| [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) | Open-Set Detection With Language | [demo](examples/grounding-dino) | ✅ | ✅ | | |
| [Sapiens](https://github.com/facebookresearch/sapiens/tree/main) | Body Part Segmentation | [demo](examples/sapiens) | ✅ | ✅ | | |
| [Florence2](https://arxiv.org/abs/2311.06242) | A Variety of Vision Tasks | [demo](examples/florence2) | ✅ | ✅ | | |

</details>

## ⛳️ ONNXRuntime Linking

<details>
<summary>You have two options to link the ONNXRuntime library</summary>

- ### Option 1: Manual Linking

  - #### For detailed setup instructions, refer to the [ORT documentation](https://ort.pyke.io/setup/linking).

  - #### For Linux or macOS Users:
    - Download the ONNX Runtime package from the [Releases page](https://github.com/microsoft/onnxruntime/releases).
    - Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable:

      ```shell
      export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0
      ```
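
      On macOS the downloaded package ships a `.dylib` rather than a `.so`, so point `ORT_DYLIB_PATH` at that file instead. The exact file name depends on the release you download; it looks roughly like:

      ```shell
      # macOS: path and version are illustrative, use the dylib from your extracted package
      export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.1.19.0.dylib
      ```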

- ### Option 2: Automatic Download

  Just use `--features auto`:

  ```shell
  cargo run -r --example yolo --features auto
  ```
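
  If you depend on `usls` from your own crate rather than running the bundled examples, the same behavior can be requested from `Cargo.toml` (assuming the `auto` feature enabled by the command above is what your crate needs):

  ```toml
  [dependencies]
  usls = { git = "https://github.com/jamjamjon/usls", features = ["auto"] }
  ```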

</details>

## 🎈 Demo

```shell
cargo run -r --example yolo   # blip, clip, yolop, svtr, db, ...
```

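The other demos listed in the supported-models table live under `examples/` and can be run the same way, assuming each example target is named after its folder, e.g.:

```shell
cargo run -r --example clip
cargo run -r --example db
```
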
## 🥂 Integrate Into Your Own Project

- #### Add `usls` as a dependency to your project's `Cargo.toml`

  ```shell
  cargo add usls
  ```

  Or use a specific commit:

  ```toml
  [dependencies]
  usls = { git = "https://github.com/jamjamjon/usls", rev = "commit-sha" }
  ```

- #### Follow the pipeline (a minimal sketch follows this list)
  - Build the model with the provided `models` and `Options`
  - Load images, videos, and streams with `DataLoader`
  - Run inference
  - Retrieve inference results from `Vec<Y>`
  - Annotate inference results with `Annotator`
  - Display images and write them to video with `Viewer`
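
  Putting those steps together, here is a minimal sketch for a single local image. The model path, input shapes, confidence threshold, and the `YOLO-Minimal` save-out name are illustrative placeholders; the full streaming example is in the collapsible section below.

  ```rust
  use usls::{models::YOLO, Annotator, DataLoader, Options, Vision, YOLOTask, YOLOVersion};

  fn main() -> anyhow::Result<()> {
      // 1. Build the model from Options.
      let options = Options::new()
          .with_model("yolo/v8-m-dyn.onnx")?
          .with_yolo_version(YOLOVersion::V8)
          .with_yolo_task(YOLOTask::Detect)
          .with_ixx(0, 0, (1, 2, 4).into())
          .with_ixx(0, 2, (0, 640, 640).into())
          .with_ixx(0, 3, (0, 640, 640).into())
          .with_confs(&[0.2]);
      let mut model = YOLO::new(options)?;

      // 2. Load a local image with DataLoader.
      let dl = DataLoader::new("./assets/bus.jpg")?.build()?;

      // 3. Run inference, save the annotated output, and print how many boxes were found.
      let annotator = Annotator::new().with_saveout("YOLO-Minimal");
      for (xs, _) in dl {
          let ys = model.forward(&xs, false)?;
          annotator.annotate(&xs, &ys);
          for y in &ys {
              if let Some(bboxes) = y.bboxes() {
                  println!("detected {} box(es)", bboxes.len());
              }
          }
      }
      Ok(())
  }
  ```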

<br/>

<details>
<summary>example code</summary>

```rust
use usls::{models::YOLO, Annotator, DataLoader, Nms, Options, Viewer, Vision, YOLOTask, YOLOVersion};

fn main() -> anyhow::Result<()> {
    // Build model with Options
    let options = Options::new()
        .with_trt(0)
        .with_model("yolo/v8-m-dyn.onnx")?
        .with_yolo_version(YOLOVersion::V8) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
        .with_yolo_task(YOLOTask::Detect) // YOLOTask: Classify, Detect, Pose, Segment, Obb
        .with_ixx(0, 0, (1, 2, 4).into())
        .with_ixx(0, 2, (0, 640, 640).into())
        .with_ixx(0, 3, (0, 640, 640).into())
        .with_confs(&[0.2]);
    let mut model = YOLO::new(options)?;

    // Build DataLoader to load image(s), video, stream
    let dl = DataLoader::new(
        // "./assets/bus.jpg", // local image
        // "images/bus.jpg", // remote image
        // "../images-folder", // local images (from folder)
        // "../demo.mp4", // local video
        // "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4", // online video
        "rtsp://admin:kkasd1234@192.168.2.217:554/h264/ch1/", // stream
    )?
    .with_batch(2) // iterate with batch_size = 2
    .build()?;

    // Build annotator
    let annotator = Annotator::new()
        .with_bboxes_thickness(4)
        .with_saveout("YOLO-DataLoader");

    // Build viewer
    let mut viewer = Viewer::new().with_delay(10).with_scale(1.).resizable(true);

    // Run and annotate results
    for (xs, _) in dl {
        let ys = model.forward(&xs, false)?;
        // annotator.annotate(&xs, &ys);
        let images_plotted = annotator.plot(&xs, &ys, false)?;

        // show image
        viewer.imshow(&images_plotted)?;

        // check out window and key event
        if !viewer.is_open() || viewer.is_key_pressed(usls::Key::Escape) {
            break;
        }

        // write video
        viewer.write_batch(&images_plotted)?;

        // Retrieve inference results
        for y in ys {
            // bboxes
            if let Some(bboxes) = y.bboxes() {
                for bbox in bboxes {
                    println!(
                        "Bbox: {}, {}, {}, {}, {}, {}",
                        bbox.xmin(),
                        bbox.ymin(),
                        bbox.xmax(),
                        bbox.ymax(),
                        bbox.confidence(),
                        bbox.id(),
                    );
                }
            }
        }
    }

    // finish video write
    viewer.finish_write()?;

    Ok(())
}
```

</details>
<br/>

## 📌 License

This project is licensed under the terms described in [LICENSE](LICENSE).