usls/blip at f0fd4936e894ddbbcdc70d65df3e2b8961e275bd - usls

mirror of https://github.com/mii443/usls.git synced 2025-12-03 11:08:20 +00:00

README.md

This demo shows how to use BLIP for image captioning, either unconditional (generated from the image alone) or conditional (continuing a text prompt).

Quick Start

cargo run -r --example blip

Results

[Unconditional]: a group of people walking around a bus
[Conditional]: three man walking in front of a bus
Some(["three man walking in front of a bus"])
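The two captions differ only in how decoding starts. As a rough illustration, and not the usls or BLIP implementation, the sketch below replaces the real text decoder with a hypothetical lookup table, `next_token`, and decodes greedily in a hypothetical `caption` helper. Unconditional captioning starts from an empty sequence, while conditional captioning seeds the sequence with a text prefix that the decoder then continues. The real model also conditions every step on the image embeddings, which this toy omits.

```rust
// Toy stand-in for a caption decoder (hypothetical): maps the previous token
// to the next one. A real BLIP text decoder also attends to image embeddings.
fn next_token(prev: &str) -> &'static str {
    match prev {
        "<bos>" => "people",
        "people" | "walking" => "near",
        "near" => "a",
        "a" => "bus",
        "men" => "walking",
        _ => "<eos>",
    }
}

/// Greedy decoding. `prompt = None` is the unconditional case (start from the
/// begin-of-sequence marker); `Some(text)` seeds the caption with that prefix.
fn caption(prompt: Option<&str>, max_new_tokens: usize) -> String {
    let mut tokens: Vec<String> = prompt
        .map(|p| p.split_whitespace().map(str::to_string).collect())
        .unwrap_or_default();

    for _ in 0..max_new_tokens {
        let next = {
            // With no tokens yet, decoding starts from the "<bos>" marker.
            let prev = tokens.last().map(String::as_str).unwrap_or("<bos>");
            next_token(prev)
        };
        if next == "<eos>" {
            break;
        }
        tokens.push(next.to_string());
    }
    tokens.join(" ")
}

fn main() {
    // prints "[Unconditional]: people near a bus"
    println!("[Unconditional]: {}", caption(None, 16));
    // prints "[Conditional]: three men walking near a bus"
    println!("[Conditional]: {}", caption(Some("three men"), 16));
}
```

The prompt is kept as the start of the returned caption, which matches the conditional result above beginning with its prompt text.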

TODO

Multi-batch inference for image captioning
VQA (visual question answering)
Retrieval
TensorRT support for the textual model