This demo shows how to use BLIP for conditional (prompt-guided) or unconditional image captioning.
## Quick Start

```bash
cargo run -r --example blip
```
Or you can set it up manually:
1. Download the BLIP ONNX models

   - blip-visual-base
   - blip-textual-base
2. Specify the ONNX model paths in `main.rs`

   ```rust
   // visual
   let options_visual = Options::default()
       .with_model("VISUAL_MODEL") // <= modify this
       .with_profile(false);

   // textual
   let options_textual = Options::default()
       .with_model("TEXTUAL_MODEL") // <= modify this
       .with_profile(false);
   ```
3. Then, run

   ```bash
   cargo run -r --example blip
   ```
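For orientation, here is a minimal sketch of what the example does end to end. The `Blip::new` constructor, `DataLoader::try_read` helper, and `caption` method signature are assumptions for illustration; check `examples/blip/main.rs` for the crate's actual API.

```rust
// A minimal sketch (not the exact example code) of the end-to-end flow.
use usls::{models::Blip, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Point both branches at the ONNX files downloaded in step 1.
    let options_visual = Options::default()
        .with_model("blip-visual-base.onnx")
        .with_profile(false);
    let options_textual = Options::default()
        .with_model("blip-textual-base.onnx")
        .with_profile(false);

    // Hypothetical constructor combining the visual and textual branches.
    let mut model = Blip::new(options_visual, options_textual)?;

    // Hypothetical helper for loading an image from disk.
    let image = DataLoader::try_read("./assets/bus.jpg")?;

    // Unconditional captioning passes no prompt; conditional captioning
    // seeds generation with a text prompt (hypothetical signature).
    let unconditional = model.caption(&image, None)?;
    let conditional = model.caption(&image, Some("a picture of"))?;
    println!("{unconditional:?}\n{conditional:?}");
    Ok(())
}
```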
## Results

```text
[Unconditional image captioning]: a group of people walking around a bus
[Conditional image captioning]: three man walking in front of a bus
```
## TODO

- Text decoding with top-p sampling (see the sketch below)
- VQA
- Retrieval
- TensorRT support for the textual model
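The first TODO item refers to a standard decoding technique. For reference, here is a self-contained sketch of top-p (nucleus) sampling over a token probability distribution; it uses the `rand` crate and is not wired into usls.

```rust
use rand::Rng;

/// Sample a token index from `probs` with top-p (nucleus) sampling:
/// keep the smallest set of highest-probability tokens whose cumulative
/// probability exceeds `p`, renormalize, and sample from that set.
/// Assumes `probs` is a valid, non-empty probability distribution.
fn top_p_sample(probs: &[f32], p: f32) -> usize {
    // Sort token indices by probability, descending.
    let mut indexed: Vec<(usize, f32)> = probs.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());

    // Truncate to the nucleus: the smallest prefix whose cumulative
    // probability exceeds p.
    let mut cumulative = 0.0;
    let mut nucleus_len = indexed.len();
    for (i, &(_, prob)) in indexed.iter().enumerate() {
        cumulative += prob;
        if cumulative > p {
            nucleus_len = i + 1;
            break;
        }
    }
    let nucleus = &indexed[..nucleus_len];

    // Renormalize over the nucleus and draw one sample.
    let total: f32 = nucleus.iter().map(|&(_, prob)| prob).sum();
    let mut r = rand::thread_rng().gen_range(0.0..total);
    for &(idx, prob) in nucleus {
        if r < prob {
            return idx;
        }
        r -= prob;
    }
    nucleus.last().unwrap().0
}
```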