This demo shows how to use BLIP to do conditional or unconditional image captioning.

Quick Start

cargo run -r --example blip

Or you can do it manually:

1. Download the BLIP ONNX models

blip-visual-base
blip-textual-base

2. Specify the ONNX model paths in main.rs

    // visual
    let options_visual = Options::default()
        .with_model("VISUAL_MODEL")   // <= modify this
        .with_profile(false);

    // textual
    let options_textual = Options::default()
        .with_model("TEXTUAL_MODEL")  // <= modify this
        .with_profile(false);

3. Then run

cargo run -r --example blip

Results

[Unconditional image captioning]: a group of people walking around a bus
[Conditional image captioning]: three man walking in front of a bus

TODO

  • Text decoding with top-p sampling
  • VQA
  • Retrieval
  • TensorRT support for the textual model
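
The top-p (nucleus) sampling item above can be sketched as follows. This is a minimal, self-contained illustration, not code from this repository; the function name and shapes are hypothetical, and it assumes the decoder's softmax probabilities are already available as a slice:

```rust
// Minimal sketch of top-p (nucleus) filtering: keep the smallest set of
// tokens, in descending probability order, whose cumulative mass >= p.
// Names are illustrative and not part of the usls API.
fn top_p_filter(probs: &[f32], p: f32) -> Vec<usize> {
    // Sort token indices by probability, descending.
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());

    // Accumulate probability mass until it reaches p.
    let mut kept = Vec::new();
    let mut cum = 0.0_f32;
    for &i in &idx {
        kept.push(i);
        cum += probs[i];
        if cum >= p {
            break;
        }
    }
    kept
}

fn main() {
    let probs = [0.5, 0.3, 0.1, 0.05, 0.05];
    // Tokens 0, 1, 2 together cover 0.9 of the mass.
    let kept = top_p_filter(&probs, 0.9);
    println!("{:?}", kept); // [0, 1, 2]
}
```

A real decoder would then renormalize the kept probabilities and sample one token from them at each step, instead of always taking the argmax.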