This demo shows how to use BLIP to do conditional or unconditional image captioning.

Quick Start

cargo run -r --example blip

Or you can do it manually:

1. Download the BLIP ONNX models

blip-visual-base
blip-textual-base

2. Specify the ONNX model paths in main.rs

    // visual
    let options_visual = Options::default()
        .with_model("VISUAL_MODEL")   // <= modify this
        .with_profile(false);

    // textual
    let options_textual = Options::default()
        .with_model("TEXTUAL_MODEL")  // <= modify this
        .with_profile(false);

3. Then run

cargo run -r --example blip

Results

[Unconditional image captioning]: a group of people walking around a bus
[Conditional image captioning]: three man walking in front of a bus

TODO

  • Text decoding with top-p sampling
  • VQA
  • Retrieval
  • TensorRT support for the textual model
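
The top-p (nucleus) sampling item above can be sketched as follows. This is a minimal, self-contained illustration, not code from this repository; the function name and shapes are hypothetical, and it assumes the decoder's softmax probabilities are already available as a slice:

```rust
// Minimal sketch of top-p (nucleus) filtering: keep the smallest set of
// tokens, in descending probability order, whose cumulative mass >= p.
// Names are illustrative and not part of the usls API.
fn top_p_filter(probs: &[f32], p: f32) -> Vec<usize> {
    // Sort token indices by probability, descending.
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());

    // Accumulate probability mass until it reaches p.
    let mut kept = Vec::new();
    let mut cum = 0.0_f32;
    for &i in &idx {
        kept.push(i);
        cum += probs[i];
        if cum >= p {
            break;
        }
    }
    kept
}

fn main() {
    let probs = [0.5, 0.3, 0.1, 0.05, 0.05];
    // Tokens 0, 1, 2 together cover 0.9 of the mass.
    let kept = top_p_filter(&probs, 0.9);
    println!("{:?}", kept); // [0, 1, 2]
}
```

A real decoder would then renormalize the kept probabilities and sample one token from them at each step, instead of always taking the argmax.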