# BiRefNet Background Removal Service

GPU-accelerated background removal as an HTTP API. Two pipelines:

- **Auto**: [BiRefNet](https://huggingface.co/ZhengPeng7/BiRefNet) /
  [RMBG-2.0](https://huggingface.co/briaai/RMBG-2.0) salient-object matting.
- **Prompt**: [GroundingDINO](https://huggingface.co/IDEA-Research/grounding-dino-tiny) +
  [SAM](https://huggingface.co/facebook/sam-vit-base): segment whatever a text
  prompt describes.

Served with [LitServe](https://github.com/Lightning-AI/LitServe), packaged for
the NVIDIA container runtime.

## Requirements

- NVIDIA GPU + driver, Docker, and the `nvidia` container runtime
- ~5 GB free disk for model weights (downloaded on first use, cached in a volume)

## Quick start

```bash
make build   # build the Docker image
make run     # start the service on :8000 (GPU)
make logs    # watch startup; the first run downloads model weights
make test    # send test.jpg, save output.png
```

`make test` waits for `/health` before sending, so the first call may block
while a model downloads and loads.

### Web UI

Open **http://localhost:8000/** for a single-page test app (handy over SSH):

- **Auto remove**: pick a model variant + resolution.
- **Prompt segment**: type what to keep (e.g. `the dog`), tune the
  GroundingDINO box / text thresholds.

Both modes support a transparency checkerboard preview, click-to-zoom lightbox,
optional crop-to-subject, and download.

#### Keyboard shortcuts

The UI is fully keyboard-drivable. Shortcuts are ignored while typing in a
field and while Ctrl/Cmd/Alt is held.

| Key       | Action                                    |
|-----------|-------------------------------------------|
| `B`       | Toggle the controls sidebar               |
| `U`       | Open the file picker to upload an image   |
| `I` / `O` | Show the input / output image             |
| `F` / `Z` | Open the zoom view for the visible image  |
| `S`       | Save (download PNG), once a result exists |

In the zoom view:

| Key                       | Action                     |
|---------------------------|----------------------------|
| `F` / `Z` / `Esc`         | Close the zoom view        |
| `+` / `-`                 | Zoom in / out (1×–8×)      |
| `0`                       | Reset zoom & pan           |
| Arrows or `H` `J` `K` `L` | Pan (while zoomed past 1×) |

## API

### `POST /predict`: automatic background removal

```jsonc
{
  "image": "<base64 image bytes>",  // required
  "model": "HR",                    // general|HR|portrait|matting|lite|rmbg2
  "resolution": 2048,               // inference resolution (×32)
  "background": "alpha",            // alpha|white|black|gray|green|blue|red
  "mask_blur": 0,                   // Gaussian blur radius on mask edges
  "crop": false,                    // crop to the foreground bounding box
  "crop_margin": 0.0,               // crop margin in inches (uses image DPI)
  "return_mask": false              // include the raw mask in the response
}
```

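A minimal Python sketch of calling `/predict` with only the standard library. The address `localhost:8000` comes from the Quick start; the helper names `build_predict_payload` and `remove_background` are illustrative and not part of this repository, and the sketch assumes the server accepts the body above as `application/json`.

```python
import base64
import json
import urllib.request


def build_predict_payload(image_bytes: bytes, **options) -> dict:
    """Build the JSON body for /predict; `options` are the optional fields above."""
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    payload.update(options)
    return payload


def remove_background(image_path: str, output_path: str,
                      url: str = "http://localhost:8000/predict",
                      timeout: float = 120.0, **options) -> dict:
    """POST an image to /predict and write the returned PNG to output_path."""
    with open(image_path, "rb") as f:
        payload = build_predict_payload(f.read(), **options)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # The response carries the result as a base64-encoded PNG in "image".
    with open(output_path, "wb") as f:
        f.write(base64.b64decode(result["image"]))
    return result
```

With the service running, `remove_background("photo.jpg", "cut.png", model="HR", resolution=2048, crop=True)` would mirror the first CLI example further down.
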
### `POST /segment`: prompt-conditioned segmentation

```jsonc
{
  "image": "<base64 image bytes>",  // required
  "prompt": "the dog",              // required: object(s) to keep
  "box_threshold": 0.3,             // GroundingDINO detection threshold
  "text_threshold": 0.25,
  "background": "alpha",
  "mask_blur": 0,
  "crop": false,
  "crop_margin": 0.0
}
```

Response (both): `image` (base64 PNG), `format`, `width`, `height`, plus
`model`/`resolution` (`/predict`) or `detections`/`prompt` (`/segment`).

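The request and response shapes above can be handled roughly as follows. This is a sketch: `build_segment_payload` and `save_result` are hypothetical helper names, and the structure of `detections` is not documented here, so the code leaves it untouched.

```python
import base64
import json


def build_segment_payload(image_bytes: bytes, prompt: str, **options) -> str:
    """Serialize a /segment request body; `options` are the optional fields above."""
    body = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "prompt": prompt,
    }
    body.update(options)
    return json.dumps(body)


def save_result(result: dict, output_path: str) -> tuple:
    """Decode the base64 PNG from a response dict and return (width, height)."""
    with open(output_path, "wb") as f:
        f.write(base64.b64decode(result["image"]))
    return result["width"], result["height"]
```

`save_result` works for both endpoints because they share the `image`, `width`, and `height` fields.
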
`GET /health` returns 200 when the service is ready.

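Because the first start may spend a while downloading weights, a client can poll `/health` before sending work. The sketch below is one way to do that in Python; the function names, the 600-second ceiling, and the 2-second interval are illustrative assumptions, not values taken from this project.

```python
import time
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:8000/health", timeout: float = 5.0) -> bool:
    """Return True when GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until_ready(probe=is_healthy, max_wait: float = 600.0,
                     interval: float = 2.0) -> bool:
    """Poll `probe` until it returns True or max_wait seconds have elapsed."""
    deadline = time.monotonic() + max_wait
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Passing the probe as a parameter keeps the polling loop testable without a running server.
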
## CLI

```bash
python3 scripts/client.py --input photo.jpg --output cut.png --model HR --resolution 2048 --crop
python3 scripts/client.py --input photo.jpg --output dog.png --prompt "the dog" --crop
```

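For a folder of images, the client can be driven from a small wrapper. This sketch uses only the flags shown above (`--input`, `--output`, `--prompt`, `--crop`); `client_command` and `run_batch` are hypothetical names, and `check=True` assumes the client exits non-zero on failure, which is not confirmed here.

```python
import subprocess
import sys
from pathlib import Path


def client_command(src, dst, prompt=None):
    """Build the argv for one scripts/client.py call using the documented flags."""
    cmd = [sys.executable, "scripts/client.py",
           "--input", str(src), "--output", str(dst)]
    if prompt is not None:
        cmd += ["--prompt", prompt]  # switches to the Prompt pipeline
    cmd.append("--crop")
    return cmd


def run_batch(input_dir, output_dir, prompt=None):
    """Run the client once per *.jpg in input_dir, writing PNGs to output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(input_dir).glob("*.jpg")):
        dst = out / (src.stem + ".png")
        subprocess.run(client_command(src, dst, prompt), check=True)
```

Run it from the repository root so that the relative path `scripts/client.py` resolves.
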
## Configuration (environment variables)

| Variable              | Default                             | Purpose                       |
|-----------------------|-------------------------------------|-------------------------------|
| `PORT`                | `8000`                              | HTTP port                     |
| `BIREFNET_MODEL`      | `general`                           | Default Auto variant          |
| `BIREFNET_RESOLUTION` | `1024`                              | Default Auto resolution       |
| `DINO_MODEL`          | `IDEA-Research/grounding-dino-tiny` | GroundingDINO checkpoint      |
| `SAM_MODEL`           | `facebook/sam-vit-large`            | SAM checkpoint                |
| `REQUEST_TIMEOUT`     | `120`                               | Per-request timeout (seconds) |

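As an illustration of how these variables and defaults fit together, the sketch below reads them into a typed settings object. It is not necessarily how `server.py` does it: the `Settings` class and `load_settings` function are hypothetical, and treating `REQUEST_TIMEOUT` as an integer is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    port: int
    birefnet_model: str
    birefnet_resolution: int
    dino_model: str
    sam_model: str
    request_timeout: int


def load_settings(env=None) -> Settings:
    """Read the variables from the table, falling back to the listed defaults."""
    env = os.environ if env is None else env
    return Settings(
        port=int(env.get("PORT", "8000")),
        birefnet_model=env.get("BIREFNET_MODEL", "general"),
        birefnet_resolution=int(env.get("BIREFNET_RESOLUTION", "1024")),
        dino_model=env.get("DINO_MODEL", "IDEA-Research/grounding-dino-tiny"),
        sam_model=env.get("SAM_MODEL", "facebook/sam-vit-large"),
        request_timeout=int(env.get("REQUEST_TIMEOUT", "120")),
    )
```

Accepting a mapping instead of always reading `os.environ` makes the defaults easy to check in isolation.
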
## Local development (no Docker)

Requires a local CUDA-capable PyTorch environment.

```bash
make dev   # uv sync + run the server locally
```

## Layout

```
src/rmbg_as_a_service/model.py            BiRefNet / RMBG-2.0 wrapper + compositing
src/rmbg_as_a_service/prompt_segment.py   GroundingDINO + SAM pipeline
src/rmbg_as_a_service/server.py           LitServe /predict + /segment + web UI
src/rmbg_as_a_service/static/             web UI (index.html + styles.css)
scripts/client.py                         stdlib-only test client
Dockerfile / compose.yml                  CUDA image + nvidia runtime
Makefile                                  build / run / test shortcuts
```