5.3 KiB
BiRefNet Background Removal Service
GPU-accelerated background removal as an HTTP API. Two pipelines:
- Auto — BiRefNet / RMBG-2.0 salient-object matting.
- Prompt — GroundingDINO
- SAM: segment whatever a text prompt describes.
Served with LitServe, packaged for the NVIDIA container runtime.
Requirements
- NVIDIA GPU + driver, Docker, and the
nvidiacontainer runtime - ~5 GB free disk for model weights (downloaded on first use, cached in a volume)
Quick start
make build # build the Docker image
make run # start the service on :8000 (GPU)
make logs # watch startup — first run downloads model weights
make test # send test.jpg, save output.png
make test waits for /health before sending, so the first call may block
while a model downloads and loads.
Web UI
Open http://localhost:8000/ — a single-page test app (handy over SSH):
- Auto remove — pick a model variant + resolution.
- Prompt segment — type what to keep (e.g.
the dog), tune the GroundingDINO box / text thresholds.
Both modes support a transparency checkerboard preview, click-to-zoom lightbox, optional crop-to-subject, and download.
Keyboard shortcuts
The UI is fully keyboard-drivable. Shortcuts are ignored while typing in a field and while Ctrl/Cmd/Alt is held.
| Key | Action |
|---|---|
B |
Toggle the controls sidebar |
U |
Open the file picker to upload an image |
I / O |
Show the input / output image |
F / Z |
Open the zoom view for the visible image |
S |
Save (download PNG), once a result exists |
In the zoom view:
| Key | Action |
|---|---|
F / Z / Esc |
Close the zoom view |
+ / - |
Zoom in / out (1×–8×) |
0 |
Reset zoom & pan |
Arrows or H J K L |
Pan (while zoomed past 1×) |
API
POST /predict — automatic background removal
{
"image": "<base64 image bytes>", // required
"model": "HR", // general|HR|portrait|matting|lite|rmbg2
"resolution": 2048, // inference resolution (×32)
"background": "alpha", // alpha|white|black|gray|green|blue|red
"mask_blur": 0, // Gaussian blur radius on mask edges
"crop": false, // crop to the foreground bounding box
"crop_margin": 0.0, // crop margin in inches (uses image DPI)
"return_mask": false // include the raw mask in the response
}
POST /segment — prompt-conditioned segmentation
{
"image": "<base64 image bytes>", // required
"prompt": "the dog", // required — object(s) to keep
"box_threshold": 0.3, // GroundingDINO detection threshold
"text_threshold": 0.25,
"background": "alpha",
"mask_blur": 0,
"crop": false,
"crop_margin": 0.0
}
Response (both): image (base64 PNG), format, width, height, plus
model/resolution (/predict) or detections/prompt (/segment).
GET /health returns 200 when the service is ready.
CLI
python3 scripts/client.py --input photo.jpg --output cut.png --model HR --resolution 2048 --crop
python3 scripts/client.py --input photo.jpg --output dog.png --prompt "the dog" --crop
Configuration (environment variables)
| Variable | Default | Purpose |
|---|---|---|
PORT |
8000 |
HTTP port |
BIREFNET_MODEL |
general |
Default Auto variant |
BIREFNET_RESOLUTION |
1024 |
Default Auto resolution |
DINO_MODEL |
IDEA-Research/grounding-dino-tiny |
GroundingDINO checkpoint |
SAM_MODEL |
facebook/sam-vit-large |
SAM checkpoint |
REQUEST_TIMEOUT |
120 |
Per-request timeout (seconds) |
Local development (no Docker)
Requires a local CUDA-capable PyTorch environment.
make dev # uv sync + run the server locally
Layout
src/rmbg_as_a_service/model.py BiRefNet / RMBG-2.0 wrapper + compositing
src/rmbg_as_a_service/prompt_segment.py GroundingDINO + SAM pipeline
src/rmbg_as_a_service/server.py LitServe /predict + /segment + web UI
src/rmbg_as_a_service/static/ web UI (index.html + styles.css)
scripts/client.py stdlib-only test client
Dockerfile / compose.yml CUDA image + nvidia runtime
Makefile build / run / test shortcuts