rmbg/README.md
Michael Pilosov 96d16fc654 mvp
2026-05-16 14:32:26 -06:00

# BiRefNet Background Removal Service
GPU-accelerated background removal exposed as an HTTP API. Uses
[BiRefNet](https://huggingface.co/ZhengPeng7/BiRefNet) for matting, served with
[LitServe](https://github.com/Lightning-AI/LitServe), packaged for the
NVIDIA container runtime.
## Requirements
- NVIDIA GPU + driver, Docker, and the `nvidia` container runtime
- ~2 GB of free disk space for the model weights (downloaded on first run)
## Quick start
```bash
make build # build the Docker image
make run # start the service on :8000 (GPU)
make logs # watch startup — first run downloads BiRefNet weights
make test # send test.jpg, save output.png
```
`make test` waits for the service's `/health` endpoint to report ready before
sending the request, so the first call may block while the model weights
download and load.
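
The wait-for-health behaviour can be approximated in a few lines of standard-library Python. This is a sketch under assumptions, not the logic `make test` actually runs: `is_ready` and `wait_until_ready` are hypothetical names, and the 300-second deadline and 2-second polling interval are illustrative values only.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url, timeout=2.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(base_url, deadline_s=300.0, interval_s=2.0, probe=is_ready):
    """Poll the probe until it succeeds or the deadline passes."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        if probe(base_url):
            return True
        time.sleep(interval_s)
    return False


if __name__ == "__main__":
    ok = wait_until_ready("http://localhost:8000")
    print("service ready" if ok else "timed out waiting for /health")
```

A generous deadline matters on the first run, when the weights still have to be downloaded before the model can load.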
### Web UI
A minimal test page is served at the service root. Open
**http://localhost:8000/** in a browser, drop in an image, and preview the
transparent-background result (handy when working over SSH). The page calls
the same `/predict` endpoint as the API.
### Useful variations
```bash
make test BG=white # composite onto a white background
make test INPUT=photo.jpg OUTPUT=cut.png
make test-mask # also save the raw alpha mask (mask.png)
make help # list all targets
```
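`BG=white` asks for the cut-out to be composited onto a solid color instead of keeping transparency. The service's own compositing code is not shown in this README; the sketch below only illustrates standard alpha blending for a single 8-bit pixel. `composite_pixel` and `BACKGROUNDS` are hypothetical names, and only the unambiguous `white` and `black` values are listed, since the exact RGB values the service uses for the other colors are not documented here.

```python
# Illustrative only: the service may use different values or a different method.
BACKGROUNDS = {
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def composite_pixel(rgb, alpha, background):
    """Blend one 8-bit RGB pixel over a solid background using an 8-bit alpha."""
    a = alpha / 255.0
    return tuple(round(c * a + b * (1.0 - a)) for c, b in zip(rgb, background))


if __name__ == "__main__":
    # A half-transparent dark pixel over white becomes mid-gray.
    print(composite_pixel((0, 0, 0), 128, BACKGROUNDS["white"]))
```

Fully opaque pixels (alpha 255) keep their color, and fully transparent pixels (alpha 0) take the background color.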
## API
`POST /predict`
```jsonc
{
  "image": "<base64 image bytes>",  // required
  "background": "alpha",            // alpha|white|black|gray|green|blue|red
  "mask_blur": 0,                   // Gaussian blur radius on mask edges
  "return_mask": false              // include the raw mask in the response
}
```
Response:
```jsonc
{
  "image": "<base64 PNG>",
  "format": "png",
  "width": 3637,
  "height": 3637,
  "mask": "<base64 PNG>"  // only when return_mask=true
}
```
`GET /health` returns 200 when the service is ready.
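
A minimal standard-library client for the request and response shapes above might look like the following. It is a sketch, not the contents of `scripts/client.py`: the function names are hypothetical, the base URL assumes the default port, and the 120-second client timeout simply mirrors the default `REQUEST_TIMEOUT`.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes, background="alpha", mask_blur=0, return_mask=False):
    """Build the JSON body for POST /predict from raw image bytes."""
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "background": background,
        "mask_blur": mask_blur,
        "return_mask": return_mask,
    }


def decode_response(body):
    """Return (png_bytes, mask_bytes_or_None) from a parsed /predict response."""
    image = base64.b64decode(body["image"])
    mask = base64.b64decode(body["mask"]) if "mask" in body else None
    return image, mask


def remove_background(image_bytes, base_url="http://localhost:8000", timeout=120, **options):
    """POST the image to /predict and decode the result."""
    data = json.dumps(build_payload(image_bytes, **options)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/predict",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return decode_response(json.load(resp))


if __name__ == "__main__":
    with open("test.jpg", "rb") as f:
        png, mask = remove_background(f.read(), background="white", return_mask=True)
    with open("output.png", "wb") as f:
        f.write(png)
```

Keeping payload construction and response decoding separate from the network call makes both easy to check without a running service.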
## Configuration (environment variables)
| Variable              | Default               | Purpose                          |
|-----------------------|-----------------------|----------------------------------|
| `PORT`                | `8000`                | HTTP port                        |
| `BIREFNET_MODEL`      | `ZhengPeng7/BiRefNet` | HuggingFace repo for the weights |
| `BIREFNET_RESOLUTION` | `1024`                | Inference resolution             |
| `REQUEST_TIMEOUT`     | `120`                 | Per-request timeout (seconds)    |
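
For reference, reading these variables with their documented defaults could be sketched as below. How `server.py` actually parses its environment is not shown in this README, so `ServiceConfig` and `load_config` are hypothetical names and the integer parsing is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    port: int = 8000
    model: str = "ZhengPeng7/BiRefNet"
    resolution: int = 1024
    request_timeout: int = 120  # seconds


def load_config(env=None):
    """Build a ServiceConfig from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return ServiceConfig(
        port=int(env.get("PORT", 8000)),
        model=env.get("BIREFNET_MODEL", "ZhengPeng7/BiRefNet"),
        resolution=int(env.get("BIREFNET_RESOLUTION", 1024)),
        request_timeout=int(env.get("REQUEST_TIMEOUT", 120)),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing the mapping explicitly keeps the parsing easy to exercise without touching the real process environment.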
## Local development (no Docker)
Requires a local CUDA-capable PyTorch environment.
```bash
make dev # uv sync + run the server locally
```
## Layout
```
src/birefnet_service/model.py     BiRefNet wrapper (load + inference)
src/birefnet_service/server.py    LitServe API + web UI route
src/birefnet_service/static/      web UI (index.html)
scripts/client.py                 stdlib-only test client
Dockerfile / docker-compose.yml   CUDA image + nvidia runtime
Makefile                          build / run / test shortcuts
```