# BiRefNet Background Removal Service

GPU-accelerated background removal exposed as an HTTP API. It uses
[BiRefNet](https://huggingface.co/ZhengPeng7/BiRefNet) for matting, is served with
[LitServe](https://github.com/Lightning-AI/LitServe), and is packaged for the
NVIDIA container runtime.

## Requirements

- NVIDIA GPU + driver, Docker, and the `nvidia` container runtime
- ~2 GB free disk for the model weights (downloaded on first run)

## Quick start

```bash
make build   # build the Docker image
make run     # start the service on :8000 (GPU)
make logs    # watch startup; the first run downloads the BiRefNet weights
make test    # send test.jpg, save output.png
```

`make test` waits for the service's `/health` endpoint to respond before sending
the request, so the first call may block while the model downloads and loads.

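The wait-for-ready behaviour described above can be sketched in Python using only the standard library. This is an illustration, not the code behind `make test`: the function names `is_ready` and `wait_until_ready`, the 600-second timeout, and the 2-second polling interval are all assumptions chosen for the example.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url: str = "http://localhost:8000") -> bool:
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until_ready(probe=is_ready, timeout: float = 600.0,
                     interval: float = 2.0) -> bool:
    """Call probe() until it returns True or `timeout` seconds have passed.

    The timeout and interval defaults are illustrative, not taken from the repo.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_until_ready()` before the first request avoids mistaking a slow first-run weight download for a broken service. The `probe` parameter exists only so the loop can be exercised without a running server.
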
### Web UI

A minimal test page is served at the service root. Open
**http://localhost:8000/** in a browser, drop in an image, and preview the
transparent-background result (handy when working over SSH). The page calls the
same `/predict` endpoint.

### Useful variations

```bash
make test BG=white                      # composite onto a white background
make test INPUT=photo.jpg OUTPUT=cut.png
make test-mask                          # also save the raw alpha mask (mask.png)
make help                               # list all targets
```

## API

`POST /predict`

```jsonc
{
  "image": "<base64 image bytes>",  // required
  "background": "alpha",            // alpha|white|black|gray|green|blue|red
  "mask_blur": 0,                   // Gaussian blur radius on mask edges
  "return_mask": false              // include the raw mask in the response
}
```

Response:

```jsonc
{
  "image": "<base64 PNG>",
  "format": "png",
  "width": 3637,
  "height": 3637,
  "mask": "<base64 PNG>"            // only when return_mask=true
}
```

`GET /health` returns HTTP 200 when the service is ready.

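As a rough illustration of the request and response shapes above, here is a minimal standard-library client sketch. It is not the repo's `scripts/client.py`, whose contents are not shown here. The helper names `build_payload`, `decode_response`, and `remove_background` are hypothetical, and the example assumes the endpoint accepts a JSON body with a `Content-Type: application/json` header.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes: bytes, background: str = "alpha",
                  mask_blur: float = 0, return_mask: bool = False) -> dict:
    """Build the JSON body for POST /predict from raw image bytes."""
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "background": background,
        "mask_blur": mask_blur,
        "return_mask": return_mask,
    }


def decode_response(body: dict) -> dict:
    """Decode the base64 fields of a /predict response into bytes."""
    out = {
        "image": base64.b64decode(body["image"]),
        "size": (body["width"], body["height"]),
    }
    if "mask" in body:  # present only when return_mask was true
        out["mask"] = base64.b64decode(body["mask"])
    return out


def remove_background(image_bytes: bytes,
                      base_url: str = "http://localhost:8000",
                      timeout: float = 120.0, **options) -> dict:
    """POST an image to /predict and return the decoded result.

    Assumes a JSON request body; base_url and timeout are illustrative.
    """
    req = urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps(build_payload(image_bytes, **options)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return decode_response(json.load(resp))
```

With the service running, a call such as `remove_background(open("test.jpg", "rb").read(), background="white")` would return a dict whose `"image"` entry holds the PNG bytes, ready to write to disk.
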
## Configuration (environment variables)

| Variable              | Default               | Purpose                            |
|-----------------------|-----------------------|------------------------------------|
| `PORT`                | `8000`                | HTTP port                          |
| `BIREFNET_MODEL`      | `ZhengPeng7/BiRefNet` | Hugging Face repo for the weights  |
| `BIREFNET_RESOLUTION` | `1024`                | Inference resolution               |
| `REQUEST_TIMEOUT`     | `120`                 | Per-request timeout (seconds)      |

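One plausible way to read these variables in Python is sketched below. The variable names and defaults come from the table; the `Settings` dataclass, the `load_settings` helper, and the choice of `int`/`float` types are assumptions for illustration and may not match how `server.py` actually parses them.

```python
import os
from dataclasses import dataclass

# Defaults copied from the configuration table above.
DEFAULTS = {
    "PORT": "8000",
    "BIREFNET_MODEL": "ZhengPeng7/BiRefNet",
    "BIREFNET_RESOLUTION": "1024",
    "REQUEST_TIMEOUT": "120",
}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the parsed configuration."""
    port: int
    model: str
    resolution: int
    request_timeout: float


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return Settings(
        port=int(raw["PORT"]),
        model=raw["BIREFNET_MODEL"],
        resolution=int(raw["BIREFNET_RESOLUTION"]),
        request_timeout=float(raw["REQUEST_TIMEOUT"]),
    )
```

Passing a plain dict, for example `load_settings({"PORT": "9000"})`, overrides only the named variable and leaves the rest at their defaults.
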
## Local development (no Docker)

Requires a local CUDA-capable PyTorch environment.

```bash
make dev    # uv sync + run the server locally
```

## Layout

```
src/birefnet_service/model.py     BiRefNet wrapper (load + inference)
src/birefnet_service/server.py    LitServe API + web UI route
src/birefnet_service/static/      web UI (index.html)
scripts/client.py                 stdlib-only test client
Dockerfile / docker-compose.yml   CUDA image + nvidia runtime
Makefile                          build / run / test shortcuts
```