# BiRefNet Background Removal Service

GPU-accelerated background removal exposed as an HTTP API. Uses [BiRefNet](https://huggingface.co/ZhengPeng7/BiRefNet) for matting, served with [LitServe](https://github.com/Lightning-AI/LitServe), packaged for the NVIDIA container runtime.

## Requirements

- NVIDIA GPU + driver, Docker, and the `nvidia` container runtime
- ~2 GB free disk for the model weights (downloaded on first run)

## Quick start

```bash
make build     # build the Docker image
make run       # start the service on :8000 (GPU)
make logs      # watch startup; first run downloads BiRefNet weights
make test      # send test.jpg, save output.png
```

`make test` waits for the service's `/health` endpoint before sending the request, so the first call may block while the model downloads and loads.

### Web UI

A minimal test page is served at the service root: open **http://localhost:8000/** in a browser, drop in an image, and preview the transparent-background result (handy when working over SSH). It calls the same `/predict` endpoint.

### Useful variations

```bash
make test BG=white                          # composite onto a white background
make test INPUT=photo.jpg OUTPUT=cut.png
make test-mask                              # also save the raw alpha mask (mask.png)
make help                                   # list all targets
```

## API

`POST /predict`

```jsonc
{
  "image": "",              // required
  "background": "alpha",    // alpha|white|black|gray|green|blue|red
  "mask_blur": 0,           // Gaussian blur radius on mask edges
  "return_mask": false      // include the raw mask in the response
}
```

Response:

```jsonc
{
  "image": "",
  "format": "png",
  "width": 3637,
  "height": 3637,
  "mask": ""                // only when return_mask=true
}
```

`GET /health` returns 200 when the service is ready.

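
For calling the API from your own code, the following is a minimal stdlib-only Python sketch of a client for the two endpoints above. It is not the bundled `scripts/client.py`. It assumes the `image` and `mask` fields carry base64-encoded file contents, which this README does not state, and the function names (`build_payload`, `decode_result`, `wait_until_ready`, `remove_background`) are hypothetical.

```python
import base64
import json
import time
import urllib.request

# Background values listed in the request schema above.
BACKGROUNDS = {"alpha", "white", "black", "gray", "green", "blue", "red"}


def build_payload(image_bytes, background="alpha", mask_blur=0, return_mask=False):
    """Build the JSON body for POST /predict.

    ASSUMPTION: "image" carries the file contents as base64 text.
    """
    if background not in BACKGROUNDS:
        raise ValueError(f"unsupported background: {background!r}")
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "background": background,
        "mask_blur": mask_blur,
        "return_mask": return_mask,
    }


def decode_result(response):
    """Return (image_bytes, mask_bytes_or_None) from a /predict response dict.

    ASSUMPTION: "image" and "mask" are base64 text, mirroring the request.
    """
    image = base64.b64decode(response["image"])
    mask = response.get("mask")
    return image, (base64.b64decode(mask) if mask else None)


def wait_until_ready(base_url="http://localhost:8000", attempts=60, delay=2.0):
    """Poll GET /health until it answers 200; return False if it never does."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as reply:
                if reply.status == 200:
                    return True
        except OSError:
            # Connection refused, timeout, or an HTTP error status: not ready yet.
            pass
        time.sleep(delay)
    return False


def remove_background(path, base_url="http://localhost:8000", timeout=120, **options):
    """Send one image file to /predict and return (image_bytes, mask_bytes_or_None)."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), **options)
    request = urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return decode_result(json.load(reply))
```

With the service running, `wait_until_ready()` followed by `png, mask = remove_background("test.jpg", background="white", return_mask=True)` would roughly mirror `make test-mask BG=white`; write `png` and `mask` to disk in binary mode. The 120-second client timeout matches the default `REQUEST_TIMEOUT`.
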
## Configuration (environment variables)

| Variable              | Default               | Purpose                          |
|-----------------------|-----------------------|----------------------------------|
| `PORT`                | `8000`                | HTTP port                        |
| `BIREFNET_MODEL`      | `ZhengPeng7/BiRefNet` | HuggingFace repo for the weights |
| `BIREFNET_RESOLUTION` | `1024`                | Inference resolution             |
| `REQUEST_TIMEOUT`     | `120`                 | Per-request timeout (seconds)    |

## Local development (no Docker)

Requires a local CUDA-capable PyTorch environment.

```bash
make dev       # uv sync + run the server locally
```

## Layout

```
src/birefnet_service/model.py     BiRefNet wrapper (load + inference)
src/birefnet_service/server.py    LitServe API + web UI route
src/birefnet_service/static/      web UI (index.html)
scripts/client.py                 stdlib-only test client
Dockerfile / docker-compose.yml   CUDA image + nvidia runtime
Makefile                          build / run / test shortcuts
```