FlowDIS: Language-Guided Dichotomous Image Segmentation with Flow Matching
Paper • 2605.05077 • Published
FlowDIS enables highly accurate foreground segmentation, optionally guided by a text prompt. When ambiguity prevents the model from producing the desired result, the user can specify which elements to retain in the foreground.
pip install "git+ssh://git@github.com/Picsart-AI-Research/FlowDIS.git"
from PIL import Image
from flowdis import flowdis_predict, load_models
models = load_models(device="cuda")
input_img_path = "path/to/input.jpg" # Input image path
output_mask_path = "path/to/output.png" # Path to save the output mask
image = Image.open(input_img_path).convert("RGB")
mask = flowdis_predict(
image=image,
prompt="", # Text prompt
models=models,
resolution=1024,
num_inference_steps=2,
device="cuda",
)
mask.save(output_mask_path)
This model is licensed under the PicsArt Inc. FlowDIS Model License.
This project is built on top of FLUX.1 [schnell] and DIS5K.
If you use our work in your research, please cite our publication:
@article{sargsyan2026flowdis,
title={{FlowDIS: Language-Guided Dichotomous Image Segmentation with Flow Matching}},
author={Sargsyan, Andranik and Navasardyan, Shant},
journal={arXiv preprint arXiv:2605.05077},
year={2026},
eprint={2605.05077},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2605.05077}
}