A Virtual Try-On Workflow for Apparel
Virtual try-on — compositing a garment onto a model photograph without a physical fitting or studio reshoot — is one of the highest-value AI applications in e-commerce apparel. The challenge is achieving realistic draping, shadow, and fabric behavior without visible seams or color contamination between garment and background. This guide shows how to build a Floniks workflow that takes a flat-lay garment image and a model photograph as separate inputs, performs segmentation masking, applies garment-conditioned generation, and validates the composite result before routing to your product listing pipeline.
Why Virtual Try-On Demands a Multi-Step Workflow
A successful virtual try-on composite requires at least four distinct operations that must run in dependency order, because later operations consume the outputs of earlier ones. First, the garment must be extracted from its background (flat-lay photography on a white or gray surface works best). Second, the model photograph must be segmented to isolate the body region where the garment will appear. Third, the garment must be warped to match the model's body pose (raised arms, torso angle, visible depth) before it is composited. Fourth, the composite must be refined to add realistic shadows, fabric folds, and color harmonization so the garment looks like it is physically on the model rather than pasted onto a photograph.
Running these four operations as a single-prompt task is impractical because each requires different conditioning inputs and model capabilities. Single-prompt approaches tend to produce the familiar "pasted" look: flat garment color with no shadow, a visible garment outline, and mismatched lighting between the garment and the model photograph. A Floniks multi-step workflow orchestrates the four operations as a directed acyclic graph (DAG), passing the output of each step as a conditioned input to the next, and delivers a composited image that passes a realism validation check before entering the product pipeline.
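The dependency structure can be sketched in plain Python with the standard library's graphlib module. This is only an illustration of the ordering, not Floniks's execution engine: the node names are hypothetical labels taken from the stages described in this guide, and the edges reflect which inputs each stage is said to consume.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each key maps to the set of nodes it depends on.
TRY_ON_DAG = {
    "garment_segmentation": set(),
    "body_segmentation": set(),
    "garment_warping": {"garment_segmentation", "body_segmentation"},
    "compositing": {"garment_warping", "body_segmentation"},
    "shadow_synthesis": {"compositing"},
    "color_harmonization": {"shadow_synthesis"},
    "realism_validation": {"color_harmonization"},
}


def execution_batches(dag):
    """Group nodes into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Every node returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for index, batch in enumerate(execution_batches(TRY_ON_DAG), start=1):
        print(index, batch)
```

The first batch contains both segmentation nodes, which is why they can run side by side; every later batch holds a single node that waits on its predecessor.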
Stage One: Garment and Model Segmentation
The workflow begins with two parallel segmentation nodes running simultaneously. The first receives the flat-lay garment image and outputs a segmentation mask that isolates the garment from its background, preserving fine details at collar seams, button edges, and hem lines. Set the segmentation node's edge refinement parameter high (0.90 or above) to prevent fringe artifacts along the garment boundary, which will compound into visible seams in the composite.
The second segmentation node receives the model photograph and outputs a body region mask marking the area where the garment will be placed — typically the torso and arms region, but configurable per garment category (full-body for dresses, lower-body for trousers). For the body mask, include a pose estimation pass: the node should also output a skeletal keypoint map (shoulders, elbows, wrists, hip line) that the garment warping stage will use to deform the garment geometry. Both segmentation nodes run in parallel because neither depends on the other's output; the workflow's execution engine launches them simultaneously, reducing total processing time.
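As a rough illustration of the parallel stage, the sketch below launches two placeholder segmentation functions concurrently with Python's concurrent.futures. The function names, return shapes, and the category-to-region table are assumptions made for this example; they are not Floniks node APIs, and the placeholders do no real image processing.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed mapping from garment category to the body region mask to request.
BODY_REGIONS = {
    "top": ("torso", "arms"),
    "dress": ("full_body",),
    "trousers": ("lower_body",),
}

# Keypoints the warping stage is described as needing.
KEYPOINT_NAMES = ("shoulders", "elbows", "wrists", "hip_line")


def segment_garment(flat_lay_path, edge_refinement=0.90):
    """Placeholder for garment segmentation; returns a description, not pixels."""
    return {"source": flat_lay_path, "edge_refinement": edge_refinement}


def segment_body(model_photo_path, garment_category):
    """Placeholder for body segmentation with a pose estimation pass."""
    return {
        "source": model_photo_path,
        "mask_regions": BODY_REGIONS[garment_category],
        "keypoints": KEYPOINT_NAMES,
    }


def run_segmentation(flat_lay_path, model_photo_path, garment_category):
    """Submit both tasks at once, since neither needs the other's output."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        garment_future = pool.submit(segment_garment, flat_lay_path)
        body_future = pool.submit(segment_body, model_photo_path, garment_category)
        return garment_future.result(), body_future.result()
```

Calling run_segmentation("shirt.png", "model.png", "top") returns both results once the slower of the two tasks finishes.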
Stage Two: Garment Warping and Pose Alignment
The garment warping node receives three inputs: the extracted garment image, the garment segmentation mask, and the skeletal keypoint map from the body segmentation node. It deforms the garment to align with the model's pose: widening across the shoulder line, foreshortening at the torso based on the model's slight forward lean, and curving collar and hem lines to match the three-dimensional body surface. This step is critical — if the garment geometry does not match the model's pose, no amount of inpainting in the refinement stage will produce a convincing composite.
In the Floniks editor, configure the warping node's warp intensity parameter based on the garment category. Rigid structured garments like blazers require lower warp intensity (0.55–0.65) because they hold their shape on the body. Soft draped garments like knitwear or loose blouses require higher intensity (0.75–0.85) to simulate how the fabric follows body curves and gravity. The node outputs a pose-aligned garment image at the same dimensions as the model photograph, ready for compositing.
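One way to keep these ranges consistent across products is a small preset lookup. The sketch below is a hypothetical helper, not a Floniks feature: the category keys and the position argument (where in the recommended range to land) are assumptions for illustration.

```python
# Recommended warp intensity ranges from the guidance above.
WARP_INTENSITY_RANGES = {
    "structured": (0.55, 0.65),  # blazers and other rigid garments
    "draped": (0.75, 0.85),      # knitwear, loose blouses
}


def warp_intensity(garment_type, position=0.5):
    """Pick a value inside the range; position 0.0 is the low end, 1.0 the high end."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    low, high = WARP_INTENSITY_RANGES[garment_type]
    return round(low + position * (high - low), 3)
```

With the default midpoint, structured garments resolve to 0.6 and draped garments to 0.8.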
Stage Three: Compositing and Shadow Synthesis
The compositing node receives the pose-aligned garment, the model photograph, and the body region mask. It blends the garment into the body region using inpainting: the mask defines the edit boundary, the model photograph provides the background context (arms, neck, background), and the pose-aligned garment provides the fill content. Set the inpainting node's strength parameter to 0.45–0.55 — lower than a full generation pass, because you want the model's body structure and background lighting to remain intact while the garment is inserted.
After compositing, connect a Shadow Synthesis node. This node analyzes the light direction in the model photograph (detected from highlights on skin and background) and generates contact shadows along the garment's lower edges and collar line. Shadow synthesis turns a pasted composite into a scene-integrated image. Set the shadow color to a dark-shifted version of the garment's primary color (not pure black), the shadow falloff distance to 8–15 px depending on the image resolution, and the opacity to 0.60–0.75. Finally, connect a Color Harmonization node that performs a subtle global tone match between the garment region and the ambient light color of the photograph. This removes the slight color temperature mismatch that often exists between a flat-lay photograph taken under fluorescent studio light and a model photograph taken in daylight.
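The shadow settings can be derived rather than hand-picked. The following is a minimal sketch under stated assumptions: the darkening factor, the 2000 px reference resolution, and the linear scaling of falloff with image size are illustrative choices, not documented Floniks behavior.

```python
import colorsys


def shadow_color(garment_rgb, darken=0.35):
    """Return a dark-shifted version of an 8-bit RGB color, keeping hue and saturation."""
    r, g, b = (channel / 255 for channel in garment_rgb)
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    # Scale lightness down instead of using black so the shadow keeps the garment's tint.
    dark = colorsys.hls_to_rgb(hue, lightness * darken, saturation)
    return tuple(round(channel * 255) for channel in dark)


def falloff_px(shorter_side_px, reference_px=2000, reference_falloff=12):
    """Scale falloff with resolution (assumed linear), clamped to the 8 to 15 px range."""
    scaled = round(reference_falloff * shorter_side_px / reference_px)
    return max(8, min(15, scaled))


def shadow_settings(garment_rgb, shorter_side_px, opacity=0.70):
    """Bundle the three shadow parameters discussed above into one dictionary."""
    if not 0.60 <= opacity <= 0.75:
        raise ValueError("opacity outside the recommended 0.60 to 0.75 range")
    return {
        "color": shadow_color(garment_rgb),
        "falloff_px": falloff_px(shorter_side_px),
        "opacity": opacity,
    }
```

Small images clamp to the 8 px minimum and very large ones to the 15 px maximum, so the helper never leaves the recommended range.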
Stage Four: Realism Validation and Pipeline Integration
The validation stage checks three criteria before allowing a composite to exit the workflow. First, a seam detector scans the garment boundary for pixel-level discontinuities — sharp color steps that indicate the mask edge was not blended cleanly. Second, a shadow presence check verifies that shadow pixels exist in the expected contact regions (underbust, sleeve edge, collar neckline). Third, a color distribution comparison checks that the garment region's median luminance is within a configurable tolerance of the model photograph's ambient luminance, catching color harmonization failures.
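Two of these checks are simple enough to sketch in plain Python. The thresholds below (a maximum luma step of 60 and a tolerance of 40 on an 8-bit scale) are placeholder assumptions, and the luma formula applies Rec. 709 weights directly to 8-bit channel values as an approximation. A production validator would operate on full image arrays rather than short lists.

```python
from statistics import median


def luma(rgb):
    """Approximate brightness of an 8-bit RGB pixel using Rec. 709 weights."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def has_seam(boundary_luma, max_step=60.0):
    """Flag a sharp step between neighboring samples taken across the garment boundary."""
    return any(
        abs(after - before) > max_step
        for before, after in zip(boundary_luma, boundary_luma[1:])
    )


def luminance_match(garment_pixels, ambient_pixels, tolerance=40.0):
    """Compare the median luma of the garment region with the photograph's ambient luma."""
    garment_median = median(luma(pixel) for pixel in garment_pixels)
    ambient_median = median(luma(pixel) for pixel in ambient_pixels)
    return abs(garment_median - ambient_median) <= tolerance
```

A gradual ramp such as [100, 110, 120, 130] passes the seam check, while a jump from 110 to 200 between adjacent samples is flagged.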
Outputs that pass all three checks are routed to the product pipeline output node, which formats the image at the target e-commerce resolution (typically 2000×2000 or 1000×1500 for apparel PDPs) and attaches metadata including the garment name, model identifier, and workflow run ID for traceability. Failed outputs are routed to a human review queue with the specific validation failure annotated on the image so the reviewer immediately knows which aspect failed without having to inspect the full composite. Save the workflow as a template and assign category-specific warp intensity presets as template variants — one template for structured outerwear, one for knitwear, one for dresses — so the operations team selects the right preset per product rather than manually adjusting parameters.
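The routing rule can be expressed as a small function. This is a hypothetical sketch: the destination names, metadata keys, and report structure are invented for illustration and do not describe the actual output node's schema.

```python
from dataclasses import dataclass


@dataclass
class ValidationReport:
    seam_ok: bool
    shadow_ok: bool
    luminance_ok: bool

    def failures(self):
        """List the names of the checks that did not pass."""
        checks = {
            "seam": self.seam_ok,
            "shadow_presence": self.shadow_ok,
            "luminance_match": self.luminance_ok,
        }
        return [name for name, passed in checks.items() if not passed]


def route(report, garment_name, model_id, run_id, resolution=(2000, 2000)):
    """Send passing composites to the product pipeline and failures to human review."""
    metadata = {"garment": garment_name, "model": model_id, "run_id": run_id}
    failed = report.failures()
    if failed:
        # Annotate the specific failures so the reviewer knows where to look.
        return {
            "destination": "human_review_queue",
            "annotations": failed,
            "metadata": metadata,
        }
    return {
        "destination": "product_pipeline",
        "resolution": resolution,
        "metadata": metadata,
    }
```

Keeping the failed check names in the routed record is what lets a reviewer skip a full inspection of the composite.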
Step by step
1. Prepare garment and model inputs
In /editor, add two Image Input nodes. Label the first "Garment Flat-Lay" and upload a clean flat-lay photograph on a white or gray background. Label the second "Model Photo" and upload a front-facing model photograph in the intended wearing pose. Ensure both images are at least 1500 px on the shorter dimension.
2. Run parallel segmentation
Connect the Garment Input to a Garment Segmentation node (set edge refinement to 0.90). Connect the Model Input to a Body Segmentation node with the pose estimation pass enabled. Do not add a dependency edge between the two nodes; with no edge between them, the workflow engine launches them simultaneously.
3. Configure and run the garment warping node
Add a Garment Warping node. Wire its inputs: the extracted garment image and garment mask from the Garment Segmentation node, and the skeletal keypoints from the Body Segmentation node. Set warp intensity based on garment category: 0.60 for structured garments, 0.80 for draped or knit fabrics. The output is a pose-aligned garment image.
4. Composite and add shadows
Add a Compositing (Inpainting) node. Wire its inputs: the pose-aligned garment, the model photograph, and the body region mask. Set inpainting strength to 0.50. Connect the composite output to a Shadow Synthesis node with shadow opacity 0.70 and falloff 12 px, then to a Color Harmonization node.
5. Run validation and route to product pipeline
Add a Realism Validator node that checks seam integrity, shadow presence, and luminance match. Wire the pass port to a Product Output node formatted at 2000×2000 px. Wire the fail port to a Human Review Queue node. Save the workflow as a template with category-specific warp presets.