BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B5 (2)\, B Block\, Level 5
DTSTART;TZID=Asia/Tokyo:20241204T134600
DTEND;TZID=Asia/Tokyo:20241204T135800
UID:siggraphasia_SIGGRAPH Asia 2024_sess116_papers_330@linklings.com
SUMMARY:Generative Portrait Shadow Removal
DESCRIPTION:Technical Papers\n\nJae Shin Yoon, Zhixin Shu, Mengwei Ren, Xuaner Zhang, Yannick Hold-Geoffroy, Krishna Kumar Singh, and He Zhang (Adobe Inc.)\n\nWe introduce a high-fidelity portrait shadow removal model that can effectively enhance the image of a portrait by predicting its appearance under disturbing shadows and highlights. Portrait shadow removal is a highly ill-posed problem where multiple plausible solutions can be found based on a single image. For example, disentangling complex environmental lighting from original skin color is a non-trivial problem. While existing works have solved this problem by predicting the appearance residuals that can propagate local shadow distribution, such methods are often incomplete and lead to unnatural predictions, especially for portraits with hard shadows. We overcome the limitations of existing local propagation methods by formulating the removal problem as a generation task where a diffusion model learns to globally rebuild the human appearance from scratch as a condition of an input portrait image. For robust and natural shadow removal, we propose to train the diffusion model with a compositional repurposing framework: a pre-trained text-guided image generation model is first fine-tuned to harmonize the lighting and color of the foreground with a background scene by using a background harmonization dataset; and then the model is further fine-tuned to generate a shadow-free portrait image via a shadow-paired dataset. To overcome the limitation of losing fine details in the latent diffusion model, we propose a guided-upsampling network to restore the original high-frequency details (e.g., wrinkles and dots) from the input image. To enable our compositional training framework, we construct a high-fidelity and large-scale dataset using a lightstage capturing system and synthetic graphics simulation. Our generative framework effectively removes shadows caused by both self and external occlusions while maintaining original lighting distribution and high-frequency details. Our method also demonstrates robustness to diverse subjects captured in real environments.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Dani Lischinski (Hebrew University of Jerusalem, Google)
URL:https://asia.siggraph.org/2024/program/?id=papers_330&sess=sess116
END:VEVENT
END:VCALENDAR