
Major Breakthrough! IDAdapter: The First Tuning-Free Method for Generating Diverse, Personalized Avatars from a Single Image (Peking University & DeepGlint)

2024-10-22

We place great value on original articles. To respect intellectual property and avoid potential copyright issues, we provide only a summary here for an initial overview. For the full article, please visit the author's official WeChat account page.

IDAdapter Summary

IDAdapter: Tuning-Free Personalized Text-to-Image Synthesis Using a Single Face Image

Authors: Siying Cui et al.

Abstract: This paper introduces IDAdapter, an innovative method that enhances diversity and identity preservation in personalized image generation from a single facial image without the need for model tuning. IDAdapter incorporates personalized concepts using text and visual injection, along with a face identity loss, to guide the generation of images with varied styles, expressions, and angles. Extensive evaluation demonstrates the effectiveness of this method compared to earlier models.
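The face identity loss is only named here, not defined. Below is a minimal sketch of one plausible formulation, assuming a frozen, pretrained face-recognition encoder (an ArcFace-style model) and a cosine-embedding objective; both the encoder choice and the exact loss form are assumptions, not the paper's definition:

```python
import torch.nn.functional as F

def face_identity_loss(face_encoder, generated, reference):
    """Pull the face-recognition embedding of the generated image
    toward the embedding of the reference face.

    face_encoder: frozen, pretrained face-recognition network
                  (assumed ArcFace-style) mapping images to embeddings.
    generated, reference: image batches of shape (B, 3, H, W).
    """
    emb_gen = F.normalize(face_encoder(generated), dim=-1)
    emb_ref = F.normalize(face_encoder(reference), dim=-1)
    # 1 - cosine similarity: zero when the identities match exactly.
    return (1.0 - (emb_gen * emb_ref).sum(dim=-1)).mean()
```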

Introduction

Progress in text-to-image (T2I) synthesis has been significant, especially with the advent of diffusion models such as Imagen, DALL-E 2, and Stable Diffusion. Personalizing these models remains difficult, however: existing approaches require test-time fine-tuning or multiple input images, and often suffer from weak identity preservation and limited diversity. IDAdapter addresses these issues by synthesizing features from multiple images of the same person during training, enabling diverse, identity-faithful generation without test-time fine-tuning.

Related Work

Text-to-image generation has advanced through deep generative models such as GANs, autoregressive models, and diffusion models. Early personalization was achieved by fine-tuning GANs on multiple face images, and recent methods such as DreamBooth optimize the T2I network itself for higher fidelity. Tuning-free personalization methods instead train on domain-specific data so that no extra fine-tuning is needed at inference.

Method

The method centers on generating a range of vivid images of a person from a single facial image and a text prompt. During training, IDAdapter mixes facial features extracted from several images of the same identity and injects the fused features into the generative backbone through an adapter layer; a face identity loss is added so that diversity does not come at the cost of identity. A sketch of one possible adapter follows.
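The adapter is only described at a high level in this summary. Here is a minimal sketch, assuming a gated cross-attention adapter that projects face features from N reference images and injects them into a diffusion U-Net block; all names, dimensions, and the gating scheme are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class FaceFeatureAdapter(nn.Module):
    """Sketch: fuse face features from several reference images of the
    same person and inject them into a diffusion U-Net block through
    an extra, gated cross-attention pass (dimensions are assumptions)."""

    def __init__(self, face_dim=512, hidden_dim=768, num_heads=8):
        super().__init__()
        self.proj = nn.Linear(face_dim, hidden_dim)  # face features -> U-Net width
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))     # zero-init: starts as identity map

    def forward(self, hidden_states, face_features):
        # hidden_states: (B, L, hidden_dim) tokens from a U-Net block
        # face_features: (B, N, face_dim) features from N reference faces
        context = self.proj(face_features)           # one token per reference face
        out, _ = self.attn(hidden_states, context, context)
        return hidden_states + self.gate.tanh() * out  # gated residual injection
```

Zero-initializing the gate keeps the pretrained backbone's behavior intact at the start of training, a common choice for adapter-style injection.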

Experiment

Experiments were conducted on the CelebA-HQ dataset with 30,000 image-text pairs, with VGGFace2 used for evaluation. IDAdapter was benchmarked against several leading techniques and showed superior identity preservation and diversity without any test-time fine-tuning.
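The summary does not state the exact metrics. A plausible evaluation sketch, assuming identity preservation is scored as cosine similarity between face embeddings and diversity as mean pairwise LPIPS distance (both metric choices are assumptions):

```python
import itertools
import torch.nn.functional as F

def identity_score(face_encoder, generated, reference):
    """Mean cosine similarity between embeddings of generated faces and
    the single reference face (higher = better identity preservation)."""
    g = F.normalize(face_encoder(generated), dim=-1)   # (B, D)
    r = F.normalize(face_encoder(reference), dim=-1)   # (1, D)
    return (g @ r.t()).mean().item()

def diversity_score(lpips_model, images):
    """Mean pairwise LPIPS distance over generated samples (higher =
    more diverse). lpips_model: e.g. lpips.LPIPS(net='alex')."""
    pairs = itertools.combinations(range(len(images)), 2)
    dists = [lpips_model(images[i:i+1], images[j:j+1]).item() for i, j in pairs]
    return sum(dists) / len(dists)
```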

Conclusion

IDAdapter represents a breakthrough in personalized avatar generation, enabling the creation of diverse, identity-preserving images in various styles, angles, and expressions, all without fine-tuning at inference.

To access the full paper, visit arXiv.org.
