Skip to content

Commit

Permalink
Merge pull request #85 from PrathamSingla15/Text_invs
Browse files Browse the repository at this point in the history
⚡ Add summary for Textual_inversion
  • Loading branch information
shivank21 authored Mar 13, 2024
2 parents 502d33c + b8b659e commit 61a0f72
Show file tree
Hide file tree
Showing 5 changed files with 41 additions and 0 deletions.
Binary file added images/textual_inversion 1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/textual_inversion 2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/textual_inversion 3.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/textual_inversion 4.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
41 changes: 41 additions & 0 deletions summaries/Textual_inversion.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal1, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
**ICVR** **2023**

## Summary
The paper introduces a method which enables users a high level of creative freedom in generating images from natural language prompts by finding new words in the textual embedding space, represented by pseudo-words which helps in introducing new specific concepts into text-to-image models.
<img src='../images/textual_inversion 1.png'>


## Contributions

- Addressing the challenge of large-scale text-to-image models which are constrained by user's ability to describe desired target through text, by introducing personalized text-to-image generation.

- Introduction of 'Textual inversions' as a method to find pseudo words in embedding space for capturing semantics and visual details.

- Address the challenges associated with fine-tuning or re-training models for each new concept while preserving model's existing capabilities.

## Methodology

The primary challenge being addressed in the text is the difficulty of introducing new concepts into large-scale text-to-image models. The common challenges faced include the high cost of re-training models with expanded datasets for each new concept and the problem of catastrophic forgetting when fine-tuning on few examples. The proposed solution aims to tackle these challenges by introducing a method called "Textual Inversion" which works by transforming specific concepts into new made-up words. These made-up words can then be used to describe and create new scenes using simple sentences, making it easy for users to make changes.

<img src='../images/textual_inversion 2.png'>

The approach looks at the word-embedding stage of the text encoders which focuses on visually reconstructing images ensuring that the generated pictures match the desired concepts more accurately.

## Results
The method is tested on Latent Diffusion models and the extensive experiments shows the effectiveness of proposed approach in capturing and representing concepts using natural language, especially with a single word and offering good flexibility.
<img src='../images/textual_inversion 3.png'>
<img src='../images/textual_inversion 4.png'>

## Two-Cents

- The paper provides more freedom in creating images from words. It's good at capturing the general idea of a concept, even though it might not be super precise with details.

- This method makes it easier to edit pictures, but there's a risk that people might use it to create fake images, especially of private individuals. This suggests a need for more research in this area to understand and address these potential misuses.

## Resources
- Link to [Paper](https://arxiv.org/pdf/2208.01618.pdf) and [Code](https://github.com/rinongal/textual_inversion)
- [YouTube video](https://youtu.be/opD_H9bED9Y?si=dMgGO0d_2ClKDUY7)
- [Blog](https://medium.com/@onkarmishra/how-textual-inversion-works-and-its-applications-5e3fda4aa0bc)

0 comments on commit 61a0f72

Please sign in to comment.