diff --git a/summaries/Sentence-BERT.md b/summaries/Sentence-BERT.md
new file mode 100644
index 0000000..66f0f88
--- /dev/null
+++ b/summaries/Sentence-BERT.md
@@ -0,0 +1,30 @@
+# Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
+
+## Summary
+"Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks" introduces an efficient approach to computing semantically meaningful sentence embeddings with BERT. Vanilla BERT is a cross-encoder: both sentences of a pair must be fed through the network together, so finding the most similar pair in a collection of 10,000 sentences takes roughly 50 million inference computations (about 65 hours). Sentence-BERT (SBERT) instead fine-tunes BERT in a siamese network structure so that each sentence maps to a fixed-size vector that can be compared with cosine similarity, cutting the same search to a few seconds. This makes semantic textual similarity, clustering, and semantic search practical at scale.
+
+## Contributions
+- **Efficient Sentence Embeddings**: A method to derive semantically meaningful, fixed-size sentence embeddings from BERT using a siamese/triplet network structure.
+- **Performance Improvements**: A drastic reduction in the cost of large-scale sentence comparison (from hours to seconds) while maintaining or improving accuracy.
+- **Benchmarking**: Extensive evaluation against existing methods and state-of-the-art (SOTA) baselines on STS and transfer tasks.
+
+## Method
+1. **Siamese Network Architecture**: Two input sentences are encoded independently by the same weight-tied BERT encoder, so both branches of the siamese network share all parameters.
+2. **Objective Functions**: Depending on the training data, the network is fine-tuned with a classification, regression, or triplet objective so that similar sentences land close together in the vector space and dissimilar ones far apart (schematic sketches of these objectives appear at the end of this summary):
+   - Classification: $o = \mathrm{softmax}(W_t(u, v, |u - v|))$, where $u$ and $v$ are the two sentence embeddings and $W_t$ is a trainable weight matrix.
+   - Regression: the cosine similarity between $u$ and $v$ is trained with a mean-squared-error loss.
+   - Triplet: $\max(\lVert s_a - s_p \rVert - \lVert s_a - s_n \rVert + \epsilon, 0)$, where $s_a$, $s_p$, and $s_n$ are the anchor, positive, and negative embeddings and $\epsilon$ is a margin.
+3. **Fine-Tuning**: The model is fine-tuned on natural language inference data (SNLI and MultiNLI), which transfers well to semantic similarity and makes the embeddings versatile across applications.
+4. **Pooling Strategy**: Sentence embeddings are derived from the token outputs via different pooling strategies: mean pooling (the paper's default), max pooling, or the CLS token.
+
+![Figure: Siamese BERT Network Architecture (adapted from the paper)](path/to/figure.png)
+
+## Results
+- **Performance Metrics**: SBERT achieves strong results on the Semantic Textual Similarity (STS) benchmarks and the SentEval transfer tasks.
+- **Comparison with SOTA**: SBERT outperforms prior sentence-embedding methods such as InferSent and Universal Sentence Encoder on most tasks, with a substantial reduction in computation time at comparable or better accuracy.
+- **Visualization**: t-SNE plots demonstrate that Sentence-BERT embeddings cluster semantically similar sentences more tightly than other methods.
+
+## Two-Cents
+- **Appreciation**: The paper addresses a significant limitation of BERT by making sentence-embedding generation efficient. The siamese-network formulation is a clever way to balance accuracy against computational cost.
+
+## Resources
+- **Paper**: [https://arxiv.org/abs/1908.10084](https://arxiv.org/abs/1908.10084)
+- **Code**: [https://github.com/UKPLab/sentence-transformers](https://github.com/UKPLab/sentence-transformers)
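+
+## Code Sketches
+
+As a concrete companion to the method description above, here is a minimal inference sketch using the authors' `sentence-transformers` library; the checkpoint name is one example of a pretrained SBERT model, and any other checkpoint works the same way:
+
+```python
+from sentence_transformers import SentenceTransformer, util
+
+# Load a pretrained SBERT model (checkpoint name is an example).
+model = SentenceTransformer("bert-base-nli-mean-tokens")
+
+sentences = [
+    "A man is eating food.",
+    "A man is eating a piece of bread.",
+    "The girl is carrying a baby.",
+]
+
+# Each sentence is encoded independently into a fixed-size vector.
+embeddings = model.encode(sentences)
+
+# Pairwise cosine similarities; semantically close sentences score higher.
+print(util.cos_sim(embeddings, embeddings))
+```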
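+
+The training objectives from the Method section also reduce to a few lines of PyTorch. The sketch below is schematic rather than the authors' implementation: `token_embeddings` stands for the output of any BERT encoder, the mean pooling mirrors the paper's default strategy, and the triplet margin follows the paper's choice of Euclidean distance with $\epsilon = 1$:
+
+```python
+import torch
+import torch.nn as nn
+
+def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
+    """Mean pooling over token embeddings, ignoring padding (the paper's default strategy)."""
+    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq, 1)
+    summed = (token_embeddings * mask).sum(dim=1)  # (batch, hidden)
+    counts = mask.sum(dim=1).clamp(min=1e-9)       # guard against empty sequences
+    return summed / counts
+
+class SoftmaxObjective(nn.Module):
+    """Classification objective: softmax(W_t (u, v, |u - v|))."""
+    def __init__(self, hidden_size: int, num_labels: int = 3):
+        super().__init__()
+        self.classifier = nn.Linear(3 * hidden_size, num_labels)  # W_t
+        self.loss_fn = nn.CrossEntropyLoss()
+
+    def forward(self, u: torch.Tensor, v: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
+        features = torch.cat([u, v, torch.abs(u - v)], dim=-1)  # (u, v, |u - v|)
+        return self.loss_fn(self.classifier(features), labels)
+
+# Triplet objective: max(||s_a - s_p|| - ||s_a - s_n|| + eps, 0),
+# expressed with PyTorch's built-in loss (Euclidean distance, margin 1).
+triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)
+
+# Toy check with random embeddings (hidden size 768, three NLI labels).
+u, v = torch.randn(4, 768), torch.randn(4, 768)
+labels = torch.randint(0, 3, (4,))
+print(SoftmaxObjective(768)(u, v, labels))
+```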