OA Retrieval System Proposal #3462
Replies: 14 comments
-
I have also added a POC I did earlier: REALM-encoded Wikipedia data.
-
Hey, just reposting the demo from Discord, where I tried out @kenhktsui's POC with plugins, also as a POC.
-
I can also contribute here.
-
I can contribute here.
-
(I have unassigned myself in favor of umbra-scientia, since GitHub has a 10-person assignment limit.)
-
I think the LLM can decide whether it wants to issue a retrieval query. To train the model to do so, we could add flags to the data labeling interface, such as "requires information retrieval". The disadvantage is that it has to be clear to the labelers what knowledge can be expected from the model and what it should look up.
I think the query may very well be just the original user message: a good vector DB with good embeddings should already return the text snippets that are most relevant to that query, ranked by similarity. So I would say use as many retrieved text snippets as fit in the context length, leaving some room for the reply. Obviously the retrieved snippets wouldn't be kept in the chat context, only the final answer. I have some questions about the general plan:
Anyway, I have some experience setting up things like this. I'd be honored to contribute!
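To make the "as many snippets as fit, leaving room for the reply" idea concrete, here is a minimal sketch. The function name `pack_snippets`, its parameters, and the whitespace-based token count are all assumptions for illustration; a real setup would count tokens with the model's own tokenizer.

```python
def pack_snippets(snippets, query, context_length, reply_budget,
                  count_tokens=lambda s: len(s.split())):
    """Greedily keep ranked snippets until the context budget is used up.

    `snippets` is assumed to be sorted by similarity, most relevant first.
    `count_tokens` is a whitespace stand-in for a real tokenizer.
    """
    # Reserve room for the reply and for the user query itself.
    budget = context_length - reply_budget - count_tokens(query)
    chosen = []
    for snippet in snippets:
        cost = count_tokens(snippet)
        if cost > budget:
            break  # stop at the first snippet that no longer fits
        chosen.append(snippet)
        budget -= cost
    return chosen
```

Stopping at the first snippet that does not fit keeps the selection in rank order; skipping it and trying shorter, lower-ranked snippets would be an equally reasonable choice.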
-
Adding one more consideration here: there are (at least) three ways of incorporating retrieval into an LLM, with different degrees of coupling.
-
I think there is a fourth option, a kind of blend of 1 and 3, as presented in the RETRO paper (https://arxiv.org/abs/2112.04426):
-
Meeting minutes:
We all agreed to spend another week on paper reading, Discord chats, and exploring small tasks. @kpoeppel will share some papers on retrieval. Please comment if I missed something.
-
Another way to incorporate retrieval, sort of an upgrade to 3, is to take a pretrained non-retrieval LLM and fine-tune it with retrieval augmentation by simply adding retrieved documents to the input. You can use a pretrained retriever as in RETRO, co-train a retriever as in REALM during this fine-tuning stage, or use a nonparametric retriever like BM25, which works surprisingly well. This method was introduced in a very recent paper, though I'm blanking on the name. Hopefully someone will be able to identify it.
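As a rough illustration of the BM25 variant, here is a minimal pure-Python sketch that scores documents with Okapi BM25 and places the top hits in front of the user input. The class and function names, the `k1`/`b` defaults, the smoothed IDF variant, and the choice to prepend rather than append are assumptions of mine, not taken from the paper mentioned above. The fine-tuning step itself is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25:
    """Okapi BM25 over a small, non-empty, in-memory document list."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = docs
        self.k1, self.b = k1, b
        self.tfs = [Counter(tokenize(d)) for d in docs]
        self.lens = [sum(tf.values()) for tf in self.tfs]
        self.avg_len = sum(self.lens) / len(docs)
        # Document frequency: number of documents containing each term.
        df = Counter(t for tf in self.tfs for t in tf)
        n = len(docs)
        # Smoothed IDF that stays non-negative.
        self.idf = {t: math.log((n - c + 0.5) / (c + 0.5) + 1)
                    for t, c in df.items()}

    def score(self, query, i):
        tf = self.tfs[i]
        norm = 1 - self.b + self.b * self.lens[i] / self.avg_len
        total = 0.0
        for t in tokenize(query):
            if t in tf:
                total += self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + self.k1 * norm)
        return total

    def top_k(self, query, k=2):
        order = sorted(range(len(self.docs)),
                       key=lambda i: self.score(query, i), reverse=True)
        return [self.docs[i] for i in order[:k]]


def augment_input(query, retrieved):
    """Prepend retrieved documents to the query (ordering is an assumption)."""
    return "\n\n".join(retrieved) + "\n\n" + query
```

For fine-tuning, each training example's input would be passed through something like `augment_input(query, bm25.top_k(query))` before tokenization.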
-
Meeting Minutes
Embedding Method Team
Prompt-Injection Team
No updates yet
-
Some clarification: after those first experiments, we can later extrapolate to LLaMA-30B.
-
High Level OA Retrieval System
Goal
Options available
Use a professional vector DB in which we index documents based on embeddings, for example all of Wikipedia
Segment the data into chunks (sentences/paragraphs)
Generate embeddings for each chunk
Store the embeddings for retrieval (FAISS, etc.)
When presented with a query, retrieve related chunks from the DB using some metric, for example cosine similarity
Prompt the LLM using the query + retrieved chunks to generate the answer
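The steps above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the plain Python list stands in for a vector DB such as FAISS, and all function names and the prompt template are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text):
    """Segment text into sentence-level chunks."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(text):
    """Toy embedding: sparse term counts (stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Chunk every document and store (chunk, embedding) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, query, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Combine the query and retrieved chunks into one LLM prompt."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

A call such as `build_prompt(q, retrieve(build_index(docs), q))` produces the text that would be sent to the LLM. Swapping `embed` for a dense encoder and the list for an approximate nearest-neighbor index changes the quality and scale, not the overall flow.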
https://paperswithcode.com/dataset/beir
LangChain being considered?
LlamaIndex?
VectorDB(s) under consideration
Benchmarks: http://ann-benchmarks.com/ ?
Drawbacks:
Design or Workflow
Overall, there are some similarities between retrieval and OA plugins (i.e. in the simplest case, retrieval could be a plugin). The retrieval system will be somewhat more closely integrated with the inference system, so that the assistant's knowledge is easy to update.
Need to come to a consensus on the workflow
Other design thoughts
There are 2 schools of thought for this system
Vs
The use case for retrieval-based models is mostly in knowledge-seeking mode
Open questions
Timeline for First Version
TBD