Please, how can I upsert a dictionary of records with Azure AI Search Collection memory datamodel. Thank you! #9878
-
Hello everyone, following the subject above, here is my data model:

@vectorstoremodel
@dataclass
class TaskMonitorChatVectorModelClass:
    chat_id: Annotated[str, VectorStoreRecordKeyField()] = field(default_factory=lambda: str(uuid4()))
    chat_name: Annotated[str | None, VectorStoreRecordDataField(property_type="str", is_full_text_searchable=True, is_filterable=True)] = "TaskMonitorChat"
    chat_content_type: Annotated[str | None, VectorStoreRecordDataField(property_type="str", is_filterable=True)] = "text"
    messages_vector: Annotated[
        np.ndarray | None,
        VectorStoreRecordVectorField(
            embedding_settings={"embedding": OpenAIEmbeddingPromptExecutionSettings(dimensions=1536)},
            index_kind=IndexKind.HNSW,
            dimensions=1536,
            property_type="float",
            serialize_function=np.ndarray.tolist,
            deserialize_function=np.array,
        ),
    ] = None
    messages: Annotated[
        List[Dict[str, Optional[str | Dict[str, str]]]],
        VectorStoreRecordDataField(
            has_embedding=True,
            embedding_property_name="messages_vector",
            property_type="str",
            is_full_text_searchable=True,
            is_filterable=True,
        ),
    ] = field(default_factory=lambda: [
        {
            "text": "Hi TaskMonitorAI, I'm having trouble with a script that's supposed to schedule tasks, but it's failing to execute as expected.",
            "role": "user",
            "metadata": {
                "text_datetime": "2024-11-29T20:46:37.897772+00:00"
            }
        },
        {
            "text": "Hello! I can help with that. Can you provide the error message you're receiving?",
            "role": "assistant",
            "metadata": {
                "text_datetime": "2024-11-29T20:46:37.897772+00:00"
            }
        }
    ]
    )

DataModel = TaskMonitorChatVectorModelClass
Thank you in advance!
Replies: 2 comments 1 reply
-
@eavanvalkenburg could you please take a look at this one?
-
@selfishark the built-in vector store utils do not know how to handle this scenario, so I would expect some errors, particularly because it is unclear what should be embedded: should there be a vector for each message's content, or one vector for all of them? Whether this works the way you seem to want also depends on the data model. If you want a single vector for all messages, then this model works. If you want a vector per message, then you would have to split this into separate records and upsert all of them. Depending on how you want to use them this might make sense, but that is up to you to decide! You can take inspiration from the vector store utils class on how to make it work with sub-paths. I have not looked at that, but if you see something in there which makes sense to do generally, feel free to make a PR or let us know!
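To make the two options above concrete, here is a minimal sketch in plain Python, with no Semantic Kernel calls. `MessageRecord`, `split_chat_into_records`, and `join_chat_into_text` are hypothetical names made up for illustration and are not part of any library. The sketch only shows how the nested `messages` list could be reshaped: either into one flat record per message (for a vector per message), or into a single string (for one vector covering the whole chat). Generating the embeddings and upserting the resulting records into the collection is left out on purpose.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from uuid import uuid4


@dataclass
class MessageRecord:
    """Hypothetical flat record holding exactly one chat message."""

    chat_id: str
    message_index: int
    role: str
    text: str
    text_datetime: Optional[str] = None
    # Each message gets its own key so it can be upserted independently.
    record_id: str = field(default_factory=lambda: str(uuid4()))
    # Left empty here; an embedding step would fill it in before upserting.
    text_vector: Optional[List[float]] = None


def split_chat_into_records(chat_id: str, messages: List[Dict[str, Any]]) -> List[MessageRecord]:
    """Option 1: one record (and later one vector) per message."""
    records = []
    for index, message in enumerate(messages):
        metadata = message.get("metadata") or {}
        records.append(
            MessageRecord(
                chat_id=chat_id,
                message_index=index,
                role=message.get("role") or "",
                text=message.get("text") or "",
                text_datetime=metadata.get("text_datetime"),
            )
        )
    return records


def join_chat_into_text(messages: List[Dict[str, Any]]) -> str:
    """Option 2: one string for the whole chat, to be embedded as a single vector."""
    return "\n".join(f"{m.get('role')}: {m.get('text')}" for m in messages)


if __name__ == "__main__":
    chat = [
        {"text": "Hi TaskMonitorAI", "role": "user", "metadata": {"text_datetime": "2024-11-29T20:46:37.897772+00:00"}},
        {"text": "Hello! I can help with that.", "role": "assistant", "metadata": {"text_datetime": "2024-11-29T20:46:37.897772+00:00"}},
    ]
    for record in split_chat_into_records("chat-1", chat):
        print(record.message_index, record.role, record.text)
    print(join_chat_into_text(chat))
```

With option 1, every message keeps a `chat_id` so the conversation can still be filtered back together after a search. With option 2, the joined string would be stored in a single string field next to `messages_vector`, which keeps the original one-record-per-chat shape.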