Sourced from Azure.Search.Documents's releases.
Azure.Search.Documents_11.6.0
11.6.0 (2024-07-17)
Features Added
- Added support for `2024-07-01` service version.
- `SemanticSearchOptions` now supports `SemanticQuery`, which allows for specifying a semantic query that is only used for semantic reranking.
- `VectorQuery` now supports `Oversampling` and `Weight`, which allows for specifying richer configurations on how vector queries affect search results.
- Added support for `VectorizableTextQuery`, which allows for passing a text-based query that is vectorized service-side by `VectorSearchVectorizer`s configured on the index so that vectorization doesn't need to happen before querying.
- Added support for "bring your own endpoint" with `VectorSearchVectorizer`, with implementations `AzureOpenAIVectorizer` and `WebApiVectorizer`. This enables the service to use a user-provided configuration for vectorizing text, rather than requiring all client-side calls to vectorize before querying, allowing for easier standardization of vectorization.
- Added support for compression with `VectorSearchCompression`, with implementations `BinaryQuantizationCompression` and `ScalarQuantizationCompression`. This allows for reducing the size of vectors in the index, which can reduce storage costs and improve querying performance.
- Added support for `VectorEncodingFormat`, which allows for specifying the encoding format of the vector data.
- Added support for `AzureOpenAIEmbeddingSkill`, which is a skill that uses the Azure OpenAI service to create text embeddings during indexing.
- Added support for index projections with `SearchIndexerIndexProjection`, which allows for specifying how indexed documents are projected in the index (or indexes).
- Added support for "narrow" types in `SearchFieldDataType`. This allows for specifying smaller types for vector fields to reduce storage costs and improve querying performance.
- Added support for `SearchIndexerDataIdentity`, which allows for specifying the identity for the data source for the indexer.
- `SearchField` and `SearchableField` now support `IsStored` and `VectorEncodingFormat` configurations. `IsStored` allows for specifying behaviors on how the index will retain vector data (enabling the ability to reduce storage costs), and `VectorEncodingFormat` allows for specifying the encoding format of the vector data.
- `OcrSkill` now supports `LineEnding`, which allows for specifying the line ending character used by the OCR skill.
- `SplitSkill` now supports `MaximumPagesToTake` and `PageOverlapLength`, which allows for specifying how the split skill behaves when splitting documents into pages.
- `SearchServiceLimits` now supports `MaxStoragePerIndexInBytes`, which shows the maximum storage allowed per index.
Breaking Changes
- All service concepts that have been in preview but not included in the `2024-07-01` GA have been removed. This includes concepts such as index aliases, normalizers, Azure Machine Learning skills, hybrid search, and more.
- 65da6f2 [Search] GA Features for API Version 2024-07-01 (#44485)
- 8571d3c Increment package version after release of Azure.Messaging.EventGrid (#45056)
- 0ac2d02 Update Changelog and readme files to include --prerelease (#45047)
- b0ca6d6 Increment version for storage releases (#45054)
- 374ca1c [Storage] [Webjobs Extension] Updated changelog to prepare for WebJobs Storag...
- 8f514a8 Increment version for storage releases (#45045)
- 4e073bb Increment package version after release of Azure.Monitor.OpenTelemetry.AspNet...
- 3c581b3 Changed DataMovement Blobs and File Shares to use package dependency, Added F...
- 2a9cffa [AzureMonitorDistro] prep distro 1.3.0-beta.1 (#44991)
- 3dcd049 Updated Changelog for DataMovement July Release (#45040)
Sourced from pyarrow's releases.
Apache Arrow 17.0.0
Release Notes URL: https://arrow.apache.org/release/17.0.0.html
Apache Arrow 17.0.0 RC2
Release Notes: Release Candidate: 17.0.0 RC2
Apache Arrow 17.0.0 RC1
Release Notes: Release Candidate: 17.0.0 RC1
Apache Arrow 17.0.0 RC0
Release Notes: Release Candidate: 17.0.0 RC0
- 6a2e19a MINOR: [Release] Update versions for 17.0.0
- 1a2fff4 MINOR: [Release] Update .deb/.rpm changelogs for 17.0.0
- 9d4cccc MINOR: [Release] Update CHANGELOG.md for 17.0.0
- bf75923 GH-43204: [CI][Packaging] Apply vcpkg patch to fix Thrift version (#43208)
- e85767a GH-41541: [Go][Parquet] More fixes for writer performance regression (#42003)
- 5c69895 GH-43199: [CI][Packaging] dev/release/utils-create-release-tarball.sh should ...
- 12be569 GH-42149: [C++] Use FetchContent for bundled ORC (#43011)
- 58d5142 GH-43158: [Packaging] Use bundled nlohmann/json on AlmaLinux 8/CentOS Stream ...
- 56a9862 GH-41910: [Python] Add support for Pyodide (#37822)
- 14e4684 GH-43116: [C++][Compute] Mark KeyCompare.CompareColumnsToRowsLarge as large m...
Sourced from pydantic's releases.
v2.8.2 (2024-07-03)
What's Changed
Fixes
- Fix issue with assertion caused by pluggable schema validator by @dmontagu in #9838
Full Changelog: https://github.com/pydantic/pydantic/compare/v2.8.1...v2.8.2
v2.8.1 (2024-07-03)
What's Changed
Packaging
- Bump `ruff` to `v0.5.0` and `pyright` to `v1.1.369` by @sydney-runkle in #9801
- Bump `pydantic-core` to `v2.20.1`, `pydantic-extra-types` to `v2.9.0` by @sydney-runkle in #9832
Fixes
- Fix breaking change in `to_snake` from v2.7 -> v2.8 by @sydney-runkle in #9812
- Fix list constraint json schema application by @sydney-runkle in #9818
- Support time duration more than 23 by @nix010 in pydantic/speedate#64
- Fix millisecond fraction being handled with the wrong scale by @davidhewitt in pydantic/speedate#65
- Handle negative fractional durations correctly by @sydney-runkle in pydantic/speedate#71
New Contributors
- @kwint made their first contribution in pydantic/pydantic#9787
- @seekinginfiniteloop made their first contribution in pydantic/pydantic#9822
Full Changelog: https://github.com/pydantic/pydantic/compare/v2.8.0...v2.8.1
- 4978ee2 update history
- 0345929 v bump
- d390a04 Fix issue with assertion caused by pluggable schema validator (#9838)
- 040865f update history
- 5a33e3b bump version
- 2f9abb2 Bump `pydantic-core` to `v2.20.1`, `pydantic-extra-types` to `v2.9.0` (#9832)
- ce9c5f7 Remove spooky meetings file (#9824)
- 6bdd6d1 Pedantic typo correction within explanation of Pydantic's root in 'pedantic' ...
- 701ccde Fix list constraint json schema application (#9818)
- 2a066a2 Bump `ruff` to `v0.5.0` and `pyright` to `v1.1.369` (#9801)
Sourced from DocumentFormat.OpenXml's releases.
[3.1.0]
Added
- Added `DocumentFormat.OpenXml.Office.SpreadSheetML.Y2024.PivotAutoRefresh` namespace
- Added `DocumentFormat.OpenXml.Office.SpreadSheetML.Y2024.PivotDynamicArrays` namespace
- Added `DocumentFormat.OpenXml.Office.SpreadSheetML.Y2023.DataSourceVersioning` namespace
- Added `DocumentFormat.OpenXml.Office.SpreadSheetML.Y2023.ExternalCodeService` namespace
- Added `DocumentFormat.OpenXml.Office.SpreadSheetML.Y2023.MsForms` namespace
- Added `DocumentFormat.OpenXml.Office.SpreadSheetML.Y2023.Pivot2023Calculation` namespace
Fixed
Sourced from DocumentFormat.OpenXml's changelog.
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- 522f9f3 Unload root element if Part.GetStream updates the underlying value (#1760)
- f1fecd3 Office Update for May 2024 (#1759)
- cd814ec Change commentextensionlist ext element to 2006/main namespace (#1754)
- 27ba922 Make LINQ classes partial (#1756)
- 2c3d5d2 Bump gittools/actions from 1 to 2 (#1753)
- d9ea8cd Bump danielpalme/ReportGenerator-GitHub-Action from 5.3.0 to 5.3.6 (#1734)
- 062ea5e Bump xunit.runner.visualstudio from 2.8.0 to 2.8.1 (#1733)
- b78695b Bump Microsoft.NET.Test.Sdk from 17.9.0 to 17.10.0 (#1727)
- 8e28b6d Bump NuGet.Resolver, NuGet.Protocol, NuGet.Packaging and NuGet.Common (#1725)
- b1b6682 add Feb24 Schema updates without specialized <Feature>Extension and <Feature>...
Sourced from Microsoft.OpenApi.Readers's releases.
1.6.16
Changes:
- #1761: Merges `vnext` into `master`
- #1760: Remove packages potentially causing build to fail in AzDo pipeline
- #1758: Bump Verify.Xunit from 26.1.1 to 26.1.2
- #1755: Update code reviewers
- #1754: Add missing nuget lib refs
- #1751: Releases Hidi
- #1753: Bump Verify.Xunit from 26.0.1 to 26.1.1
- #1750: Bumps up conversion lib version
- #1747: Release libs
- #1746: Bump up lib versions
- #1744: Fix Initialization of StreamReader in OpenApiStreamReader throws an ArgumentNullException
- #1745: Bump Verify.Xunit from 26.0.0 to 26.0.1
- #1743: Bump Verify.Xunit from 25.3.2 to 26.0.0
- #1741: Bump docker/build-push-action from 6.4.0 to 6.5.0
- #1742: Bump docker/login-action from 3.2.0 to 3.3.0
- #1717: Make OpenAPI.NET library trim-compatible
- #1736: Fix copy constructor for arrays and objects
- #1737: add verify settings to editorconfig
- #1739: Remove redundant section from README
- #1734: Fix OpenApiFilterService.CreateFilteredDocument failing when the Components field is missing.
- #1731: Bump Verify.Xunit from 25.3.1 to 25.3.2
- #1728: Bump docker/build-push-action from 6.3.0 to 6.4.0
- #1718: make verified files as lf and utf8
- #1725: Bump Verify.Xunit from 25.3.0 to 25.3.1
- #1724: Bump Microsoft.Windows.Compatibility from 8.0.6 to 8.0.7
- #1722: Bump xunit from 2.8.1 to 2.9.0
- #1721: Bump docker/build-push-action from 6.2.0 to 6.3.0
- #1723: Bump xunit.runner.visualstudio from 2.8.1 to 2.8.2
- #1720: Bump dependabot/fetch-metadata from 2.1.0 to 2.2.0
- #1716: Bump Verify.Xunit from 25.2.0 to 25.3.0
- #1714: Bump Verify.Xunit from 25.0.4 to 25.2.0
- #1713: Bump docker/build-push-action from 6.1.0 to 6.2.0
- #1703: Releases Hidi
- #1708: Resolve merge in #1703
- #1690: Release Hidi and libs
- #1704: Resolve merge conflicts
- #1685: Bump Verify.Xunit from 24.2.0 to 25.0.1
- #1683: Bump docker/login-action from 3.1.0 to 3.2.0
- #1689: Updates conversion lib. version
- #1702: Bumps up conversion library version
- #1698: Fixes null extension bug
- #1697: Fix invalid reference bug
- #1701: Bump Verify.Xunit from 25.0.3 to 25.0.4
... (truncated)
- 900d04f Merge pull request #1761 from microsoft/vnext
- 08c949c Merge pull request #1760 from microsoft/is/remove-package
- 0feb41e Merge remote-tracking branch 'origin/vnext' into is/remove-package
- cecd9e7 Remove packages potentially causing build to fail in AzDo pipeline
- 048abb2 Merge pull request #1758 from microsoft/dependabot/nuget/Verify.Xunit-26.1.2
- ad44ab8 Bump Verify.Xunit from 26.1.1 to 26.1.2
- afbc27d Merge pull request #1755 from microsoft/is/review-code-reviewers
- a7da3a3 Merge pull request #1754 from microsoft/is/update-missing-pkg
- ecec3dc Replace Carlos with Gavin as code reviewer
- 293e6c2 Remove added nuget package
Sourced from Microsoft.OpenApi's releases.
1.6.16
Changes:
- #1761: Merges `vnext` into `master`
- #1760: Remove packages potentially causing build to fail in AzDo pipeline
- #1758: Bump Verify.Xunit from 26.1.1 to 26.1.2
- #1755: Update code reviewers
- #1754: Add missing nuget lib refs
- #1751: Releases Hidi
- #1753: Bump Verify.Xunit from 26.0.1 to 26.1.1
- #1750: Bumps up conversion lib version
- #1747: Release libs
- #1746: Bump up lib versions
- #1744: Fix Initialization of StreamReader in OpenApiStreamReader throws an ArgumentNullException
- #1745: Bump Verify.Xunit from 26.0.0 to 26.0.1
- #1743: Bump Verify.Xunit from 25.3.2 to 26.0.0
- #1741: Bump docker/build-push-action from 6.4.0 to 6.5.0
- #1742: Bump docker/login-action from 3.2.0 to 3.3.0
- #1717: Make OpenAPI.NET library trim-compatible
- #1736: Fix copy constructor for arrays and objects
- #1737: add verify settings to editorconfig
- #1739: Remove redundant section from README
- #1734: Fix OpenApiFilterService.CreateFilteredDocument failing when the Components field is missing.
- #1731: Bump Verify.Xunit from 25.3.1 to 25.3.2
- #1728: Bump docker/build-push-action from 6.3.0 to 6.4.0
- #1718: make verified files as lf and utf8
- #1725: Bump Verify.Xunit from 25.3.0 to 25.3.1
- #1724: Bump Microsoft.Windows.Compatibility from 8.0.6 to 8.0.7
- #1722: Bump xunit from 2.8.1 to 2.9.0
- #1721: Bump docker/build-push-action from 6.2.0 to 6.3.0
- #1723: Bump xunit.runner.visualstudio from 2.8.1 to 2.8.2
- #1720: Bump dependabot/fetch-metadata from 2.1.0 to 2.2.0
- #1716: Bump Verify.Xunit from 25.2.0 to 25.3.0
- #1714: Bump Verify.Xunit from 25.0.4 to 25.2.0
- #1713: Bump docker/build-push-action from 6.1.0 to 6.2.0
- #1703: Releases Hidi
- #1708: Resolve merge in #1703
- #1690: Release Hidi and libs
- #1704: Resolve merge conflicts
- #1685: Bump Verify.Xunit from 24.2.0 to 25.0.1
- #1683: Bump docker/login-action from 3.1.0 to 3.2.0
- #1689: Updates conversion lib. version
- #1702: Bumps up conversion library version
- #1698: Fixes null extension bug
- #1697: Fix invalid reference bug
- #1701: Bump Verify.Xunit from 25.0.3 to 25.0.4
... (truncated)
- 900d04f Merge pull request #1761 from microsoft/vnext
- 08c949c Merge pull request #1760 from microsoft/is/remove-package
- 0feb41e Merge remote-tracking branch 'origin/vnext' into is/remove-package
- cecd9e7 Remove packages potentially causing build to fail in AzDo pipeline
- 048abb2 Merge pull request #1758 from microsoft/dependabot/nuget/Verify.Xunit-26.1.2
- ad44ab8 Bump Verify.Xunit from 26.1.1 to 26.1.2
- afbc27d Merge pull request #1755 from microsoft/is/review-code-reviewers
- a7da3a3 Merge pull request #1754 from microsoft/is/update-missing-pkg
- ecec3dc Replace Carlos with Gavin as code reviewer
- 293e6c2 Remove added nuget package
[In Memory: FLAT / IVF_FLAT / IVF_SQ8 / IVF_PQ / HNSW / SCANN](https://milvus.io/docs/index.md)
[On Disk: DiskANN](https://milvus.io/docs/disk_index.md)
[GPU: GPU_CAGRA / GPU_IVF_FLAT / GPU_IVF_PQ / GPU_BRUTE_FORCE](https://milvus.io/docs/gpu_index.md)
|
+
+Footnotes:
+- HNSW = Hierarchical Navigable Small World (HNSW performs an [approximate nearest neighbor (ANN)](https://learn.microsoft.com/en-us/azure/search/vector-search-overview#approximate-nearest-neighbors) search)
+- KNN = k-nearest neighbors (performs a brute-force search that scans the entire vector space)
+- IVFFlat = Inverted File with Flat Compression (This index type uses approximate nearest neighbor search (ANNS) to provide fast searches)
+- Weaviate Dynamic = Starts as flat and switches to HNSW if the number of objects exceed a limit
+- PGA = [Pinecone Graph Algorithm](https://www.pinecone.io/blog/hnsw-not-enough/)
+
+### Vector Store Cross Store support - Search and filtering
+
+|Feature|Azure AI Search|Weaviate|Redis|Chroma|FAISS|Pinecone|LLamaIndex|PostgreSql|Qdrant|Milvus|
+|-|-|-|-|-|-|-|-|-|-|-|
+|Index allows text search|Y|Y|Y|Y (On Metadata by default)||[Only in combination with Vector](https://docs.pinecone.io/guides/data/understanding-hybrid-search)||Y (with TSVECTOR field)|Y|Y|
+|Text search query format|[Simple or Full Lucene](https://learn.microsoft.com/en-us/azure/search/search-query-create?tabs=portal-text-query#choose-a-query-type-simple--full)|[wildcard](https://weaviate.io/developers/weaviate/search/filters#filter-text-on-partial-matches)|wildcard & fuzzy|[contains & not contains](https://docs.trychroma.com/guides#filtering-by-document-contents)||Text only||[wildcard & binary operators](https://www.postgresql.org/docs/16/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES)|[Text only](https://qdrant.tech/documentation/concepts/filtering/#full-text-match)|[wildcard](https://milvus.io/docs/single-vector-search.md#Filtered-search)|
+|Multi Field Vector Search Support|Y|[N](https://weaviate.io/developers/weaviate/search/similarity)||N (no multi vector support)||N||[Unclear due to order by syntax](https://github.com/pgvector/pgvector?tab=readme-ov-file#querying)|[N](https://qdrant.tech/documentation/concepts/search/)|[Y](https://milvus.io/api-reference/restful/v2.4.x/v2/Vector%20(v2)/Hybrid%20Search.md)|
+|Targeted Multi Field Text Search Support|Y|[Y](https://weaviate.io/developers/weaviate/search/hybrid#set-weights-on-property-values)|[Y](https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/query_syntax/#field-modifiers)|N (only on document)||N||Y|Y|Y|
+|Vector per Vector Field for Search|Y|N/A||N/A|||N/A||N/A|N/A|[Y](https://milvus.io/docs/multi-vector-search.md#Step-1-Create-Multiple-AnnSearchRequest-Instances)|
+|Separate text search query from vectors|Y|[Y](https://weaviate.io/developers/weaviate/search/hybrid#specify-a-search-vector)|Y|Y||Y||Y|Y|[Y](https://milvus.io/api-reference/restful/v2.4.x/v2/Vector%20(v2)/Hybrid%20Search.md)|
+|Allows filtering|Y|Y|Y (on TAG)|Y (On Metadata by default)||[Y](https://docs.pinecone.io/guides/indexes/configure-pod-based-indexes#selective-metadata-indexing)||Y|Y|Y|
+|Allows filter grouping|Y (Odata)|[Y](https://weaviate.io/developers/weaviate/search/filters#nested-filters)||[Y](https://docs.trychroma.com/guides#using-logical-operators)||Y||Y|[Y](https://qdrant.tech/documentation/concepts/filtering/#clauses-combination)|[Y](https://milvus.io/docs/get-and-scalar-query.md#Use-Basic-Operators)|
+|Allows scalar index field setup|Y|Y|Y|N||Y||Y|Y|Y|
+|Requires scalar index field setup to filter|Y|Y|Y|N||N (on by default for all)||N|N|N (can filter without index)|
+
+### Support for different mappers
+
+Mapping between data models and the storage models can also require custom logic depending on the type of data model and storage model involved.
+
+I'm therefore proposing that we allow mappers to be injectable for each `VectorStoreCollection` instance. The interfaces for these would vary depending
+on the storage models used by each vector store and any unique capabilities that each vector store may have, e.g. qdrant can operate in `single` or
+`multiple named vector` modes, which means the mapper needs to know whether to set a single vector or fill a vector map.
+
+In addition to this, we should build first party mappers for each of the vector stores, which will cater for built in, generic models or use metadata to perform the mapping.
+
+### Support for different storage schemas
+
+The different stores vary in many ways around how data is organized.
+- Some just store a record with fields on it, where fields can be a key or a data field or a vector and their type is determined at collection creation time.
+- Others separate fields by type when interacting with the api, e.g. you have to specify a key explicitly, put metadata into a metadata dictionary and put vectors into a vector array.
+
+I'm proposing that we allow two ways in which to provide the information required to map data between the consumer data model and storage data model.
+First is a set of configuration objects that capture the types of each field. Second would be a set of attributes that can be used to decorate the model itself
+and can be converted to the configuration objects, allowing a single execution path.
+Additional configuration properties can easily be added for each type of field as required, e.g. IsFilterable or IsFullTextSearchable, allowing us to also create an index from the provided configuration.
+
+I'm also proposing that even though similar attributes already exist in other systems, e.g. System.ComponentModel.DataAnnotations.KeyAttribute, we create our own.
+We will likely require additional properties on all these attributes that are not currently supported on the existing attributes, e.g. whether a field is or
+should be filterable. Requiring users to switch to new attributes later will be disruptive.
+
+Here is what the attributes would look like, plus a sample use case.
+
+```cs
+sealed class VectorStoreRecordKeyAttribute : Attribute
+{
+}
+sealed class VectorStoreRecordDataAttribute : Attribute
+{
+    public bool HasEmbedding { get; set; }
+    public string EmbeddingPropertyName { get; set; }
+}
+sealed class VectorStoreRecordVectorAttribute : Attribute
+{
+}
+
+public record HotelInfo(
+    [property: VectorStoreRecordKey, JsonPropertyName("hotel-id")] string HotelId,
+    [property: VectorStoreRecordData, JsonPropertyName("hotel-name")] string HotelName,
+    [property: VectorStoreRecordData(HasEmbedding = true, EmbeddingPropertyName = "DescriptionEmbeddings"), JsonPropertyName("description")] string Description,
+    [property: VectorStoreRecordVector, JsonPropertyName("description-embeddings")] ReadOnlyMemory
+/// {
+/// "Term": "API",
+/// "Definition": "Application Programming Interface. A set of rules and specifications that allow software components to communicate and exchange data.",
+/// "DefinitionEmbedding": [ ... ]
+/// }
+///
+/// However, the data model is a class with a property for key and two dictionaries for the data (Term and Definition) and vector (DefinitionEmbedding).
+///
+/// The example shows the following steps:
+/// 1. Create an embedding generator.
+/// 2. Create a Redis Vector Store using a custom factory for creating collections.
+/// When constructing a collection, the factory injects a custom mapper that maps between the data model and the storage model if required.
+/// 3. Ingest some data into the vector store.
+/// 4. Read the data back from the vector store.
+///
+/// You need a local instance of Docker running, since the associated fixture will try and start a Redis container in the local docker instance to run against.
+///
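
The mapping step described in the comment above (a data model with a key and two dictionaries, stored as a flat document with `Term`, `Definition` and `DefinitionEmbedding`) can be illustrated with a minimal sketch. This is not the sample's actual C# mapper: it is written in Python for brevity, and `GenericRecord`, `DATA_FIELDS`, `VECTOR_FIELDS`, `to_storage` and `from_storage` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

# Hypothetical data model: one key plus separate dictionaries
# for the data fields and the vector fields.
@dataclass
class GenericRecord:
    key: str
    data: Dict[str, Any] = field(default_factory=dict)
    vectors: Dict[str, List[float]] = field(default_factory=dict)

# Hypothetical schema description: which storage fields hold data
# and which hold vectors (names taken from the JSON shape above).
DATA_FIELDS = ("Term", "Definition")
VECTOR_FIELDS = ("DefinitionEmbedding",)

def to_storage(record: GenericRecord) -> Tuple[str, Dict[str, Any]]:
    """Flatten the data model into a (key, flat document) pair."""
    doc: Dict[str, Any] = {
        name: record.data[name] for name in DATA_FIELDS if name in record.data
    }
    for name in VECTOR_FIELDS:
        if name in record.vectors:
            doc[name] = list(record.vectors[name])
    return record.key, doc

def from_storage(key: str, doc: Dict[str, Any]) -> GenericRecord:
    """Rebuild the data model from a key and a flat storage document."""
    return GenericRecord(
        key=key,
        data={name: doc[name] for name in DATA_FIELDS if name in doc},
        vectors={name: list(doc[name]) for name in VECTOR_FIELDS if name in doc},
    )
```

Keeping the field lists outside the mapper functions mirrors the idea of driving the mapping from configuration, so the same functions could serve other collections with a different set of fields.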