You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This constraint causes issues when multiple Site records with the same name attempt to create a lod_link pointing to the same Wikidata entity. Since the unique index does not account for the linkable_id and linkable_type columns in the polymorphic association, the following problems arise:
Unique Constraint Violation:
When two or more Site records try to link to the same source and external_id, a PG::UniqueViolation error occurs: ActiveRecord::RecordNotUnique (PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "index_lod_links_on_source_and_external_id")
Data Integrity Issues:
The unique constraint does not align with the polymorphic nature of lod_links, leading to limitations in associating multiple Site records with the same Wikidata entity.
Current Workaround
To address this issue temporarily in production, a monkey patch was applied. The patch prevents duplicate creation by using find_or_initialize_by and skipping record creation if the lod_link already exists. This is not a long-term solution and only mitigates the problem in runtime code.
Proposed Solution
A permanent fix should be implemented in the master branch to properly handle the polymorphic nature of lod_links:
Update the Unique Index:
Modify the unique index to include linkable_type and linkable_id, ensuring that source and external_id are unique within the scope of a specific linkable record. Migration example:
class UpdateIndexOnLodLinksToPolymorphic < ActiveRecord::Migration[7.0]
def change
remove_index :lod_links, name: "index_lod_links_on_source_and_external_id"
add_index :lod_links, [:linkable_type, :linkable_id, :source, :external_id],
unique: true, name: "index_lod_links_on_polymorphic_source_and_external_id"
end
end
Adjust Model Validations:
Ensure that the model reflects this uniqueness at the application level:
Refactor Batch Processing Logic:
Update the wikidata_match_candidates_batch method to handle these polymorphic constraints gracefully and avoid creating unnecessary duplicates:
def self.wikidata_match_candidates_batch(sites)
ActiveRecord::Base.transaction do
wikidata_results.each do |site_name, matches|
sites_for_name = sites.select { |s| s.name == site_name }
matches.each do |match|
lod_link = LodLink.find_or_create_by!(
source: "Wikidata",
external_id: match.qid,
linkable_type: "Site",
linkable_id: site.id
) do |link|
link.data = { label: match.label, description: match.description }
end
sites_for_name.each do |site|
site.lod_links << lod_link unless site.lod_links.include?(lod_link)
end
end
end
end
end
Action Items
Update the lod_links schema to include a polymorphic index.
Refactor the wikidata_match_candidates_batch method to align with the updated schema.
Remove the monkey-patch from production once the changes are deployed to master.
Priority
High: The current issue causes runtime errors in production and relies on a temporary workaround. Proper resolution is needed to ensure database integrity and prevent duplicate errors.
The text was updated successfully, but these errors were encountered:
Problem Description
The current implementation of the lod_links table enforces a unique constraint on the combination of source and external_id:
t.index ["source", "external_id"], name: "index_lod_links_on_source_and_external_id", unique: true
This constraint causes issues when multiple Site records with the same name attempt to create a lod_link pointing to the same Wikidata entity. Since the unique index does not account for the linkable_id and linkable_type columns in the polymorphic association, the following problems arise:
ActiveRecord::RecordNotUnique (PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "index_lod_links_on_source_and_external_id")
Current Workaround
To address this issue temporarily in production, a monkey patch was applied. The patch prevents duplicate creation by using find_or_initialize_by and skipping record creation if the lod_link already exists. This is not a long-term solution and only mitigates the problem in runtime code.
Proposed Solution
A permanent fix should be implemented in the master branch to properly handle the polymorphic nature of lod_links:
Modify the unique index to include linkable_type and linkable_id, ensuring that source and external_id are unique within the scope of a specific linkable record. Migration example:
Adjust Model Validations:
Ensure that the model reflects this uniqueness at the application level:
validates :source, uniqueness: { scope: [:external_id, :linkable_type, :linkable_id] }
Update the wikidata_match_candidates_batch method to handle these polymorphic constraints gracefully and avoid creating unnecessary duplicates:
Action Items
Priority
High: The current issue causes runtime errors in production and relies on a temporary workaround. Proper resolution is needed to ensure database integrity and prevent duplicate errors.
The text was updated successfully, but these errors were encountered: