Harvester/main - Feature 1&2 DRAFT #1070

Draft
wants to merge 43 commits into
base: master
bf676b8
Initial Data Dump model (datacite/1862)
digitaldogsbody Jul 11, 2023
87f9958
Data dump DB migration (datacite/1862)
digitaldogsbody Jul 11, 2023
53c45ca
Data dump model initial test suite (datacite/1862)
digitaldogsbody Jul 11, 2023
20d5bde
Add Data dump model to the RSpec ElasticSearch helper (datacite/1862)
digitaldogsbody Jul 11, 2023
a3af8ac
Updated Schema after database migration (datacite/1862)
digitaldogsbody Jul 11, 2023
5a641af
Initial data dump controller (datacite/1863)
digitaldogsbody Jul 12, 2023
7490c0b
Data dump controller basic test suite (datacite/1863)
digitaldogsbody Jul 12, 2023
b927152
Initial data dump routes (datacite/1864)
digitaldogsbody Jul 12, 2023
ed04430
Data Dump index controller first pass (datacite/1866)
digitaldogsbody Jul 12, 2023
a367beb
Add a factory for the test suites (datacite/1868)
digitaldogsbody Jul 12, 2023
d94a486
Merge pull request #976 from datacite/harvester/1862
digitaldogsbody Jul 12, 2023
28aa571
Merge pull request #979 from datacite/harvester/1863
digitaldogsbody Jul 12, 2023
8ab398a
Merge pull request #980 from datacite/harvester/1866
digitaldogsbody Jul 12, 2023
93048d8
Add a factory for an incomplete data dump (datacite/1868)
digitaldogsbody Jul 13, 2023
d7d3f9d
Update test to create the data_dump from the factory (datacite/1868)
digitaldogsbody Jul 13, 2023
ac6a7d5
Update controller test to create an object and test presence (datacit…
digitaldogsbody Jul 13, 2023
a73009a
Update data dump factory to add missing attributes (datacite/1868)
digitaldogsbody Jul 13, 2023
efcc34d
Remove erroneous comma in validator (datacite/1862)
digitaldogsbody Jul 13, 2023
2899869
Add missing `query_aggregations` property required by Indexable conce…
digitaldogsbody Jul 13, 2023
cc0b91c
Correctly pass the `query` parameter to the ES query function (dataci…
digitaldogsbody Jul 13, 2023
8a0a9c6
Update factory to use Faker for more attributes so it can be used to …
digitaldogsbody Jul 13, 2023
da56e40
Update factory for incomplete objects (datacite/1868)
digitaldogsbody Jul 13, 2023
08c03cf
Initial Data Dump controller requests rspec suite (datacite/1868)
digitaldogsbody Jul 13, 2023
babb2ae
Add pagination tests to Data Dump controller suite (datacite/1868)
digitaldogsbody Jul 13, 2023
83fcfde
Fix bad requests in test suite (datacite/1868)
digitaldogsbody Jul 13, 2023
356eb6e
Fix validate inclusion model tests (datacite/1868)
digitaldogsbody Jul 13, 2023
6caf1d9
Merge pull request #981 from datacite/harvester/1868
digitaldogsbody Jul 13, 2023
538ea2e
Fix accidental conversion of database table schema to latin1
digitaldogsbody Jul 13, 2023
1c478bb
First pass data dump serializer (#1867)
digitaldogsbody Jul 13, 2023
6cd3e76
Merge pull request #983 from datacite/harvester/1867
digitaldogsbody Jul 13, 2023
c28a352
Fix missing brackets in link generation (#1889)
digitaldogsbody Jul 13, 2023
b5502da
Remove invalid parameters to spec requests (#1889)
digitaldogsbody Jul 13, 2023
b07a5fd
Fix links to return max page when the current page is outside of the …
digitaldogsbody Jul 13, 2023
7775fb6
Correct name of tested attribute to account for lowerCamel transforma…
digitaldogsbody Jul 13, 2023
3650925
Correct expected date format to account for serializer behaviour (#1889)
digitaldogsbody Jul 13, 2023
301ffd2
Merge pull request #984 from datacite/harvester/1889
digitaldogsbody Jul 13, 2023
18d5608
Update data dump controller spec to acquire and use a token (datacit…
digitaldogsbody Jul 19, 2023
a45e6b7
Update data dump request spec to test authorization and abilities (d…
digitaldogsbody Jul 19, 2023
7b6d94d
Add ability to permit reading of data dump files (datacite/1865)
digitaldogsbody Jul 19, 2023
cef0eed
Require ability to access data dump controller methods (datacite/1865)
digitaldogsbody Jul 19, 2023
754381c
Merge pull request #987 from datacite/harvester/1865
digitaldogsbody Jul 19, 2023
344f1ee
Add data dump feature 2
digitaldogsbody Aug 10, 2023
d17791f
Merge pull request #996 from datacite/harvester/feature-2
digitaldogsbody Aug 10, 2023
Add data dump feature 2
digitaldogsbody committed Aug 10, 2023

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
commit 344f1eee199d9d8636e41e30d8fb5cbd950b1dcd
17 changes: 15 additions & 2 deletions app/controllers/data_dumps_controller.rb
@@ -25,9 +25,9 @@ def index
     page = page_from_params(params)

     response = DataDump.query(
-      "",
       page: page,
-      sort: sort
+      sort: sort,
+      scope: params[:scope]
     )

     begin
@@ -105,4 +105,17 @@ def show
     end
     render json: DataDumpSerializer.new(data_dump).serialized_json, status: :ok
   end
+
+  def latest
+    authorize! :read, :read_data_dumps
+    data_dump = DataDump.where(scope: params[:scope], aasm_state: "complete").order(end_date: :desc).first
+    if data_dump.blank? ||
+        (
+          data_dump.aasm_state != "complete"
+          # TODO: Add conditional check for role here
+        )
+      fail ActiveRecord::RecordNotFound
+    end
+    render json: DataDumpSerializer.new(data_dump).serialized_json, status: :ok
+  end
 end
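Review note: the selection rule behind `latest` can be checked outside Rails. The sketch below is a minimal, framework-free approximation, assuming every record has an `end_date`; `DumpRecord`, `latest_complete_dump`, the sample uids and the `"generating"` state are hypothetical names used only for illustration. It also suggests that, because the `where` clause already restricts results to `aasm_state: "complete"`, the second state check inside the `if` looks like a placeholder for the role check mentioned in the TODO.

```ruby
# Hypothetical stand-in for a DataDump row; only the fields `latest` reads.
DumpRecord = Struct.new(:uid, :scope, :aasm_state, :end_date, keyword_init: true)

# Approximates where(scope:, aasm_state: "complete").order(end_date: :desc).first.
# Returns nil when nothing qualifies; the controller action turns that into a 404.
def latest_complete_dump(records, scope)
  records
    .select { |r| r.scope == scope && r.aasm_state == "complete" }
    .max_by(&:end_date)
end

# Sample data (made up for this sketch, not taken from the factories).
DUMPS = [
  DumpRecord.new(uid: "m-old", scope: "metadata", aasm_state: "complete", end_date: Time.utc(2023, 6, 30)),
  DumpRecord.new(uid: "m-new", scope: "metadata", aasm_state: "complete", end_date: Time.utc(2023, 12, 31)),
  DumpRecord.new(uid: "m-wip", scope: "metadata", aasm_state: "generating", end_date: Time.utc(2024, 1, 31)),
  DumpRecord.new(uid: "l-new", scope: "link", aasm_state: "complete", end_date: Time.utc(2023, 12, 31)),
].freeze
```

Calling `latest_complete_dump(DUMPS, "metadata")` skips the unfinished `m-wip` record even though its `end_date` is later, which is the behaviour the request spec for `/data_dumps/:scope/latest` appears to rely on.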
41 changes: 41 additions & 0 deletions app/models/data_dump.rb
@@ -78,4 +78,45 @@ class DataDump < ApplicationRecord
   def self.query_aggregations
     {}
   end
+
+  def self.query(options = {})
+
+    options[:page] ||= {}
+    options[:page][:number] ||= 1
+    options[:page][:size] ||= 25
+
+    from = ((options.dig(:page, :number) || 1) - 1) * (options.dig(:page, :size) || 25)
+    sort = options[:sort]
+
+    filter = []
+    if options[:scope].present?
+      filter << { term: { scope: options[:scope].downcase } }
+    end
+
+    es_query = { bool: { filter: filter } }
+
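+    if options.fetch(:page, {}).key?(:cursor)
+      __elasticsearch__.search(
+        {
+          size: options.dig(:page, :size),
+          search_after: options.dig(:page, :cursor), # committed line read `search_after: search_after` (undefined local); cursor source is an assumption
+          sort: sort,
+          query: es_query,
+          track_total_hits: true,
+        }.compact,
+      )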
+    else
+      __elasticsearch__.search(
+        {
+          size: options.dig(:page, :size),
+          from: from,
+          sort: sort,
+          query: es_query,
+          track_total_hits: true,
+        }.compact,
+      )
+    end
+
+  end
+
 end
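Review note: the request body that `DataDump.query` assembles can be exercised without Elasticsearch. The following is a minimal sketch, not the model code itself. `build_data_dump_search` is a hypothetical helper that returns the hash that would be passed to `__elasticsearch__.search`, and it assumes the page cursor carries the sort values of the last hit.

```ruby
# Hypothetical standalone helper mirroring the body construction in DataDump.query.
def build_data_dump_search(options = {})
  page = options[:page] || {}
  number = (page[:number] || 1).to_i
  size = (page[:size] || 25).to_i

  # Optional scope filter, lower-cased as in the model.
  filter = []
  scope = options[:scope].to_s
  filter << { term: { scope: scope.downcase } } unless scope.empty?

  body = {
    size: size,
    sort: options[:sort],
    query: { bool: { filter: filter } },
    track_total_hits: true,
  }

  if page.key?(:cursor)
    # Assumption: the cursor holds the sort values of the previous page's last hit.
    body[:search_after] = page[:cursor]
  else
    # Offset pagination: page 1 starts at 0, page 2 at `size`, and so on.
    body[:from] = (number - 1) * size
  end

  # Drop nil entries (for example a missing sort) before sending the request.
  body.compact
end
```

With `page: { number: 3, size: 10 }` the sketch produces `from: 20`. Once a `:cursor` key is present, `from` is left out and `search_after` takes its place, so the two pagination modes never appear in the same request.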
4 changes: 3 additions & 1 deletion config/routes.rb
@@ -230,7 +230,9 @@
   resources :repository_prefixes, path: "repository-prefixes"
   resources :resource_types, path: "resource-types", only: %i[show index]

-  resources :data_dumps, constraints: { id: /.+/ }, only: %i[show index]
+  get "/data_dumps/:scope/latest", to: "data_dumps#latest", constraints: { scope: /(metadata|link)/ }
+  get "/data_dumps/:scope", to: "data_dumps#index", constraints: { scope: /(metadata|link)/ }
+  resources :data_dumps, constraints: { id: /[A-Za-z0-9_-]+/ }, only: %i[show index]

   # custom routes for maintenance tasks
   post ":username", to: "datacite_dois#show", as: :user
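Review note: routes are matched in the order they are declared, so the two scope routes are tried before `resources :data_dumps`. The sketch below models that ordering in plain Ruby, assuming each constraint must match its whole path segment. `ROUTES` and `resolve` are hypothetical names, and the collection route `/data_dumps` is left out for brevity.

```ruby
# Hypothetical, framework-free model of the three member routes, in declaration order.
ROUTES = [
  [%r{\A/data_dumps/(?<scope>metadata|link)/latest\z}, "data_dumps#latest"],
  [%r{\A/data_dumps/(?<scope>metadata|link)\z}, "data_dumps#index"],
  [%r{\A/data_dumps/(?<id>[A-Za-z0-9_-]+)\z}, "data_dumps#show"],
].freeze

# Returns [action, params] for the first pattern that matches, or nil.
def resolve(path)
  ROUTES.each do |pattern, action|
    match = pattern.match(path)
    return [action, match.named_captures] if match
  end
  nil
end
```

Under these assumptions `/data_dumps/metadata` resolves to `data_dumps#index` with a `scope` parameter, while any other id made of letters, digits, `_` or `-` falls through to `data_dumps#show`. One consequence worth confirming is that a dump whose uid is literally `metadata` or `link` could not be reached through the show route.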
172 changes: 172 additions & 0 deletions spec/requests/data_dumps_spec.rb
@@ -190,4 +190,176 @@
     end
   end

+  describe "GET /data_dumps/:id" do
+    context "with valid authorization" do
+      context "when the record exists" do
+        it "returns the record" do
+          get "/data_dumps/#{data_dump.uid}", nil, headers
+
+          expect(last_response.status).to eq(200)
+          expect(json.dig("data", "attributes", "description")).to eq("Test Metadata Data Dump Factory creation")
+          expect(json.dig("data", "attributes", "startDate")).to eq(data_dump.start_date.rfc3339(3))
+        end
+      end
+
+      context "when the record does not exist" do
+        it "returns status code 404" do
+          get "/data_dumps/invalid_id", nil, headers
+
+          expect(last_response.status).to eq(404)
+          expect(json["errors"].first).to eq("status" => "404", "title" => "The resource you are looking for doesn't exist.")
+        end
+      end
+    end
+
+    context "without authorization" do
+      context "when the record exists" do
+        it "returns access denied" do
+          get "/data_dumps/#{data_dump.uid}"
+          expect(last_response.status).to eq(401)
+        end
+      end
+
+      context "when the record does not exist" do
+        it "returns access denied" do
+          get "/data_dumps/invalid_id"
+          expect(last_response.status).to eq(401)
+        end
+      end
+    end
+
+    context "with bad authorization" do
+      context "when the record exists" do
+        it "returns access denied" do
+          get "/data_dumps/#{data_dump.uid}", nil, bad_headers
+          expect(last_response.status).to eq(401)
+        end
+      end
+
+      context "when the record does not exist" do
+        it "returns access denied" do
+          get "/data_dumps/invalid_id", nil, bad_headers
+          expect(last_response.status).to eq(401)
+        end
+      end
+    end
+
+    context "with insufficient permission" do
+      context "when the record exists" do
+        it "returns access denied" do
+          get "/data_dumps/#{data_dump.uid}", nil, user_headers
+          expect(last_response.status).to eq(403)
+        end
+      end
+
+      context "when the record does not exist" do
+        it "returns access denied" do
+          get "/data_dumps/invalid_id", nil, user_headers
+          expect(last_response.status).to eq(403)
+        end
+      end
+    end
+  end
+
+  describe "GET /data_dumps/:scope", elasticsearch: true do
+    let!(:data_dumps) { create_list(:data_dump, 10) }
+    let!(:link_dumps) { create_list(:data_dump, 10, { scope: "link" }) }
+
+    before do
+      DataDump.import
+      sleep 1
+    end
+
+    context "with valid authorization" do
+      it "returns metadata data dumps" do
+        get "/data_dumps/metadata", nil, headers
+
+        expect(last_response.status).to eq(200)
+        expect(json["data"].size).to eq(10)
+        expect(json.dig("meta", "total")).to eq(10)
+      end
+
+      it "returns link data dumps" do
+        get "/data_dumps/link", nil, headers
+
+        expect(last_response.status).to eq(200)
+        expect(json["data"].size).to eq(10)
+        expect(json.dig("meta", "total")).to eq(10)
+      end
+    end
+
+    context "without authorization" do
+      it "returns access denied" do
+        get "/data_dumps/metadata"
+        expect(last_response.status).to eq(401)
+      end
+    end
+
+    context "with bad authorization" do
+      it "returns access denied" do
+        get "/data_dumps/metadata", nil, bad_headers
+        expect(last_response.status).to eq(401)
+      end
+    end
+
+    context "with insufficient permission" do
+      it "returns access denied" do
+        get "/data_dumps/metadata", nil, user_headers
+        expect(last_response.status).to eq(403)
+      end
+    end
+  end
+
+  describe "GET /data_dumps/:scope/latest", elasticsearch: true do
+    let!(:data_dumps) { create_list(:data_dump, 10) }
+    let!(:link_dumps) { create_list(:data_dump, 10, { scope: "link" }) }
+    let!(:latest_data) { create(:data_dump, uid: "latest_data", end_date: "2023-12-31") }
+    let!(:latest_link) { create(:data_dump, uid: "latest_link", scope: "link", end_date: "2023-12-31") }
+    before do
+      DataDump.import
+      sleep 1
+    end
+
+    context "with valid authorization" do
+      it "returns latest metadata data dump" do
+        get "/data_dumps/metadata/latest", nil, headers
+
+        expect(last_response.status).to eq(200)
+        expect(json.dig("data", "id")).to eq("latest_data")
+        expect(json.dig("data", "attributes", "endDate")).to eq("2023-12-31T00:00:00.000Z")
+        expect(json.dig("data", "attributes", "startDate")).to eq(latest_data.start_date.rfc3339(3))
+      end
+
+      it "returns latest link data dump" do
+        get "/data_dumps/link/latest", nil, headers
+
+        expect(last_response.status).to eq(200)
+        expect(json.dig("data", "id")).to eq("latest_link")
+        expect(json.dig("data", "attributes", "endDate")).to eq("2023-12-31T00:00:00.000Z")
+        expect(json.dig("data", "attributes", "startDate")).to eq(latest_link.start_date.rfc3339(3))
+      end
+    end
+
+    context "without authorization" do
+      it "returns access denied" do
+        get "/data_dumps/metadata/latest"
+        expect(last_response.status).to eq(401)
+      end
+    end
+
+    context "with bad authorization" do
+      it "returns access denied" do
+        get "/data_dumps/metadata/latest", nil, bad_headers
+        expect(last_response.status).to eq(401)
+      end
+    end
+
+    context "with insufficient permission" do
+      it "returns access denied" do
+        get "/data_dumps/metadata/latest", nil, user_headers
+        expect(last_response.status).to eq(403)
+      end
+    end
+  end
+
 end