See #3 for details on what Matricula hosts, how it is organized, and the terminology used.
The following command scrapes all parishes available on Matricula (subject to the optional search/filter parameters):
$ matricula-online-scraper fetch location -e csv
This returns a list with more than 8,000 entries. Here's the head of the output:
country,region,name,url
Slovenia,Nadškofija Maribor,001 Apače,https://data.matricula-online.eu/en/slovenia/maribor/apace/
Slovenia,Nadškofija Maribor,002 Artiče,https://data.matricula-online.eu/en/slovenia/maribor/artice/
Slovenia,Nadškofija Maribor,004 Bele Vode,https://data.matricula-online.eu/en/slovenia/maribor/bele-vode/
Slovenia,Nadškofija Maribor,005 Beltinci,https://data.matricula-online.eu/en/slovenia/maribor/beltinci/
Slovenia,Nadškofija Maribor,006 Bizeljsko,https://data.matricula-online.eu/en/slovenia/maribor/bizeljsko/
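The output of the first command, i.e. the URLs, can be passed on to the second one. The following command scrapes all available sources of a parish. For 001 Apače:
$ matricula-online-scraper fetch parish -e csv --url https://data.matricula-online.eu/en/slovenia/maribor/apace/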
This returns a list with all available digitized sources of a parish. Here's the head of the output:
name,url,accession_number,date,register_type,date_range_start,date_range_end
Krstna knjiga / Taufbuch,https://data.matricula-online.eu/en/slovenia/maribor/apace/00001/,00001,1673-1689,Krstna knjiga / Taufbuch,"Jan. 1, 1673","Dec. 31, 1689"
Krstna knjiga / Taufbuch,https://data.matricula-online.eu/en/slovenia/maribor/apace/00002/,00002,1728-1742,Krstna knjiga / Taufbuch,"Jan. 1, 1728","Dec. 31, 1742"
Krstna knjiga / Taufbuch,https://data.matricula-online.eu/en/slovenia/maribor/apace/00003/,00003,1742-1760,Krstna knjiga / Taufbuch,"Jan. 1, 1742","Dec. 31, 1760"
Krstna knjiga / Taufbuch,https://data.matricula-online.eu/en/slovenia/maribor/apace/00004/,00004,1760-1804,Krstna knjiga / Taufbuch,"Jan. 1, 1760","Dec. 31, 1804"
Krstna knjiga / Taufbuch,https://data.matricula-online.eu/en/slovenia/maribor/apace/00005/,00005,1804-1820,Krstna knjiga / Taufbuch,"Jan. 1, 1804","Dec. 31, 1820"
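The hand-off from the parish list to the fetch parish invocation could be scripted roughly as follows. This is only a sketch: it assumes the parish list is already available as CSV text with a `url` column, as in the output shown earlier, and the helper names `parish_urls` and `fetch_parish_command` are hypothetical, not part of the tool. It only builds the argument lists and does not run anything.

```python
import csv
import io


def parish_urls(csv_text: str) -> list[str]:
    """Return the non-empty values of the `url` column of a parish list."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["url"].strip() for row in reader if row.get("url")]


def fetch_parish_command(url: str) -> list[str]:
    """Build the argv for `fetch parish`, using only the flags shown in this issue."""
    return ["matricula-online-scraper", "fetch", "parish", "-e", "csv", "--url", url]


# Two rows copied from the head of the parish list above.
sample = """country,region,name,url
Slovenia,Nadškofija Maribor,001 Apače,https://data.matricula-online.eu/en/slovenia/maribor/apace/
Slovenia,Nadškofija Maribor,002 Artiče,https://data.matricula-online.eu/en/slovenia/maribor/artice/
"""

commands = [fetch_parish_command(url) for url in parish_urls(sample)]
for argv in commands:
    print(" ".join(argv))
```

Each resulting list could be handed to `subprocess.run` to scrape one parish at a time.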
I propose renaming the subcommands so that they match the entities of Matricula more closely, which makes them more intuitive:
fetch location becomes list parishes, which can be used like list parishes --all or list parishes --filter-place "name"
fetch parish becomes list sources, which can be used like list sources --parish … --parish …
a new command for fetching the sources of a parish (see "feat: support scraping sources", #3) will be get source, which can be used like get source --url … --url …
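To make the proposed command layout concrete, here is a minimal sketch of it using `argparse` from the Python standard library. It is not the project's actual implementation. The command and flag names come from the proposal above; treating `--all` and `--filter-place` as mutually exclusive, and `--parish` and `--url` as required and repeatable, are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of the proposed verb/entity layout: list parishes, list sources, get source."""
    parser = argparse.ArgumentParser(prog="matricula-online-scraper")
    verbs = parser.add_subparsers(dest="verb", required=True)

    # list parishes [--all | --filter-place NAME]
    list_cmd = verbs.add_parser("list")
    list_targets = list_cmd.add_subparsers(dest="target", required=True)
    parishes = list_targets.add_parser("parishes")
    scope = parishes.add_mutually_exclusive_group()  # assumption: one or the other
    scope.add_argument("--all", action="store_true")
    scope.add_argument("--filter-place", metavar="NAME")

    # list sources --parish … --parish …
    sources = list_targets.add_parser("sources")
    sources.add_argument("--parish", action="append", required=True)

    # get source --url … --url …
    get_cmd = verbs.add_parser("get")
    get_targets = get_cmd.add_subparsers(dest="target", required=True)
    source = get_targets.add_parser("source")
    source.add_argument("--url", action="append", required=True)

    return parser


args = build_parser().parse_args(["list", "parishes", "--filter-place", "Apače"])
print(args.verb, args.target, args.filter_place)
```

Nesting the entity under the verb keeps `list` and `get` free to gain further entities later without another rename.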
Affected Versions
All versions, including the most recent one (v0.3.0).
This proposes a breaking change!