diff --git a/_posts/oaire_graph_2020/oaire_graph_post.Rmd b/_posts/oaire_graph_2020/oaire_graph_post.Rmd index a5e7aae..17ac302 100644 --- a/_posts/oaire_graph_2020/oaire_graph_post.Rmd +++ b/_posts/oaire_graph_2020/oaire_graph_post.Rmd @@ -2,12 +2,12 @@ title: "Accessing and analysing the OpenAIRE Research Graph data dumps" description: | The OpenAIRE Research Graph provides a wide range of metadata about grant-supported research publications. This blog post presents an experimental R package with helpers for splitting, de-compressing and parsing the underlying data dumps. I will demonstrate how to use them by examining the compliance of funded projects with the open access mandate in Horizon 2020. -draft: true author: - name: Najko Jahn url: https://twitter.com/najkoja affiliation: State and University Library Göttingen affiliation_url: https://www.sub.uni-goettingen.de/ +date: "`r Sys.Date()`" output: distill::distill_article bibliography: literature.bib resources: @@ -145,6 +145,8 @@ In this use case, I will illustrate how to make use of the OpenAIRE Research Gra As a start, I load a dataset, which was compiled following the above-described methods using the whole `h2020_results.gz` dump. + + ```{r} oaire_df <- jsonlite::stream_in(file("data/h2020_parsed.json"), verbose = FALSE) %>% diff --git a/_posts/oaire_graph_2020/oaire_graph_post.html b/_posts/oaire_graph_2020/oaire_graph_post.html index f2d75a2..aaed58c 100644 --- a/_posts/oaire_graph_2020/oaire_graph_post.html +++ b/_posts/oaire_graph_2020/oaire_graph_post.html @@ -25,8 +25,8 @@ - - + + @@ -55,7 +55,7 @@ @@ -5092,7 +5092,7 @@
A note on performance: Parsing the whole dump `h2020_results` using these parsers took me around 2 hours on my MacBook Pro (Early 2015, 2.9 GHz Intel Core i5, 8 GB RAM, 256 GB SSD). I therefore recommend backing up the resulting data instead of unpacking the whole dump for each analysis. `jsonlite::stream_out()` writes the data frame to a text-based JSON file, where list-columns are preserved per row.
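The backup step described above could look roughly like the following sketch. It is not the code used for the post: the toy data frame `parsed_df`, its columns and the temporary file are hypothetical stand-ins for the parsed dump. Only `jsonlite::stream_out()` and `jsonlite::stream_in()` are taken from the text.

```r
library(jsonlite)

# Hypothetical toy data standing in for the parsed dump (assumption, not real data)
parsed_df <- data.frame(id = c("pub_1", "pub_2"), stringsAsFactors = FALSE)
# A list-column: each row holds a vector of (made-up) project identifiers
parsed_df$project_ids <- list(c("proj_a", "proj_b"), "proj_c")

# Write one JSON record per row, so list-columns stay attached to their row
backup_file <- tempfile(fileext = ".json")
con_out <- file(backup_file, open = "w")
stream_out(parsed_df, con_out, verbose = FALSE)
close(con_out)

# Later analyses can read the backup instead of re-parsing the dump
restored_df <- stream_in(file(backup_file), verbose = FALSE)
```

Because every row is written as its own line of JSON, the file can be read back in one pass and the list-column comes back as a list with one element per row.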
As a start, I load a dataset that was compiled from the whole `h2020_results.gz` dump, following the methods described above.
oaire_df <-
@@ -5437,14 +5440,14 @@
-
-
+
+
Figure 3 shows that many H2020-projects with University of Göttingen participation have an uptake of open access to grant-supported publications that is above the average in the peer group. At the same time, some perform below expectation. Together, this provides a valuable insight into open access compliance at the university-level, especially for research support librarians who are in charge of helping grantees to make their work open access. They can, for instance, point grantees to OpenAIRE-compliant repositoires for self-archiving their works. # How does knowing how projects compare with others funded by the same institutions help to help grantees make their own work open access? To my knowledge the availability of outlets of acceptable quality for publication is highly field specific and I don’t really see how the funder comes into play, unless funders only fund certain fields. # NJ: self-archiving is also possible to comply with the EC’s oa mandate, added a sentence
+Figure 3 shows that many H2020 projects with University of Göttingen participation have an uptake of open access to grant-supported publications that is above the average in the peer group. At the same time, some perform below expectation. Together, this provides a valuable insight into open access compliance at the university level, especially for research support librarians who are in charge of helping grantees to make their work open access. They can, for instance, point grantees to OpenAIRE-compliant repositories for self-archiving their works.
Using data from the OpenAIRE Research Graph dumps makes it possible to put the results of a specific data analysis into context. Open access compliance rates of H2020 projects vary. These variations should be considered when reporting compliance rates of specific projects under the same open access mandate.
Although the OpenAIRE Research Graph is a large collection of scholarly data, it is likely that it still does not provide the whole picture. OpenAIRE mainly collects data from open sources. It is still unknown how the OpenAIRE Research Graph compares to well-established toll-access bibliometrics data sources like the Web of Science in terms of coverage and data quality.
diff --git a/docs/index.html b/docs/index.html index 4752492..3f9d294 100644 --- a/docs/index.html +++ b/docs/index.html @@ -21,36 +21,36 @@The OpenAIRE Research Graph provides a wide range of metadata about grant-supported research publications. This blog post presents an experimental R package with helpers for splitting, de-compressing and parsing the underlying data dumps. I will demonstrate how to use them by examining the compliance of funded projects with the open access mandate in Horizon 2020.
+