VCF to RDF Mapping
Arto Bendiken edited this page Sep 18, 2015 · 22 revisions
The team worked on an ontology mapping and supporting software to expose Variant Call Format (VCF) files as linked data. The aim is twofold: to facilitate offline batch conversion of VCF to various RDF formats, and to enable online SPARQL querying of VCF files directly.
- Worked on by Raoul, Arto, and Kieron, with support from Pjotr and Jerven.
- Based on previous work at BioHackathon 2014 by Raoul and Francesco.
- https://github.com/ruby-rdf/rdf-vcf (RDF::VCF plugin for RDF.rb)
  - Released on RubyGems, installable with `jruby -S gem install rdf-vcf`. (Requires JRuby 9.0+.)
  - Includes a CLI tool called `vcf2rdf` to transform VCF files into RDF in batch processes.
  - Implements an RDF.rb reader for VCF and BCF files, also supporting bgzipped and indexed files.
- https://github.com/helios/bio-sparql-otf (OTF)
  - Implements a proof-of-concept-quality SPARQL backend for VCF files, based on RDF.rb and RDF::VCF.
  - Based on BioHackathon 2014 work: https://github.com/dbcls/bh14/wiki/On-The-Fly-RDF-converter
The CLI utility called `vcf2rdf` transforms VCF files into RDF (currently outputting N-Triples):

    vcf2rdf Homo_sapiens.1.vcf.gz Homo_sapiens.2.vcf.gz ...
The input files can be either plain-text VCF or compressed with `bgzip` (as in the above example).
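Because N-Triples is a line-oriented format with one triple per line, converted output can be inspected with plain Ruby. The following is a minimal sketch and not part of `vcf2rdf`: the `predicate_counts` helper and the `example.org` IRIs in the sample are hypothetical, and the splitting logic assumes simple, well-formed lines rather than implementing a full N-Triples parser.

```ruby
# Tally how often each predicate occurs in N-Triples text (one triple per line).
# Hypothetical helper; assumes well-formed lines, not a full N-Triples parser.
def predicate_counts(ntriples)
  counts = Hash.new(0)
  ntriples.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#') # skip blank lines and comments
    # Subject and predicate never contain whitespace, so splitting into
    # at most three parts isolates the predicate as the second part.
    _subject, predicate, _rest = line.split(/\s+/, 3)
    counts[predicate] += 1
  end
  counts
end

# Made-up sample; these example.org IRIs are placeholders, not vcf2rdf output.
SAMPLE = <<~NT
  <http://example.org/variant/1> <http://example.org/vocab#pos> "10177" .
  <http://example.org/variant/1> <http://example.org/vocab#ref> "A" .
  <http://example.org/variant/2> <http://example.org/vocab#pos> "10352" .
NT

predicate_counts(SAMPLE).each { |predicate, n| puts "#{n}\t#{predicate}" }
```

Assuming the converted output has been saved to a file, the same helper could be applied to it with `predicate_counts(File.read(path))`.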
The `RDF::VCF` gem can be used like any other RDF.rb reader plugin:
    require 'rdf/vcf'

    RDF::VCF::Reader.open('Homo_sapiens.vcf.gz') do |reader|
      reader.each_statement do |statement|
        p statement
      end
    end
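To illustrate the reader pattern used above without requiring JRuby or the gem, here is a toy stand-in that yields statements from VCF text. It is a sketch under loud assumptions: `ToyVcfReader`, the `example.org` base IRI, and the predicate names are all hypothetical and do not reflect RDF::VCF's actual implementation or ontology mapping. Only the eight fixed VCF columns are handled, and statements are plain string triples rather than RDF.rb objects.

```ruby
# Toy stand-in for a statement-yielding reader. NOT RDF::VCF's implementation:
# the class name, base IRI, and predicate names are all hypothetical.
class ToyVcfReader
  COLUMNS = %w[CHROM POS ID REF ALT QUAL FILTER INFO].freeze # fixed VCF columns
  BASE = 'http://example.org/vcf' # placeholder namespace

  def initialize(text)
    @text = text
  end

  # Yields one [subject, predicate, object] string triple per fixed column
  # of each data line; returns an Enumerator when called without a block.
  def each_statement
    return enum_for(:each_statement) unless block_given?

    @text.each_line do |line|
      next if line.start_with?('#') || line.strip.empty? # skip meta and header lines
      fields = line.chomp.split("\t")
      subject = "<#{BASE}/variant/#{fields[0]}-#{fields[1]}>"
      COLUMNS.zip(fields).each do |column, value|
        yield [subject, "<#{BASE}##{column.downcase}>", value.inspect]
      end
    end
  end
end

# Illustrative single-record input (tab-separated, made up for this sketch).
SAMPLE_VCF = <<~VCF
  ##fileformat=VCFv4.2
  #CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
  20\t14370\t.\tG\tA\t29\tPASS\tDP=14
VCF

ToyVcfReader.new(SAMPLE_VCF).each_statement do |statement|
  p statement
end
```

The calling code has the same shape as the `RDF::VCF::Reader` example: the consumer only sees a stream of statements and does not need to know how the VCF records were parsed.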