Resolving trailing SRL annotations in conll sometimes produces exceptions due to unreadable TMP nodes.
The CoNLL2RDF.java class (in the conll2ttl function) produces temporary nodes "_TMP_SRL_{ID}" that are resolved later on. However, if they can't be resolved, the emitted Turtle is unusable, because "_TMP_SRL_2", for example, isn't a valid node name.
A potential (hacky?) fix I found is to change "_TMP_SRL" to ":_TMP_SRL" at two places in conll-rdf/src/org/acoli/conll/rdf/CoNLL2RDF.java: lines 90 to 94 and lines 145 to 148 (both in 82b9dc4).
This makes them acceptable node identifiers, so one could fix them later in SPARQL queries?
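To illustrate what the prefix change would do to the generated Turtle, here is a minimal Python sketch that applies the same rewrite to already-emitted text. It is not the Java change itself, only a text-level stand-in; the function name `prefix_tmp_nodes` is hypothetical, and it assumes each unresolved statement starts at the beginning of a line, as in the dump further down.

```python
import re

# A bare temporary subject at the start of a line, e.g. "_TMP_SRL_2 conll:V :s1_24."
# (assumption: unresolved statements always begin a line, as in the dump in this issue)
BARE_TMP = re.compile(r"^_TMP_SRL_(\d+)(?=\s)", re.MULTILINE)


def prefix_tmp_nodes(ttl: str) -> str:
    """Turn bare "_TMP_SRL_n" subjects into prefixed names ":_TMP_SRL_n"."""
    return BARE_TMP.sub(r":_TMP_SRL_\1", ttl)


if __name__ == "__main__":
    broken = "_TMP_SRL_2 conll:V :s1_24.\n:s1_9 conll:V :s1_8."
    print(prefix_tmp_nodes(broken))
```

Already-prefixed nodes are left alone, so the rewrite can be applied more than once without changing the result.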
Here is a (somewhat) minimal example where this happens. I couldn't find a fix for the annotation, so I can't sensibly minimize it further. The call was:
cat broken.conll | ./run.sh CoNLLStreamExtractor http://example.com ID WORD FEATS HEAD EDGE SRL SRL-ARGs
1 Although _ 6 mark _ _ _ ARGM-ADV
2 data _ 6 nsubj _ _ _ ARGM-ADV
3 in _ 2 prep _ _ _ ARGM-ADV
4 this _ 5 det _ _ _ ARGM-ADV
5 project _ 3 pobj _ _ _ ARGM-ADV
6 are _ 24 advcl _ _ _ ARGM-ADV
7 de _ 8 dep _ ARG0 ARGM-MNR ARGM-ADV
8 - _ 9 dep - V _ ARGM-ADV
9 identified _ 12 amod identified _ V ARGM-ADV
10 , _ 12 punct _ _ _ _
11 certain _ 12 amod _ _ _ _
12 information _ 24 nsubjpass _ _ _ _
13 such _ 14 amod _ _ _ _
14 as _ 12 prep _ _ _ _
15 the _ 16 det _ _ _ _
16 number _ 14 pobj _ _ _ _
17 of _ 16 prep _ _ _ _
18 ED _ 19 compound _ _ _ _
19 visits _ 17 pobj _ _ _ _
20 by _ 19 prep _ _ _ _
21 zip _ 22 compound _ _ _ _
22 code _ 20 pobj _ _ _ _
23 were _ 24 auxpass _ _ _ _
24 considered _ 0 ROOT considered _ _ V
25 proprietary _ 26 amod _ _ _ ARG1
26 information _ 24 oprd _ _ _ ARG1
27 by _ 26 prep _ _ _ ARG0
28 some _ 30 det _ _ _ ARG0
29 health _ 30 compound _ _ _ ARG0
30 systems _ 27 pobj _ _ _ ARG0
31 . _ 24 punct _ _ _ _
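One thing that stands out in this sentence is that it has three argument columns, while only tokens 9 and 24 end up with a `conll:SRL` value in the dump below (token 8 has `-` in the SRL column). That might be what leaves the third column without a predicate to attach to, but this is a guess and not confirmed against the converter's source. A rough Python check for inputs of this shape, assuming whitespace-separated columns, the SRL column at index 5 as in the call above, and that both `_` and `-` count as empty:

```python
def srl_shape(conll_lines, srl_col=5, empty=("_", "-")):
    """Return (number of predicates, number of argument columns) for one sentence.

    Assumptions (not taken from conll-rdf): columns are whitespace-separated,
    and values listed in `empty` do not count as predicates.
    """
    rows = [line.split() for line in conll_lines
            if line.strip() and not line.startswith("#")]
    # Rows whose SRL cell holds a real value are counted as predicates.
    predicates = sum(1 for r in rows
                     if len(r) > srl_col and r[srl_col] not in empty)
    # Everything to the right of the SRL column is an argument column.
    arg_columns = max((len(r) - srl_col - 1 for r in rows), default=0)
    return predicates, max(arg_columns, 0)


if __name__ == "__main__":
    sample = [
        "7 de _ 8 dep _ ARG0 ARGM-MNR ARGM-ADV",
        "8 - _ 9 dep - V _ ARGM-ADV",
        "9 identified _ 12 amod identified _ V ARGM-ADV",
        "24 considered _ 0 ROOT considered _ _ V",
    ]
    print(srl_shape(sample))  # (2, 3): fewer predicates than argument columns
```

A sentence where the first number is smaller than the second is a candidate for producing unresolved temporary nodes.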
And the full error:
18:41:58 INFO CoNLLStreamExtractor :: synopsis: CoNLLStreamExtractor baseURI FIELD1[.. FIELDn] [-u SPARQL_UPDATE1..m] [-s SPARQL_SELECT]
baseURI CoNLL base URI, cf. CoNLL2RDF
FIELDi CoNLL field label, cf. CoNLL2RDF
SPARQL_UPDATE SPARQL UPDATE (DELETE/INSERT) query, either literally or its location (file/uri)
can be followed by an optional integer in {}-parentheses = number of repetitions
The SPARQL_UPDATE parameter is DEPRECATED - please use CoNLLRDFUpdater instead!
SPARQL_SELECT SPARQL SELECT statement to produce TSV output
reads CoNLL from stdin, splits sentences, creates CoNLL RDF, applies SPARQL queries
18:41:58 INFO CoNLLStreamExtractor :: running CoNLLStreamExtractor
18:41:58 INFO CoNLLStreamExtractor :: baseURI: http://example.com
18:41:58 INFO CoNLLStreamExtractor :: CoNLL columns: [ID, WORD, FEATS, HEAD, EDGE, SRL, SRL-ARGs]
18:41:58 INFO CoNLLStreamExtractor :: SPARQL update: []
18:41:58 INFO CoNLLStreamExtractor :: SPARQL select: null
18:41:58 INFO CoNLLStreamExtractor :: read SPARQL ..
18:41:58 INFO CoNLLStreamExtractor :: .. ok
18:41:58 INFO CoNLLStreamExtractor :: process input ..
18:41:59 ERROR riot :: [line: 46, col: 1 ] Out of place: [UNDERSCORE]
org.apache.jena.riot.RiotException: [line: 46, col: 1 ] Out of place: [UNDERSCORE]
at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:147)
at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:148)
at org.apache.jena.riot.lang.LangEngine.exceptionDirect(LangEngine.java:143)
at org.apache.jena.riot.lang.LangEngine.exception(LangEngine.java:137)
at org.apache.jena.riot.lang.LangTurtleBase.triplesSameSubject(LangTurtleBase.java:239)
at org.apache.jena.riot.lang.LangTurtle.oneTopLevelElement(LangTurtle.java:46)
at org.apache.jena.riot.lang.LangTurtleBase.runParser(LangTurtleBase.java:91)
at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:41)
at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:206)
at org.apache.jena.riot.RDFParser.read(RDFParser.java:338)
at org.apache.jena.riot.RDFParser.parseNotUri(RDFParser.java:324)
at org.apache.jena.riot.RDFParser.parse(RDFParser.java:273)
at org.apache.jena.riot.RDFParserBuilder.parse(RDFParserBuilder.java:498)
at org.apache.jena.riot.RDFDataMgr.parseFromReader(RDFDataMgr.java:880)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:298)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:283)
at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:62)
at org.apache.jena.rdf.model.impl.ModelCom.read(ModelCom.java:298)
at org.acoli.conll.rdf.Format2RDF.conll2model(Format2RDF.java:235)
at org.acoli.conll.rdf.CoNLL2RDF.conll2model(CoNLL2RDF.java:39)
at org.acoli.conll.rdf.CoNLLStreamExtractor.processSentenceStream(CoNLLStreamExtractor.java:139)
at org.acoli.conll.rdf.CoNLLStreamExtractor.main(CoNLLStreamExtractor.java:405)
18:41:59 INFO Format2RDF :: while processing the following input:
<code>
PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX conll: <http://ufal.mff.cuni.cz/conll2009-st/task-description.html#>
PREFIX x: <http://purl.org/acoli/conll-rdf/xml#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX terms: <http://purl.org/acoli/open-ie/>
PREFIX powla: <http://purl.org/powla/powla.owl#>
PREFIX : <http://example.com>
:s1_0 a nif:Sentence.
:s1_1 a nif:Word; conll:ID "1"; conll:WORD "Although"; conll:HEAD :s1_6; conll:EDGE "mark"; nif:nextWord :s1_2.
:s1_2 a nif:Word; conll:ID "2"; conll:WORD "data"; conll:HEAD :s1_6; conll:EDGE "nsubj"; nif:nextWord :s1_3.
:s1_3 a nif:Word; conll:ID "3"; conll:WORD "in"; conll:HEAD :s1_2; conll:EDGE "prep"; nif:nextWord :s1_4.
:s1_4 a nif:Word; conll:ID "4"; conll:WORD "this"; conll:HEAD :s1_5; conll:EDGE "det"; nif:nextWord :s1_5.
:s1_5 a nif:Word; conll:ID "5"; conll:WORD "project"; conll:HEAD :s1_3; conll:EDGE "pobj"; nif:nextWord :s1_6.
:s1_6 a nif:Word; conll:ID "6"; conll:WORD "are"; conll:HEAD :s1_24; conll:EDGE "advcl"; nif:nextWord :s1_7.
:s1_7 a nif:Word; conll:ID "7"; conll:WORD "de"; conll:HEAD :s1_8; conll:EDGE "dep"; nif:nextWord :s1_8.
:s1_8 a nif:Word; conll:ID "8"; conll:HEAD :s1_9; conll:EDGE "dep"; nif:nextWord :s1_9.
:s1_9 a nif:Word; conll:ID "9"; conll:WORD "identified"; conll:HEAD :s1_12; conll:EDGE "amod"; conll:SRL "identified"; nif:nextWord :s1_10.
:s1_10 a nif:Word; conll:ID "10"; conll:WORD ","; conll:HEAD :s1_12; conll:EDGE "punct"; nif:nextWord :s1_11.
:s1_11 a nif:Word; conll:ID "11"; conll:WORD "certain"; conll:HEAD :s1_12; conll:EDGE "amod"; nif:nextWord :s1_12.
:s1_12 a nif:Word; conll:ID "12"; conll:WORD "information"; conll:HEAD :s1_24; conll:EDGE "nsubjpass"; nif:nextWord :s1_13.
:s1_13 a nif:Word; conll:ID "13"; conll:WORD "such"; conll:HEAD :s1_14; conll:EDGE "amod"; nif:nextWord :s1_14.
:s1_14 a nif:Word; conll:ID "14"; conll:WORD "as"; conll:HEAD :s1_12; conll:EDGE "prep"; nif:nextWord :s1_15.
:s1_15 a nif:Word; conll:ID "15"; conll:WORD "the"; conll:HEAD :s1_16; conll:EDGE "det"; nif:nextWord :s1_16.
:s1_16 a nif:Word; conll:ID "16"; conll:WORD "number"; conll:HEAD :s1_14; conll:EDGE "pobj"; nif:nextWord :s1_17.
:s1_17 a nif:Word; conll:ID "17"; conll:WORD "of"; conll:HEAD :s1_16; conll:EDGE "prep"; nif:nextWord :s1_18.
:s1_18 a nif:Word; conll:ID "18"; conll:WORD "ED"; conll:HEAD :s1_19; conll:EDGE "compound"; nif:nextWord :s1_19.
:s1_19 a nif:Word; conll:ID "19"; conll:WORD "visits"; conll:HEAD :s1_17; conll:EDGE "pobj"; nif:nextWord :s1_20.
:s1_20 a nif:Word; conll:ID "20"; conll:WORD "by"; conll:HEAD :s1_19; conll:EDGE "prep"; nif:nextWord :s1_21.
:s1_21 a nif:Word; conll:ID "21"; conll:WORD "zip"; conll:HEAD :s1_22; conll:EDGE "compound"; nif:nextWord :s1_22.
:s1_22 a nif:Word; conll:ID "22"; conll:WORD "code"; conll:HEAD :s1_20; conll:EDGE "pobj"; nif:nextWord :s1_23.
:s1_23 a nif:Word; conll:ID "23"; conll:WORD "were"; conll:HEAD :s1_24; conll:EDGE "auxpass"; nif:nextWord :s1_24.
:s1_24 a nif:Word; conll:ID "24"; conll:WORD "considered"; conll:HEAD :s1_0; conll:EDGE "ROOT"; conll:SRL "considered"; nif:nextWord :s1_25.
:s1_25 a nif:Word; conll:ID "25"; conll:WORD "proprietary"; conll:HEAD :s1_26; conll:EDGE "amod"; nif:nextWord :s1_26.
:s1_26 a nif:Word; conll:ID "26"; conll:WORD "information"; conll:HEAD :s1_24; conll:EDGE "oprd"; nif:nextWord :s1_27.
:s1_27 a nif:Word; conll:ID "27"; conll:WORD "by"; conll:HEAD :s1_26; conll:EDGE "prep"; nif:nextWord :s1_28.
:s1_28 a nif:Word; conll:ID "28"; conll:WORD "some"; conll:HEAD :s1_30; conll:EDGE "det"; nif:nextWord :s1_29.
:s1_29 a nif:Word; conll:ID "29"; conll:WORD "health"; conll:HEAD :s1_30; conll:EDGE "compound"; nif:nextWord :s1_30.
:s1_30 a nif:Word; conll:ID "30"; conll:WORD "systems"; conll:HEAD :s1_27; conll:EDGE "pobj"; nif:nextWord :s1_31.
:s1_31 a nif:Word; conll:ID "31"; conll:WORD "."; conll:HEAD :s1_24; conll:EDGE "punct".
:s1_9 conll:ARG0 :s1_7.
:s1_9 conll:V :s1_8.
:s1_24 conll:ARGM-MNR :s1_7.
:s1_24 conll:V :s1_9.
_TMP_SRL_2 conll:ARG0 :s1_27.
_TMP_SRL_2 conll:ARG0 :s1_28.
_TMP_SRL_2 conll:ARG0 :s1_29.
_TMP_SRL_2 conll:ARG0 :s1_30.
_TMP_SRL_2 conll:ARG1 :s1_25.
_TMP_SRL_2 conll:ARG1 :s1_26.
_TMP_SRL_2 conll:ARGM-ADV :s1_1.
_TMP_SRL_2 conll:ARGM-ADV :s1_2.
_TMP_SRL_2 conll:ARGM-ADV :s1_3.
_TMP_SRL_2 conll:ARGM-ADV :s1_4.
_TMP_SRL_2 conll:ARGM-ADV :s1_5.
_TMP_SRL_2 conll:ARGM-ADV :s1_6.
_TMP_SRL_2 conll:ARGM-ADV :s1_7.
_TMP_SRL_2 conll:ARGM-ADV :s1_8.
_TMP_SRL_2 conll:ARGM-ADV :s1_9.
_TMP_SRL_2 conll:V :s1_24.
</code>
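To see at a glance what is attached to an unresolved node in a dump like the one above, the temporary statements can be grouped by role. This is a small Python sketch under the assumption that each statement sits on its own line in the form `_TMP_SRL_n conll:ROLE :sX_Y.`; the function name `summarize_tmp_nodes` is hypothetical.

```python
import re
from collections import defaultdict

# One temporary statement per line, with or without the ":" prefix, e.g.
# "_TMP_SRL_2 conll:ARG0 :s1_27."  (assumed layout, as in the dump above)
TMP_TRIPLE = re.compile(
    r"^:?_TMP_SRL_(\d+)\s+conll:(\S+)\s+:s\d+_(\d+)\s*\.\s*$"
)


def summarize_tmp_nodes(ttl):
    """Map each temporary node id to {role: [token ids]}."""
    summary = defaultdict(lambda: defaultdict(list))
    for line in ttl.splitlines():
        match = TMP_TRIPLE.match(line.strip())
        if match:
            tmp_id, role, token = match.groups()
            summary[int(tmp_id)][role].append(int(token))
    # Convert nested defaultdicts to plain dicts for easier printing/comparison.
    return {tmp_id: dict(roles) for tmp_id, roles in summary.items()}


if __name__ == "__main__":
    dump = "\n".join([
        ":s1_24 conll:V :s1_9.",
        "_TMP_SRL_2 conll:ARG0 :s1_27.",
        "_TMP_SRL_2 conll:ARG0 :s1_28.",
        "_TMP_SRL_2 conll:V :s1_24.",
    ])
    print(summarize_tmp_nodes(dump))
```

In the dump above, the `V` entry of `_TMP_SRL_2` points at token 24, which may help when deciding which predicate the dangling arguments were meant for.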