map the annotation syntax to rdfs:states
#128
Standing ovation My single quibble is with the word ... which I therefore suggest. (This might be best accompanied by changes from About the conclusion, however... I think it would be in keeping with RDF-Classic to allow loading of the "widowed" or "orphaned" line(s). Cleanup could be done dynamically at load time, or later as some kind of batch job; in both cases, probably best with user input to say how to resolve the detected issues (like, " Junk results are to be expected from junk data. Discovery and repair of junk data before junk results in live deployment is to be worked toward. Repairing junk data delivers good (or at least, tolerable) data, resulting in good (or at least, tolerable) query/analysis results, approaching best. |
Both "Mapping to N-triples-star" examples are incorrect in their final compact forms. While this:
does indeed expand to:
it does not compact to:
but to:
If we were to introduce a relationship to the effect that the reifier implies the truth of the triple(s) it reifies (or some of them), annotation syntax does not work to simply reify the triple so asserted; it is locked to such a meaning. This is a very important thing to recognize about this additional difference. That means that you cannot use it to reference reifiers orthogonal to assertion using the annotation syntax. As you exemplify, you'd have to write:
Since this would be different to:
I understand that this is what you want; I just want to make it clear. This would go for all uses: be it qualification, "neutral" provenance, marginalia, etc. I am not principally opposed to this, if it is clear and intuitive in common practice. I personally think notions like propositional attitude and justified beliefs are interesting but tricky, and thus I've leaned towards having these aspects in the specific domain modelling, and just have "reifies" in the basics. But if this distinction is common belief (and conversely that the lack of it would be harmful for comprehension), and given that the various practical uses prove to differentiate correctly (and not just reference the triple reified), then that is a case for introducing it at this basic level. If the distinction is added, we would have something like "implicators" alongside reifiers (or as a subset thereof). In that case, I'd suggest To properly model something Bob believes that is not considered true/known a graph, consider:
That suggests that the belief doesn't imply the statement in this graph. I think it is OK (in this graph, this reifier is not an "implicator", or truth-maker, if you will). It's just one of these differences we need to take into consideration. (Aside: I'd use e.g. (That is part of what I was exploring with the Anne Bonny example, where e.g. |
@TallTed Thank you for the flowers ;-) About property naming:
YMMV ;-) but let's not get too deep into a naming discussion right now. I know it's tempting, but like syntax it should be fixed at the end. Right now any name that helps discuss the topic at hand is good enough. About the Anne Bonny example:
##
# These are the primary facts asserted by the publisher of this graph.
<Anne_Bonny> a :Person ;
:name "Anne Bonny" ;
:parent <William_Cormac> ~ <#005> .
<#001> a :Circumstance ;
:startDate "1716"^^:EDTF ~ <#005> .
##
# These are various documented sources of claims; asserted or just cited.
# Again, according to the publisher of this graph.
<#005> a :Reference ;
:source <http://www.encyclopedia.com/doc/1G2-3446400036.html> ~ <#002> ;
:date "2024-08-14T12:48:07Z"^^:DateTime .
##
# These are spurious claims made in the referenced sources.
# They are not considered true by the publisher of this graph.
<< <Anne_Bonny> :familyName "Brennan" ~ <#005> >> .
Mapping that to an
<#005>
rdfs:states <<( <Anne_Bonny> :parent <William_Cormac> )>> ,
<<( <#001> :startDate "1716"^^:EDTF )>> ;
rdf:reifies <<( <Anne_Bonny> :familyName "Brennan" )>> .
# omitting all the stated triples for brevity
Or am I missing something? |
rdfs:states
Thanks a lot for this @rat10. I understand your point much better now. Under this proposal, at least how I understood it, I am worried that a proper use of RDF 1.2 can result in "junk data". I find orphaned assertions (@TallTed ) / dangling stated reifications (@rat10) problematic; I don't find the solutions convincing at this point. I have a feeling that people are going to use N-Triples syntax anyway (and why shouldn't they? it's a lot simpler); even if they don't, (poorly coded) applications can make mistakes or (bad) choices in N-Triples, e.g., leave out the assertions to reduce the dataset size. My main problem - what if your current dataset is (valid use of RDF 1.2):
and then you add:
Does it mean that, all of a sudden, there is a mistake (missing Or, similarly, when you have vanilla RDF:
But then add:
Does it similarly mean there is now a mistake (missing I remember Pat Hayes saying that each RDF triple should be able to stand on its own; this no longer seems to be the case as an rdf:states must be accompanied by the assertion. But, IIUC, your point is that there is a similar situation now:
That adding:
Now means that Alice is all of a sudden talking about a fact, and we no longer know which of the annotations referred to |
In my opinion, RDF is supposed to be a very simple formalism, with only the bare minimum of machinery needed to support minimalist representation of truths. (Yes, this isn't exactly true for RDF but, again in my opinion, this is what any augmentation of RDF should be striving for.) But being simple can require wordy constructions for what can be stated concisely in more-complex languages. To alleviate this problem, some surface syntaxes for RDF provide shorthands for common wordy constructions. In my opinion, the proposal to have both rdf:reifies and rdf:states in RDF violates both these minimalist ideals. First, the two properties appear to differ only in some notion related to propositional attitudes, i.e., the stance of the constructs vis-a-vis whether the truth of the quoted statement is supported by the construct. Propositional attitudes are a very complex notion, and thus do not belong in RDF. Second, propositional attitudes can just as well be done in some extension of RDF or even done as user vocabulary, and thus again do not belong in RDF. So I am against having both rdf:reifies and rdf:states in RDF, as this sort of capability goes against the minimalist nature of RDF. PS: Here is how one could model a reifier supporting a statement: _:r rdf:reifies <<( :s :p :o )>> . More-complex relationships are possible and would be modelled in more-complex ways, perhaps requiring a second "stand-off". |
No. A statement that is not annotated is just a regular statement (same answer to the following "similar" question).
That is the problem with meta-modelling, and why RDF 1.0 resorted to reification.
Well, in my proposal it's defined as a syntactic macro, and where that macro fails to take hold (e.g. when users mess around with N-triples-star) instead of must I propose MAY or SHOULD (and @TallTed provides some helpful detail to that).
Indeed similar insofar as adding a statement may change how another statement is interpreted - which of course depends on if your intuition is sufficiently primed by the RDF semantics, or if you rather follow lay persons intuitions and what the syntactic sugar suggests. Yes, that is my issue. In a way one could call this non-monotonic, as assumptions are evoked by the syntactic sugar, but then not backed up in the N-triples based machinery of triple stores and (especially streaming) data exchange. |
Minimalistic yes, but not to the point of not being useful or even misleading. It is a design discussion what is considered the most minimalistic but useful design, and we're having it here. I argue how RDF-star without
But the syntactic sugar should not deviate from what is stored as bare triples. As I laid out above, it currently does. Instead of introducing an
Propositional attitudes are indeed a wide field, but the basic distinction between a statement considered
IMO modelling is actually not necessarily the worst part, but querying is. If one always has to check for extra annotation w.r.t. to the "stated-ness" of the resource an annotation is meant to refer to, then we put a pretty severe burden on query authors and query engines (i.e. one more line in the query, one more join the engine has to manage), and we would still need to provide some vocabulary to this effect. See also the discussion of work arounds above. |
That wasn't the issue; it becomes "annotated" once added to the dataset, since there are now 2 annotations about that triple. But now an
Not to be argumentative, but it seems to me that the answer is actually yes, those will be mistakes :-) You say that MAY/SHOULD instead of MUST should be used, so they seem less like mistakes; and you refer to @TallTed re dealing with junk data and cleanup, which you don't need if there aren't mistakes to be cleaned up (not meant to be snarky, just saying). If you require (may/should/must aside), in an RDF 1.2 dataset, that an |
Yes, but:
(also not meant to be snarky, just to point it out) |
It is no surprise that I completely disagree with both of these claims. I have seen nothing that requires
I dispute this claim. I do not see anything in the "annotation" shorthand that requires using a different property from the other shorthand. If users want to augment the "annotation" shorthand with information about propositional attitudes then they are free to do so. (Of course, RDF and RDFS miss much about propositional attitudes so propositional attitude triples do not expand RDF or RDFS to actually include propositional attitudes.)
A propositional attitude is a relationship between two things, not "a statement considered
I would also like to not have to use complex constructs in queries, but that's not how RDF and SPARQL work. The right way to capture any part of any sort of propositional attitude is in a semantic extension of RDF, not in RDF itself. |
The annotations in your example refer to a reifier that
Okay, if you insist ;-) I wouldn't rule out that people find creative uses for a three-valued system where it makes a difference if an annotation claims to annotate a statement that it considers true or if the graph actually contains the statement. Also, MUST is a very strong word. But in general the idea is indeed that an "rdfs:stated" reification that is not contained as a standard triple in the graph points to a problem in the data. |
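The check described above can be sketched in a few lines. This is a minimal illustration, not part of the proposal: it assumes a graph modelled as a Python set of `(s, p, o)` tuples, with a triple term represented as a nested tuple in object position. The property name `rdfs:states` is the one proposed in this issue and is not defined by any RDF specification; the helper name `dangling_stated` is hypothetical.

```python
# Hypothetical model: a graph is a set of (s, p, o) tuples; a triple term is
# a nested tuple in object position. "rdfs:states" is only the name proposed
# in this issue, not a property defined by any RDF specification.
STATES = "rdfs:states"


def dangling_stated(graph):
    """Return (reifier, triple_term) pairs whose triple term is not asserted."""
    return sorted(
        (s, o)
        for (s, p, o) in graph
        if p == STATES and isinstance(o, tuple) and o not in graph
    )


graph = {
    (":Alice", ":buys", ":Car"),
    ("_:r1", STATES, (":Alice", ":buys", ":Car")),
    ("_:r2", STATES, (":Bob", ":buys", ":Bike")),  # no matching asserted triple
}

print(dangling_stated(graph))
```

Under these assumptions only `_:r2` is reported. A plain triple without any stated reification is never flagged, which matches the MAY/SHOULD reading rather than a hard MUST.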
But now there is no
Oof, we got there! :-) How about the inverse, i.e., a standard triple that is not accompanied by an rdfs:stated reification? |
Yes, but that was described in my proposal above from the start as a "dangling" stated annotation in section "Entailment ???".
That is not a problem. The |
You may dispute this claim to your heart's content, but you might consider backing that up with some counter arguments. I illustrated the problems above, see sections "Mapping" ff.
And what exactly is then the disagreement?
Oh, the hyperbole again: "far outside". Again: provide arguments and analyses, react to what I described in a lot of detail. Don't just put up straw men like "propositional attitude" and then ascribe them to me. |
I was trying to argue that a tradeoff is to be made, but that it is well justified. I expect the latter case - where one queries for annotations on asserted AND on unasserted triples of some type - to be the exception. The norm should be to search for annotations on asserted statements, because most of the time RDF is used to describe facts. Such queries would be straightforward: just use zeh annotation syntax, which internally is mapped to |
This was (extensively) discussed during today's Semantics TF's meeting |
My personal take-away of this conversation is that, as @franconi pointed out, we should probably separate this in two decisions:
|
As I (and others) explained during today's Semantics TF's meeting, this is an over-interpretation of what's in the graph.
Again, this is an over-interpretation. Nothing in this graph tells me that the author of the graph endorses TheoryThree as a whole! I can only tell that the author endorses |
@pchampin It obviously is an over-interpretation according to the defined semantics, but that is not my point. However, it is a natural interpretation according to at least my intuition of how a reader not versed in RDF will interpret syntactic sugar for annotated statements and annotated unasserted reifications and Turtle-star. That is the point. Therefore also @franconi's proposal to separate the two issues is a double edged sword to me: it would be an important step forward on the N-triples level, and in that respect I'm glad that it finds some support. W.r.t. to the critique of the |
I completely agree with Pierre-Antoine. I further find no current merit whatsoever in arguments based on over-interpretation of what RDF is. RDF is a very low-level formalism with impoverished syntax, impoverished semantics, and impoverished consequences. I don't like this aspect of RDF but that is where we are. I deeply sympathize with the argument that the impoverished nature of RDF is a problem. But in my view what would be worse than the current situation is a situation where parts of RDF are impoverished and other parts are not. |
@pfps Following that argument it would probably be consistent to drop the syntactic sugar for reified triple terms entirely, because what is worse than syntactic sugar that is not backed by the impoverished representation as bare triples? |
@rat10 I don't see how that follows. Syntactic sugar is backed by its expansion because that's all there is to syntactic sugar. |
@pfps I discussed different mappings in the issue description and how what the syntactic sugar seems to suggest changes when round-tripping to the database, or N-triples for that matter. An N-triples-star serialization that maps both syntactic variants to We claim to support "unasserted statements", but what we actually support are merely "not yet asserted statements, without any guarantees on how long that state will hold". We claim to support "statements about statements", but because of the impoverished semantics of reification what we actually support are "statements about things that look like the statements one might want to annotate". The proposed |
As far as I can tell there is nothing in the syntactic shorthands that makes any claims in regard to supporting "unasserted statements". There is certainly nothing there that makes any claims to supporting "not yet asserted statements". The evolution of RDF graphs is something completely outside the reach of RDF or RDFS. Both RDF and RDFS are monotonic but that is not about evolution of RDF graphs. Syntactic shorthands are not designed to be round-trippable. Because shorthands create a different syntactic form that expands to an underlying form that can be written without using the shorthand there is no way to round trip them. It is sometimes possible to define a canonical form that includes shorthands, but some of the shorthands in Turtle 1.1 do not admit a canonical form without an artificial order imposed on nodes. In sum, I don't see a problem that needs to be solved here. |
I think the point is that information is lost when Turtle syntactic sugar is translated to N-Triples. IMO the round tripping aspect may be a bit of a red herring.
Translates to
We now no longer know which reifier was associated with the annotation syntax This information may indicate the author's intended meaning for When using N-Triples directly, you can use another reification property that aligns with your intended meaning; For that reason, I currently think it's a good idea to translate the annotation syntax to |
Using syntactic shorthands inevitably means information loss of some sort. I don't view the information loss here as in any way a problem. There is lots of information loss going from Turtle to an RDF graph, not just from syntactic shorthands. Blank node identifiers are lost, for example. In RDF this information is considered irrelevant. I view the information loss using the annotation syntax as similarly irrelevant. |
I definitely agree. Re "red herring": the round-tripping example serves to illustrate the problem - of course, if Turtle-star with syntactic sugar is the interface to RDF-star, then it is not only an illustration of a problem but a real problem. Re dangling |
@niklasl in the last Semantics TF meeting, discussing Some argue that adding a property like There is also another aspect to take into account, which is rather covering the opposite perspective: in the current design the connection between a statement and "its" annotation is very brittle: the reifier describes the occurrence of a statement of a certain type. It makes no assumption if such a statement actually occurs in the graph. And, assuming that such a statement indeed does occur in the graph, there is likewise no guarantee that the annotation refers to it: the statement in the graph represents the type itself, not some occurrence of stating it, and might be the result of another act of stating. There is no guarantee that the reification describes an occurrence that actually happened. The presence of a statement of that type in the graph can never be considered more than incidental. That is how impoverished the semantics of reification is, and anything more definitive lies completely in the eye of the beholder. Consequently, a little bit more definitiveness would definitely be welcome. IMO the current design is viable only if one doesn't really expect competing viewpoints, not even in the documentation of non-endorsed statements - and that to me seems just very much not like the semantic web. The current design is too tedious and/or too un-expressive to be a general solution, and therefore too unreliable to be viable outside of well-controlled environments. |
I believe that the following is not correct.
From https://www.w3.org/TR/rdf11-mt/#reification
|
The origin of annotation syntax: This shorthand form can express simple usage simply: Example:
|
@pfps wrote
That paragraph about reification in the RDF 1.1 Semantics document starts with:
The RDF (1.0) Semantics specification from 2004 is even more explicit, saying that:
That the reification doesn't entail the triple can IMHO be interpreted as merely a side effect, a consequence of entailment not being available at the base level of RDF. What I call a hack is using that restriction to the effect of speaking about statements without asserting them, instead of to describe other RDF triples (where IIUC "triple" refers to a statement asserted in the graph). |
I agree about the statement in the first sentence, and I gladly use my freedom to not have a problem with it.
dbr:Torn_\(Ednaswap_song\) a dbo:Work ;
eg:recorded_by dbr:Lis_Sørensen ;
eg:recorded_in "1995"^^xsd:gYear.
dbr:Torn_\(Ednaswap_song\)
eg:recorded_by dbr:Natalie_Imbruglia ;
eg:recorded_in "1997"^^xsd:gYear.
The grouping of triples does not round-trip, and some people might complain about that, as it mixes the information about who recorded when. This is not a bug in RDF or in Turtle, though; this is bad modelling that people should avoid. Similarly, people must not rely on the specific kind of bracket they use ( PS: people must not rely either on the difference between using square brackets vs. bnode labels ( |
It was actually the other direction that I was concerned about in the beginning: an annotation referring to an unasserted statement becomes an annotation referring to a fact if the triple is added to the graph. That may run counter to the intention of the initial annotation and may lead to false conclusions. Let me give another example:
<< :Alice :buys :Car >>
:type :Cabriolet ;
:purpose :FunRiding .
:Alice :buys :Car {|
:type :Sedan ;
:purpose :Commuting
|} .
Maybe Alice planned but never followed through with the purchase of a cabriolet. Maybe it was just unconfirmed hearsay. In any case, we can't assume she did, but we know that she did buy a sedan for commuting to work. An author familiar only with Turtle-star but not the more intricate mapping to N-triples-star could be excused for thinking that s/he modeled this correctly in the Turtle-star above. Mapping to N-triples-star according to the current design is straightforward (mapping the annotation syntax to
:Alice :buys :Car .
_:r1 rdf:reifies <<( :Alice :buys :Car )>> ;
:type :Cabriolet ;
:purpose :FunRiding .
_:r2 rdf:reifies <<( :Alice :buys :Car )>> ; # NOT the proposed rdfs:states
:type :Sedan ;
:purpose :Commuting .
We now have different ways to interpret this (and to map it back to Turtle-star), following different intuitions:
It should be obvious that
|
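The loss of surface form in the Alice example can be reproduced with a small sketch. It assumes the current design as described in the comment above (both syntaxes expand to `rdf:reifies`, and the annotation syntax additionally asserts the triple), models a graph as a Python set of tuples, and uses hypothetical helper names `expand_reified` and `expand_annotated`. It is not a Turtle parser.

```python
REIFIES = "rdf:reifies"


def expand_reified(reifier, triple, annotations):
    """<< s p o ~ r >> p2 o2 . : reifier triples only, nothing is asserted."""
    return {(reifier, REIFIES, triple)} | {(reifier, p, o) for p, o in annotations}


def expand_annotated(reifier, triple, annotations):
    """s p o ~ r {| p2 o2 |} . : asserts the triple and adds reifier triples."""
    return {triple} | expand_reified(reifier, triple, annotations)


buys = (":Alice", ":buys", ":Car")
plan = [(":type", ":Cabriolet"), (":purpose", ":FunRiding")]
fact = [(":type", ":Sedan"), (":purpose", ":Commuting")]

# As authored: the cabriolet plan is only reified, the sedan purchase is annotated.
graph = expand_reified("_:r1", buys, plan) | expand_annotated("_:r2", buys, fact)

# Alternative authoring: BOTH reifiers written with the annotation syntax.
same = expand_annotated("_:r1", buys, plan) | expand_annotated("_:r2", buys, fact)

print(graph == same)  # True
```

Because a graph is a set, the two authorings produce identical triples, so nothing in the expanded form records which reifier was introduced through the annotation syntax.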
So what would you recommend to avoid the issues I illustrate? I made the proposal to map the annotation syntax to |
I can't clearly (as in propositional logic) see what you intuit, but I can imagine it being something like:
:Alice :buys :Car
~ :AliceCarPlan {|
a :PurchasePlan ;
:type :Cabriolet ;
:purpose :FunRiding
|}
~ :AliceCarBought {|
a :Purchase ;
:type :Sedan ;
:purpose :Commuting
|} .
And that your reading of this syntax deems it problematic. But, as I've written elsewhere, I don't see this as any more problematic than footnotes in a book not having "written" the text, be it source references or something opposing. I might also model it differently, but that's beside the point. I do sympathize with at least some of your concerns though, and while I do believe that this is getting close to the philosophical quagmire of justified true beliefs, I have hopes we can address it from a different angle. If w3c/rdf-semantics#49 is, in some form, part of standard entailment (maybe in RDFS), there is a foundation for defining, using OWL, That would actually entail the triple from the reified triple term of
_:AliceCarBought a :Purchase ;
:type :Sedan ;
:purpose :Commuting ;
:buyer :Alice ;
:item :Car .
(It might also be prudent to define Now, there is still the possible reading of What I do see is a potential need for adding a note (perhaps in the primer) to make it clear that this is not implied by the syntax. |
This was discussed during yesterday's WG meeting https://www.w3.org/2024/10/17-rdf-star-minutes.html |
@niklasl wrote, about @rat10's example about Alice buying a car:
I am not sure it is... As I pointed out earlier, some modeling choices are better than others. Insisting that RDF should let you get away with bad modeling is, in my opinion, not a service to RDF or its users. In my example above about the song "Torn", the modeling mistake is to conflate two distinct recordings of the same song (conflating them with each other, and conflating them with the song itself). Note that I am not claiming that this conflation is inherently bad (there are use cases where it would cause no harm), it is just not fit for purpose given the kind of information that this graph aims to convey. @rat10's example about Alice buying a car suffers, in my opinion, from the same problem: it uses the same triple to describe plans (or actions) of Alice for buying different cars. A symptom of that problem is the need to attach All that being said, if I forget about my unease with this modeling, I totally agree with @niklasl's response. |
From a different perspective, and starting from a (hopefully) common ground: adding extra triples should not impact the meaning of prior ones. In @rat10's examples, I think the following point is being made: using the annotation syntax later on changes the meaning of the first set of annotations (uncertain > certain), and this is being reflected in the roundtripping issue. Firstly, I think this is a broader issue; asserting the triple later on, whether manually or as an effect of the annotation syntax, should not change the meaning of the first set of annotations. So, not just the roundtripping, but also the annotation syntax may be a red herring here. Personally, as it stands, I don't think the asserted triple changes the meaning of the prior triples;
The solution is to always clarify your intended meaning using explicit metadata - cfr. @niklasl (and with the same caveat about modeling as @pchampin mentioned):
Adding the assertion does not change the meaning of the first set of annotations, since we don't assume anything about the rdf:reifies it translates to (e.g., uncertainty). A more elegant solution would indeed be to use a different reification subproperty in the N-triples - such as Possibly related to what souri said during the meeting (based on the notes):
Finally, from @franconi:
I agree. But I think it could be solved by clarifying what the reification and annotation syntaxes mean, and what they don't mean. |
On 18 Oct 2024, at 10:11, William Van Woensel wrote:
Possibly related to what souri said during the meeting (based on the notes):
> IMO, for keeping RDF simple and concise, the :s :p :o . triple, if present, should not have anything to do with possible presence or absence of :r1 rdf:reifies <<( :s :p :o )>> OR :r2 rdf:reifies <<( :s :p :o )>> (and what I say about :r1 or :r2).
Yes.
As I said several times, a triple in the graph states a true fact, and it is unique, due to the graph being a set of triples.
It is impossible to associate a triple in the graph with a syntactically equal triple term in the graph, since the triple term may appear several times reified in different ways with different intuitively implied truth values.
Finally, from @franconi<https://github.com/franconi>:
>I understand the argumentation of tl and I believe this argumentation is sound.
I agree. But I think it could be solved by clarifying what the reification and annotation syntaxes mean, and what they don't mean.
This can be clarified, as I said several times, by saying that reification is a very general (many-to-many) association mechanism between a triple term and a resource.
This mechanism can be used for many very different purposes: generic annotation, provenance, event or state or resource association, n-ary relations, beliefs, modalities, temporal annotations, etc.
We should have a best practices Section explaining/suggesting how to encode such different use cases.
—e.
|
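The two points made in the comment above (a graph is a set, so an asserted triple is unique, and reification is a many-to-many association that is independent of assertion) can be illustrated with a minimal sketch. The tuple-based graph model and the helper name `reifiers_of` are assumptions made for illustration only.

```python
REIFIES = "rdf:reifies"
t = (":s", ":p", ":o")

graph = set()
graph.add(t)
graph.add(t)  # a graph is a set: adding the same triple again has no effect
graph.add((":r1", REIFIES, t))
graph.add((":r2", REIFIES, t))


def reifiers_of(graph, triple):
    """All reifiers of a triple term, whether or not the triple is asserted."""
    return sorted(s for (s, p, o) in graph if p == REIFIES and o == triple)


print(len(graph))                   # 3
print(reifiers_of(graph, t))        # [':r1', ':r2']
print(reifiers_of(graph - {t}, t))  # [':r1', ':r2']
```

Removing the asserted triple leaves both reifiers untouched, which is the sense in which `:s :p :o .` "should not have anything to do with" the presence or absence of `:r1` or `:r2`.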
+1
That's a very good point. Following @rat10's reasoning, the annotation syntax is not even required to trigger the problem. Using a triple
+1, and to be fair, I think we are all agreeing with that. But we disagree on where/how this metadata should be expressed. Some of us propose that this metadata should take the form of additional triples, while @rat10 proposes that this metadata is (1) carried by
@rat10's argument against explicit triples is (see above)
I share the intuition that people are less likely to query for all arbitrary reifiers of a given triple, and more likely to query for a specific kind of reifier. However, following this reasoning, I don't think that the binary distinction between Consider the following variant of the "Alice buying a car" example:
<< :alice :buys :Cabriolet ~ <#r1> >> a :PurchasePlan; :by :alice; :on "2024-09"; :purpose :funRiding.
<< :alice :buys :Cabriolet ~ <#r2> >> a :Belief; :by :bob ; :on "2024-09".
<< :alice :buys :Cabriolet ~ <#r3> >> a :Doubt; :by :charlie; :on "2024-09".
:alice :buys :Sedan ~ <#r4> {| a :Purchase; :by :alice; :on "2024-10"; :purpose :commuting |}
~ <#r5> {| a :Doubt; :by :bob; :on "2024-09" |}
~ <#r6> {| a :Belief; :by :charlie; :on "2024-09" |}.
# Alice planned in September to buy a Cabriolet for fun,
# but ended up being pragmatic and bought a Sedan in October.
# Back in September, Bob thought that Alice would buy the Cabriolet,
# and was doubtful about her settling for a Sedan.
# Charlie, on the other hand, thought that Alice would end up making the pragmatic choice.
I expect that people would be more interested in querying all beliefs or all doubts, rather than only segregating un-asserted vs. asserted annotations. In fact, as the example above shows, some types of reifiers are orthogonal to the assertedness of the reified triple (I can talk about people's wrong beliefs or wrong doubts). edited: the example was using the same reifier twice. This was unintended and has been fixed. |
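A query by reifier type over the example above can be sketched as follows. This is an illustration under assumptions: the graph is a hand-expanded Python set of tuples (the `:by`, `:on` and `:purpose` properties are omitted for brevity), only the Sedan triple is asserted, and `reifiers_of_type` is a hypothetical helper rather than a SPARQL feature.

```python
REIFIES, TYPE = "rdf:reifies", "rdf:type"
cabrio = (":alice", ":buys", ":Cabriolet")
sedan = (":alice", ":buys", ":Sedan")

graph = {
    sedan,  # only the Sedan purchase is asserted (annotation syntax)
    ("<#r1>", REIFIES, cabrio), ("<#r1>", TYPE, ":PurchasePlan"),
    ("<#r2>", REIFIES, cabrio), ("<#r2>", TYPE, ":Belief"),
    ("<#r3>", REIFIES, cabrio), ("<#r3>", TYPE, ":Doubt"),
    ("<#r4>", REIFIES, sedan), ("<#r4>", TYPE, ":Purchase"),
    ("<#r5>", REIFIES, sedan), ("<#r5>", TYPE, ":Doubt"),
    ("<#r6>", REIFIES, sedan), ("<#r6>", TYPE, ":Belief"),
}


def reifiers_of_type(graph, rtype):
    """Map each reifier of the given type to (reified triple, is_asserted)."""
    typed = {s for (s, p, o) in graph if p == TYPE and o == rtype}
    return {
        s: (o, o in graph)
        for (s, p, o) in graph
        if p == REIFIES and s in typed
    }


doubts = reifiers_of_type(graph, ":Doubt")
print(doubts)
```

In this sketch the `:Doubt` query returns one reifier of an unasserted triple (`<#r3>`) and one of an asserted triple (`<#r5>`), which illustrates how the reifier type can be orthogonal to assertedness.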
I fully agree with P-A. |
There is definitely a modelling issue if one wants to have reifications that are somehow supposed to represent truth and other similar ones that do not have this feature. But the solution is to put this information on the reification itself,
instead of having several properties that go from a reification to its triple. If RDF was more powerful better mechanisms would be possible, but RDF is very weak. |
I don't see how anything about RDF (or indeed any formalism) can prevent bad modelling, ranging from using rdfs:subclassOf for instance to conflating multiple things to just stating incorrect facts. |
Sorry for the huge combined comment, but how else am I to respond to all these messages without completely drowning the whole thread in my utterings. I DRY not to repeat myself too often.
This is kind of a circular argument: I am arguing that the information that those brackets seem to convey should be reflected in the data model. That such a meaning is suggested, but not reflected in the data model is indeed the very problem that this proposal addresses. Then people will be able to rely on that what they see - and mean - is what they get.
You're still thinking in terms of types, not occurrences. Some discussions obviously take years to sink in (hinting at your argument from the last WG meeting that this discussion has been dragging on for very long, or too long, already).
No, it's not a symptom of that problem. It was my deliberate decision not to overload this example with the issue of annotating individual nodes, and therefore not using any special property to refer to the object. However, I obviously failed as I still did introduce the issue, just not the proper solution. If you find that explanation still unsatisfying, you might just ignore that part of the example and concentrate on the
I discussed that above already, since @afs made a similar proposal (
You have repeatedly made it clear that you fundamentally oppose the change that this WG is tasked to define. I wonder if that doesn't influence your judgement of this specific proposal here. Also see what RDF 1.0 Semantics (2004) has to say about reification:
IIUC this proposal, while purposefully falling short of a proper semantic extension, points into very much the same direction. I would be open to arguments that this "while ... falling short of a proper..." part is the problem that you'd like to be addressed, but I'm not convinced that you even see the need for such a detailed discussion. @niklasl above
The crucial distinction is that reifiers refer to occurrences, not to the type. While the type of both
Both requirements taken together mean that it has to be represented in the model of RDF.
We can't rely on OWL or other upper levels of the RDF stack to solve a problem this fundamental. Apart from that this seems like a variation of the TEP mechanism: one would need to define an abundance of truth makers, and one would need to define them per statement, right?
Too complicated compared to what? Users of Turtle-star annotation syntax will only feel the difference when they query for annotations on asserted and unasserted statements. OTOH, with the current design all users will always feel the difference, when authoring and when querying, because they will always be forced to explicitly mention "assertedness".
What do you mean by clarifying? "Educating users" through examples? Providing extra vocabulary like
The statement in a graph stands in for all its occurrences: it represents the fact, and the fact exists only once, no matter how often it is uttered. The triple term stands in for the mere possibility of such a statement being made, but it doesn't state it, and even a reifier, defined via
I don't think this captures the essential aspect. Sure (and you know about that much better than me), reification is a term with many applications. But the issue here is whether an annotation refers to "something" that is considered true, or not. It doesn't matter if that something is an n-ary relation, a generic annotation, a temporal annotation, etc. Even for beliefs one might have some that one considers true, but that is just a secondary aspect. You're mixing categories that probably shouldn't be mixed in this context.
IMO that will not be sufficient (or it will be sufficiently disheartening to drive users away ;-)
Not if
You are free to do so :) I think we agree that such granularity doesn't belong in RDF itself, but it's easy to add as extra attributions via domain ontologies. One may also encode one's whole data as mere reifications, with no basic RDF statements (i.e. "facts") whatsoever, and query over those reifications filtered by attributes given in annotations. However, that is very much not the way data is usually encoded and shared on the semantic web. It is good to make this possible, but it can't be the default arrangement.
(and @franconi expressed agreement)
That probably is an argument rather in favor of this proposal and against its conflation with more fine grained attributions, or isn't it? @afs
For everyone's sake, I'll try to keep this brief, and focus on what I think is the most important:
I assure you that I'm not, although I see what makes you think I am: my example about the song "Torn" suffers indeed from two problems
Would you consider modelling all of Liz Taylor's 8 marriages as 8 reifiers of the same triple
I have well understood that this is your argument, and I still disagree with it. My disagreement is rooted in how Turtle 1.1 currently works (see my previous comments) so I don't think that it is a circular argument.
@pchampin I'm also trying to be concise.
If the data focuses on marriages of popular persons then indeed yes I would consider this modelling, because it would make it very easy to find all marriages in which the well known Liz Taylor is involved, and provide easy access to further detail. That is precisely what I find attractive about statement annotation.
Sorry, but I don't understand that argument. What in Turtle 1.1 supports a stance to not map a specific syntactic element to a specific element in the model?
My point is that, by design, any concrete syntax has features that are irrelevant in the underlying data model. Even N-Triples introduces an order in triples, which is not relevant in the data model. Arguing that something should be reflected in the data model, just because it seems to convey relevant information for the uninformed reader, is a weak argument in my opinion.
tl;dr
Define a property rdfs:states and map the annotation syntax to it, to differentiate between
E.g. map Turtle-star to N-triples-star as follows:
Advise users to only use Turtle-star and Sparql-star ("Turtle-star with holes") when interacting with RDF-star data, unless they are sure they know what they are doing and are prepared to handle "dangling" stated reifications.
Annotating un-asserted statements
RDF standard reification in practice is often used to annotate statements that are not contained in the graph, to e.g. discuss competing viewpoints or document earlier versions. Since the early times of the RDF* CG it has been established that annotating statements without asserting them is an important use case. However, not much detail was provided.
Intuitions
It turned out during discussions that two possible intuitions are at play:
A single truth is what the somewhat impoverished semantics of RDF supports: a statement is either in the graph and therefore true, or it's unknown.
Competing truths is what real life suggests, and possible interpretations range from "not yet confirmed" to "not endorsed" to "strongly opposed".
Competing truths
Multiple viewpoints can appear in statements or in annotations.
Reification semantics
Reification maintains a disconnect between
- a statement, which represents a type
- "its" annotation, which refers to an occurrence of that type.
The semantics of reification is tricky, to put it mildly. A reification:
Reification is unavoidable in RDF to manage the constraints that the set semantics impose. There will always be a disconnect between the statement, which may be contained in the graph as a type, and the annotations, which instead refer to occurrences of that type of statement (instead of "occurrence" one might also call them instance, token or subtype, slight variations in meaning notwithstanding). In general, users should be shielded from these shallows as well as possible.
Unlike what reification provides, many use cases ask for a clear and solid connection between a fact and "its" annotation. All use cases of qualification fall in this category, e.g. Wikidata, and many more. In LPG the edge "EXISTS", i.e. there are no unasserted statements in LPG, and the indirection through reification is quite irritating and unwelcome.
There are two ways to bridge that gap
Entailment is not available in simple RDF, so describing "such a statement" is the way to go. However, it's important to realize that the statement itself
:s :p :o .
and a triple term describing it
<<( :s :p :o )>>
and a reference to such a triple so described
:r rdf:reifies <<( :s :p :o )>>
are three different things and, while all similar, there is no way to connect them beyond describing that similarity.
Syntax
We currently have three syntactic primitives, one of them in N-triples to actually implement RDF-star, the other two in Turtle-star to provide appropriate syntactic sugar for users:
- <<( :s :p :o )>> describes the triple :s :p :o, but that triple is not asserted and can only be referred to by very specific means. A so-called 'reifier' creates a reference to an occurrence of the abstract triple term, e.g. :r rdf:reifies <<( :s :p :o )>> . The reifier :r can then be annotated, e.g. :r :a :b . None of this actually states :s :p :o . - that requires a regular RDF statement to that effect.
- << :s :p :o >> :a :b . defines and annotates the reified statement - however, without actually stating the reified triple.
- :s :p :o {| :a :b |} . defines, states and annotates the reified statement in one go.
Mapping
There are two possible ways to map the syntactic sugar of Turtle-star to bare triples in N-triples-star.
Depending on the interpretation of what it means for a statement to be un-asserted one of them provides partial support of use cases, the other one near complete support. (Full support can't be claimed because of the limitations dictated by the semantics.)
The mapping as currently defined in RDF-star with only rdf:reifies
The mapping as proposed here with rdf:reifies and rdfs:states
Mapping to N-triples-star with rdf:reifies and disagreement on annotation
Mapping to N-triples-star with rdf:reifies and disagreement on statements
nafu...
This is clearly unsatisfactory. The mapping as currently specified only works for uncontested data. If a statement that was first annotated as unasserted is later added to the graph, the meaning of the annotation may change profoundly. Conflicting viewpoints cannot be represented.
But worst of all, the syntactic sugar conveys the illusion that all use cases are covered:
suggests that we can represent and annotate statements in both unasserted and asserted form side by side - which is prerequisite to describe competing viewpoints.
But the essential detail of unstatedness is lost in translation to N-triples - which is prerequisite to storing it in an RDF database - as soon as the triple is added from another viewpoint.
I.e. users are lured into believing they expressed a much richer description than what is actually stored in the back end, and transmitted over the wire.
This is the kind of unpleasant surprise that users tend to not forget, nor forgive.
Workarounds
Some workarounds have been proposed:
However, the syntactic sugar is defined now and can't be re-mapped later.
However, that means adding a lot of triples (most annotations are on asserted statements), i.e. verbosity
and having to query for them (needing one more join), i.e. performance issues. Relying on users to add one more triple to express what seems clear in syntax is a brittle approach anyway.
However, while we agree that all those different reasons do indeed belong in domain ontologies, the basic fact of whether a statement is considered asserted or not has to be covered by RDF proper.
Pragmatic solution: rdfs:states
The two syntactic devices to express annotations in Turtle-star
capture the intuition behind all use cases. It's the mapping to N-triples that loses an important aspect
and fails to cover non-simplistic use cases where the world is more than a one-dimensional series of facts.
Hence let them map to different properties in N-triples:
This provides
More formally speaking, the properties rdf:reifies and rdfs:states could be defined as follows:
(see also An update on [Proposal: described vs stated triple terms])
Entailment ???
But isn't it entailment when reifying a triple term with rdfs:states also creates the statement described by the triple term? Strictly speaking the answer might be 'yes', and that would be a problem because we need to specify a solution to statement annotation that works in simple RDF, without RDFS/OWL/etc reasoning.
However, the mapping from annotation syntax to rdfs:states and back again can be understood as a simple "macro", because it is perfectly predictable and it works on the level of syntax mapping, not in the realm of interpretation.
I.e. as long as all interactions are channeled through Turtle-star and Sparql-star everything is safe!
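The "macro" reading can be illustrated with a small Python sketch. It is only an illustration under assumptions: triples are modelled as plain tuples, a triple term as a tagged tuple, the names triple_term and expand and the default reifier _:r are made up for this example, and rdfs:states is the property proposed in this issue, not an existing vocabulary term.

```python
RDF_REIFIES = "rdf:reifies"
RDFS_STATES = "rdfs:states"  # proposed in this issue; an assumption, not a published term


def triple_term(s, p, o):
    """Model the triple term <<( s p o )>> as a tagged tuple (illustrative encoding)."""
    return ("term", s, p, o)


def expand(s, p, o, annotations, stated, reifier="_:r"):
    """Expand one annotated Turtle-star statement into a set of plain triples.

    stated=True  models   s p o {| a b |} .   (triple asserted, linked via rdfs:states)
    stated=False models   << s p o >> a b .   (triple not asserted, linked via rdf:reifies)
    """
    link = RDFS_STATES if stated else RDF_REIFIES
    triples = {(reifier, link, triple_term(s, p, o))}
    # Annotations always attach to the reifier, never to the triple itself.
    triples |= {(reifier, a, b) for a, b in annotations}
    if stated:
        # Only the annotation syntax also asserts the triple.
        triples.add((s, p, o))
    return triples


stated_graph = expand(":s", ":p", ":o", [(":a", ":b")], stated=True)
unstated_graph = expand(":s", ":p", ":o", [(":a", ":b")], stated=False)
```

In this sketch the only difference between the two forms is the linking property and the presence of the asserted triple, which is what makes the expansion purely syntactic.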
But what to do with the following N-triples-star file containing a "dangling" 'rdfs:states' statement, but missing the actual triple:
_:r2 rdfs:states <<( :Foo :madeOf :Bar )>> ; :says :Bob . # the actual ':Foo :madeOf :Bar.' triple is missing
"Implementations MAY|SHOULD add the missing triple."
For everybody else put up warning signs saying
"Always use Turtle-star and Sparql-star, leave N-triples-star to back office machinery."
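The "add the missing triple" option could look roughly like the following Python sketch. It assumes the same illustrative encoding as before (triples as tuples, a triple term as ("term", s, p, o)); the function names dangling_states and repair are hypothetical, and whether an implementation should repair silently, warn, or ask the user is left open here.

```python
RDFS_STATES = "rdfs:states"  # proposed in this issue; an assumption, not a published term


def dangling_states(graph):
    """Return the triples that an rdfs:states statement describes but the graph lacks."""
    missing = set()
    for _, prop, obj in graph:
        if prop == RDFS_STATES:
            _, s, p, o = obj  # unpack the ("term", s, p, o) encoding
            if (s, p, o) not in graph:
                missing.add((s, p, o))
    return missing


def repair(graph):
    """Return a new graph in which every stated triple term is also asserted."""
    return set(graph) | dangling_states(graph)


# The "dangling" example from above: the reifier is there, the triple is not.
loaded = {
    ("_:r2", RDFS_STATES, ("term", ":Foo", ":madeOf", ":Bar")),
    ("_:r2", ":says", ":Bob"),
}
repaired = repair(loaded)
```

Keeping detection separate from repair leaves room for a load-time warning or a later batch job instead of an automatic fix.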
Alternatively, drop the syntactic sugar that lures users into false expectations. But that still wouldn't meet all use cases. Go back to "Add rdfs:states"...
Querying
Just as Turtle-star should be used for authoring, Sparql-star should be used for querying. Users searching for the application of a triple term in both asserted and unasserted form will have to do so explicitly, by using both syntactic forms ("Turtle-star with holes"). Users that just search for annotations on statements that are true in the graph - probably the most common use case - don't have to think twice but just use the annotation syntax, and vice versa for searches of annotations on unasserted triples. At this level no knowledge of the underlying mapping to N-triples-star with its two different properties is required.
However, searching for all instantiations of a triple term in raw triple term form requires nothing more than searching for rdf:reifies and rdfs:states, which is not hard to do either.
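As a rough illustration of such a raw lookup, the following Python sketch collects every reifier of a given triple term regardless of which of the two properties links it. The tuple encoding and the function name reifiers_of are assumptions made for this example, and rdfs:states is the property proposed in this issue.

```python
RDF_REIFIES = "rdf:reifies"
RDFS_STATES = "rdfs:states"  # proposed in this issue; an assumption, not a published term


def reifiers_of(graph, s, p, o):
    """Return (reifier, linking property) pairs for the triple term <<( s p o )>>."""
    term = ("term", s, p, o)
    return {
        (subj, prop)
        for subj, prop, obj in graph
        if prop in (RDF_REIFIES, RDFS_STATES) and obj == term
    }


graph = {
    (":s", ":p", ":o"),
    ("_:r1", RDFS_STATES, ("term", ":s", ":p", ":o")),
    ("_:r1", ":a", ":b"),
    ("_:r2", RDF_REIFIES, ("term", ":s", ":p", ":o")),
    ("_:r2", ":says", ":Bob"),
}
found = reifiers_of(graph, ":s", ":p", ":o")
```

The returned property tells the caller whether a given annotation refers to the stated or the merely described occurrence.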