-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for #123 - Add support for preservation of unknown properties #124
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Left just a few comments that are here more for discussion.
@@ -122,6 +136,8 @@ public void deserialize(T into, ObjectReader reader, Context ctx) { | |||
} else { | |||
reader.skipValue(); | |||
} | |||
} else if (unknownPropertyHandler != null) { | |||
unknownPropertyHandler.onUnknownProperty(into, propName, ctx.genson.deserialize(UNKNOWN, reader, ctx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means that if a property handler is registered it is responsible of being able to handle missing properties for all targeted types.
Another option could be to actually implement failOnMissingProperty
as a special UnknownPropertyHandler, and chain them. That would however complicate a bit UnknownPropertyHandler as it would need to "signal" if it did handle the missing property or not. Just food for thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see handler and failOnMissingProperty
as mutually exclusive options, or rather, the latter is just a special case of the former, which we want to leave as-is for backwards compatibility, but isn't strictly necessary -- anyone can write a handler that fails immediately, if that's what they want to.
More likely than not, people will use either one UPH (probably the default one we provide) or none, and if they do use one they will not want to fail on missing properties, so I really don't see a point of complicating the design. In the unlikely event that they do need chained/composite UPH, they can easily write one themselves that does exactly what they need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's a special case of unknown property handling, is the main reason why I was saying it could be reimplemented that way. The advantage I see to it is, unified logic and removal of unnecessary code.
Preserving some of the compatibility could be achieved by translating the boolean failOnMissingProperty
into a handler at the builder level (and remove it from the other places like Genson and BeanDescriptor, which are less likely to be directly used).
I agree that the chaining part is not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see... Yes, that would probably make sense.
We just need to document that behavior, but considering that they are already mutually exclusive, we should probably document it anyway.
Let me rework the code along those lines and I'll update the PR.
Map<String, Object> unknownProperties(); | ||
} | ||
|
||
static class EvolvableHandler implements UnknownPropertyHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth providing a simple impl like that as part of the lib.
In that case I think it might be better to not mutate the result of unknownProperties (for ex. this would then fail for an immutable map) and instead just define a addProperty method. Also some alternative namings could be PropertyBag, WithUnknownProperties, UnknownProperties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I (mostly) agree :-)
The reason I didn't do it is because any useful implementation must make certain assumptions about how the data class is implemented (ie. it must implement Evolvable
and expose a holder map). If Java had mix-ins this would be a no-brainer, but unfortunately it doesn't, so it's really a question of whether we think an interface provides enough value to include it, and to create a default UPH implementation. If you feel we should, I'm not against it.
As for not mutating property map directly, it can get complicated. I originally had what you are suggesting, a pair of addUnknownProperty/getUnknownProperties
methods. However, in that case you'd end up with the unknown properties serialized twice -- once as unknownProperties
map, because of the getter, and once by the BeanDescriptor
because they are, well, unknown properties.
In the end I decided that the simplest solution was to expose the mutable map via a single method that won't impact normal serialization. Now, the issue is still there -- if the field-based serialization was enabled, we'd have to mark the field itself with @JsonIgnore
or make it transient. Again, a lot of this depends on personal preferences, and for test purposes I just did the easiest thing that worked. If we decide to add a standard implementation into Genson, we should probably discuss how it should look like and what the external contract and semantics should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that a default impl would be useful if we were able to address the points you mention.
That way the user would only have to implement the contract without needing to also implement the handler and we could mitigate the double ser. problem.
- To address the later problem we could provide an abstract class instead of an interface (mark the field as transient and annotate the methods). This would still be annoying for people that can't extend the class but would work in every case for all the others.
OR/AND (we can do both)
- Have an interface with JsonIgnore (or use names that don't match std get/set patterns like you did in this test) and as a bonus fix this issue Merge JsonIgnore like annotations #85 to merge JsonIgnore with subclasses.
Fixing this issue is not trivial, so that could be done in a second time later on. The main problem is due to the fact that exclusion of properties happens before merging of annotations on all discovered properties. This is an useful feature to have...
OR
- Do something like this hacky idea from https://stackoverflow.com/a/25907755/840441. Though this is probably too hacky for what we need :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can certainly do 1) and 2) and provide both an interface and an abstract base class that implements it.
I'm not a big fan of base classes, as most Java developers are not happy when you force them to extend something and take away their only opportunity for inheritance, but I guess providing it as an option and not something that's required softens the blow a bit -- they can always choose not to use our base class and implement the interface directly in their own class(es).
The option 3) scares me... ;-)
There is one issue with this PR that we still need to address. At the moment, I am reading and deserializing unknown properties just as I would if the were known, but there is a subtle difference. With unknown properties, there is a chance that an older version of an app will attempt to read it, and find I see two ways around it:
Thoughts? |
Wouldn't option 1) make it close to useless from an usage point of view? As it's unlikely that the user will be able to do anything with it. Also if the user was to programmatically do something with the unknwon properties, it will be hard for him to know if this is a simple string or actually a raw json value. Also it feels weird that the same document would be deserialized to different target types depending if the class is present or not. Maybe unknwon properties should always be deserialized to a This makes me think that none of that would be customizable at the handler level. Perhaps this is where it would make more sense to do all this? |
Well, in case of option 1), if you don't know what the data is and aren't even aware it exists, there probably isn't much you can do with it anyway, so it's not as big of an issue as it seems and completely avoids deserializing something you likely don't care about into potentially large object graph that is nothing but garbage. A simple String or even byte[] tend to work just fine for temporary storage. That said, I'm not quite sure how to implement it. It seems like the best option would be to have I am not opposed to option 2) either, but in that case what options do you have in mind for preventing converters from kicking in? Finally, I do agree with the last comment -- it would be nice if we could let the handler itself decide how to read and write unknown properties, instead of us making the decision for it. Maybe something along the lines of: public interface UnknownPropertyHandler {
/**
* Called whenever a property is encountered in a JSON document
* that doesn't have a corresponding {@link PropertyMutator}.
* <p>
* Typically, the implementation of this interface concerned
* with schema evolution will handle this event by storing
* property value somewhere so it can be written by the
* {@link #writeUnknownProperties} method.
*
* @param target the object we are deserializing JSON into
* @param propName the name of the unknown property
* @param reader the ObjectReader to read property value from
* @param ctx deserialization context
*/
void onUnknownProperty(Object target, String propName, ObjectReader reader, Context ctx);
/**
* Write unknown properties encountered during deserialization.
* <p>
* This method can be optionally implemented by {@code UnknownPropertyHandler}s
* that want to write unknown properties during serialization. The default
* implementation is a no-op.
*
* @param source the object we are serializing into JSON
* @param writer the ObjectReader to read property value from
* @param ctx serialization context
*
* @return a map of unknown properties
*/
default void writeUnknownProperties(Object source, ObjectWriter writer, Context ctx) {
}
} Thoughts? |
I made the changes based on the discussion above, and will submit a new PR. Closing this one as obsolete. |
The low level streaming code is probably the less friendly part in Genson. Initially the impl was much simpler, but then I went to a few extremes trying to improve speed to match up with Jackson. Things that were optimizations a couple years back are probably less useful today... I think 2) will be the easiest to achieve. The few options that come to mind would be to:
For the last point the target object might not be yet constructed. Also I'm not sure about not writing back unknown properties, it seems like a nice property to be able to round trip data. |
BTW you don't have to open a new PR for the changes. You can just update your branch. Force pushing to it is fine as anyway it's your work branch that no one else is pushing to. |
Yeah, I know... The main reason I created a new PR was because I originally committed fix to my master instead of a branch by accident, which makes it hard to sync with the upstream master or work on other, unrelated issues. Anyway, it's all cleaned up and moved now, see #127 for details. |
Had to revert back to Java 1.7 in order for Travis tests to pass, but it will work with Java 8 as well, once #122 is merged.