possible to diff lists of dicts? #131
Comments
Can you provide a smaller example with current and expected results? There are some limitations to list comparison:
|
Thanks. The limitations that you noted are definitely present in the data that I'm working with. If that's expected behavior right now, that's fine. I'll need to use something other than dictdiffer for the project. |
I am confused by the statement: "it can not detect changes in the order of elements". Since: Can you expand a bit? |
Now I explained it to myself. The ideal result, at least for some use cases, would be some kind of a "swap" operation, instead of the two "changes". |
Yes, exactly. Sometimes, the order of dicts inside the list doesn't matter. You should be able to reproduce my original issue by creating a second list of dicts (based on the one that I provided), and making a few minor changes:
|
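For the case where the order of dicts inside the list doesn't matter, one possible workaround outside dictdiffer is to compare the lists as multisets. The sketch below is only an illustration: `canonical` and `unordered_diff` are hypothetical helper names, the sample data is made up, and it assumes every item is JSON-serializable.

```python
import json
from collections import Counter


def canonical(item):
    # Serialize with sorted keys so that equal dicts give equal strings.
    return json.dumps(item, sort_keys=True)


def unordered_diff(first, second):
    """Return (removed, added) items, ignoring the order of both lists."""
    before = Counter(canonical(item) for item in first)
    after = Counter(canonical(item) for item in second)
    # Counter subtraction keeps only positive counts, so duplicates are handled.
    removed = [json.loads(text) for text in (before - after).elements()]
    added = [json.loads(text) for text in (after - before).elements()]
    return removed, added


old = [{"id": 1, "tag": "a"}, {"id": 2, "tag": "b"}]
new = [{"id": 2, "tag": "b"}, {"id": 1, "tag": "a"}, {"id": 3, "tag": "c"}]
removed, added = unordered_diff(old, new)
```

Here the reshuffled items are not reported at all, and only the extra third dict shows up as added. The trade-off is that a dict whose content changed appears as one removal plus one addition rather than as a single change.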
I think your first issue is covered by @jirikuncar's explanation of list diffs - everything is "different" from the addition onward. Your second issue I think stems from the fact that dictdiffer does not compare dicts by identity, but by content - if you have shuffled the list, changing the first element, dictdiffer detects the difference in specific content within the dict, not in its identity. If you find a better solution for your needs, please share your findings here as well. |
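To see why everything looks different from an insertion onward, here is a simplified index-by-index comparison. It is a hypothetical helper written for illustration (`positional_diff` is not part of dictdiffer, and dictdiffer's actual implementation may differ in detail), but it shows the general effect of comparing lists by position.

```python
def positional_diff(first, second):
    """Compare two lists index by index (illustrative, not dictdiffer's code)."""
    changes = []
    shared = min(len(first), len(second))
    # Items at the same index are compared by content.
    for index in range(shared):
        if first[index] != second[index]:
            changes.append(("change", index, (first[index], second[index])))
    # Whatever extends past the shorter list is an addition or a removal.
    for index in range(shared, len(second)):
        changes.append(("add", index, second[index]))
    for index in range(shared, len(first)):
        changes.append(("remove", index, first[index]))
    return changes


before = ["a", "b", "c"]
after = ["a", "x", "b", "c"]
result = positional_diff(before, after)
```

A single insertion at index 1 shifts every later element, so the result contains two changes and one addition instead of a single addition.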
I have been experimenting with storing JSON structures in Git to do diffs on big nested objects. You could do something like:
$ pip install git_json_tree
$ ipython
import subprocess
import git_json_tree
repo = git_json_tree.Repo('/tmp/git_json_tree')
sha_original = git_json_tree.encode(repo, ["your example goes here"])
sha_modified = git_json_tree.encode(repo, ["your modified example goes here"])
subprocess.call(['git', 'diff', sha_original, sha_modified]) |
@jirikuncar I'm running into a similar issue with comparing both lists and sets...
list_a = ["changeA", "changeB"]
list_b = ["changeA", "changeC", "changeB"]
Dictdiff correctly finds the
I tried converting the list into sets (since they are unordered and unindexed):
set1 = set(["changeA", "changeB"])
set2 = set(["changeA", "changeC", "changeB"])
But dictdiff is still reporting a <class 'list'>: [
('add', '', [(0, {'changeC'})]),
('change', '', ({'changeA', 'changeB'}, {'changeA', 'changeC', 'changeB'}))
] |
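For comparison, the difference between the two sets above can be computed with plain set operations. This is a sketch independent of dictdiffer; `set_diff` is a hypothetical helper name and the `add`/`remove` labels are only chosen to resemble the output shown above.

```python
def set_diff(first, second):
    """Return (action, items) pairs describing how first becomes second."""
    changes = []
    added = second - first
    removed = first - second
    if added:
        changes.append(("add", added))
    if removed:
        changes.append(("remove", removed))
    return changes


set1 = {"changeA", "changeB"}
set2 = {"changeA", "changeC", "changeB"}
result = set_diff(set1, set2)
```

With these inputs only the single added element is reported, with no accompanying change entry.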
@danielduhh I have tried with latest
I can confirm that the result for sets looks wrong. Can you submit a PR with test case and expected results? |
@danielduhh can you check if #133 is solving your problem? |
FYI, this will be useful for those of us wishing to implement Strategic Three Way Merge Patch for the Kubernetes API :) for things like deployments, where you can have big lists of environment variables that look like so: {
"name": "MEDIA_URL",
"valueFrom": {
"configMapKeyRef": {
"key": "MEDIA_URL",
"name": "my-secret"
}
}
} AFAIK, as of dictdiffer 0.8.1, I'm still getting the wrong results for lists of dicts like this. I don't think you could solve this in a truly generalized way, but allowing schema-based diff could be a good approach. |
@Datamance basically you would like to provide a custom |
That could work, and I think you could use that as a building block for the common case of schema-based (particularly JSONSchema) diffing/patching. |
So basically one should provide mapping from
def diff(..., key=None):
# after dotted_node ~ :158
if dotted_node in key:
for item in key[dotted_node](first, second, node=node, ignore=ignore, path_limit=path_limit, expand=expand, tolerance=tolerance, dot_notation=dot_notation):
yield item
return |
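As a rough standalone illustration of the idea discussed above, the sketch below matches dicts across two lists by an identifying field instead of by position. Everything in it is an assumption made for the example: `diff_by_key` is a hypothetical helper (not a dictdiffer function), the sample environment variables are made up, and the key values are assumed to be unique and hashable within each list.

```python
def diff_by_key(first, second, key):
    """Match dicts across two lists by item[key] instead of by position."""
    old = {item[key]: item for item in first}
    new = {item[key]: item for item in second}
    changes = []
    # Walk the old items first: anything missing or different is reported.
    for name, item in old.items():
        if name not in new:
            changes.append(("remove", name, item))
        elif item != new[name]:
            changes.append(("change", name, (item, new[name])))
    # Then report items that exist only in the new list.
    for name, item in new.items():
        if name not in old:
            changes.append(("add", name, item))
    return changes


old_env = [
    {"name": "MEDIA_URL", "value": "/media/"},
    {"name": "DEBUG", "value": "0"},
]
new_env = [
    {"name": "DEBUG", "value": "1"},
    {"name": "MEDIA_URL", "value": "/media/"},
    {"name": "STATIC_URL", "value": "/static/"},
]
result = diff_by_key(old_env, new_env, key="name")
```

Because items are paired by `name`, the reordering is ignored and only the changed `DEBUG` entry and the new `STATIC_URL` entry are reported. A schema-based approach could supply the identifying field per list path rather than passing it by hand.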
Any updates on this? |
@aorumbayev feel free to send a PR with a fix discussed above. |
I'm trying to use dictdiffer to get the diff of two lists of dictionaries, but it's not working well. It seems to get confused whenever the ordering of the dicts inside the list is not identical between the two lists, or when the number of dictionaries in one list is (slightly) different from the other list.
Is this functionality something that is currently possible and expected to work, or am I trying to accomplish something unsupported?
In case it matters, this is an example of a list of dictionaries that I'm trying to work with: