Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plots: cleanup data extraction #6355

Merged
merged 2 commits into from
Jul 23, 2021
Merged

Conversation

pared
Copy link
Contributor

@pared pared commented Jul 22, 2021

Thank you for the contribution - we'll try to review it as soon as possible. πŸ™

@pared pared requested a review from a team as a code owner July 22, 2021 10:52
@pared pared requested a review from pmrowla July 22, 2021 10:52
Comment on lines +36 to +39
if extension in (".json", ".yaml"):
return DictData(filename, revision, content)
if extension in (".csv", ".tsv"):
return ListData(filename, revision, content)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After moving data parsing to collect in #5984 we can abstract obtained data.

Comment on lines -71 to -98
def _apply_path(data, path=None, **kwargs):
if not path or not isinstance(data, dict):
return data

import jsonpath_ng

found = jsonpath_ng.parse(path).find(data)
first_datum = first(found)
if (
len(found) == 1
and isinstance(first_datum.value, list)
and isinstance(first(first_datum.value), dict)
):
data_points = first_datum.value
elif len(first_datum.path.fields) == 1:
field_name = first(first_datum.path.fields)
data_points = [{field_name: datum.value} for datum in found]
else:
raise PlotDataStructureError()

if not isinstance(data_points, list) or not (
isinstance(first(data_points), dict)
):
raise PlotDataStructureError()

return data_points


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not supported from #3840 I recall that we have been talking about keeping it in another form, but anyway, it can always be brought back from git history.

Comment on lines -563 to -576
@pytest.mark.parametrize("data_class", [JSONPlotData, YAMLPlotData])
def test_find_data_in_dict(tmp_dir, data_class):
metric = [{"accuracy": 1, "loss": 2}, {"accuracy": 3, "loss": 4}]
dmetric = {"train": metric}

plot_data = data_class("-", "revision", dmetric)

expected = metric
for d in expected:
d["rev"] = "revision"

assert list(map(dict, plot_data.to_datapoints())) == expected


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to test_data

Comment on lines -5 to -22
from dvc.repo.plots.data import _apply_path, _find_data, _lists


@pytest.mark.parametrize(
"path,expected_result",
[
("$.some.path[*].a", [{"a": 1}, {"a": 4}]),
("$.some.path", [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}]),
],
)
def test_parse_json(path, expected_result):
value = {
"some": {"path": [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}]}
}

result = _apply_path(value, path=path)

assert result == expected_result
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before, no need for this test for now.

Comment on lines -43 to -44
def test_finding_data(fields):
data = {"a": {"b": [{"x": 2, "y": 3}, {"x": 1, "y": 5}]}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expanded scope of the test to test to_datapoints and remove redundancy in tests (test_finding_data was checking basically the same as test_find_data_in_dict).

@@ -57,7 +57,6 @@ def run(self):
"nanotime>=0.5.2",
"pyasn1>=0.4.1",
"voluptuous>=0.11.7",
"jsonpath-ng>=1.5.1",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed after #3840 and #3658

@pmrowla pmrowla merged commit 67b42eb into iterative:master Jul 23, 2021
@efiop efiop added the enhancement Enhances DVC label Jul 23, 2021
@pared pared deleted the 6145_cleanup_plots branch January 4, 2022 10:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhances DVC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants