Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resample_cube_spatial should not crop the data spatially #257

Open
clausmichele opened this issue Jun 5, 2024 · 6 comments
Open

resample_cube_spatial should not crop the data spatially #257

clausmichele opened this issue Jun 5, 2024 · 6 comments

Comments

@clausmichele
Copy link
Member

See here: Open-EO/openeo-processes#509

@ValentinaHutter
Copy link
Collaborator

#300

@clausmichele
Copy link
Member Author

@ValentinaHutter the latest implementation of resample_cube_spatial breaks our notebook, could you please have a look? The returned error is: Exception: Coordinates could not be aligned!

but the process is designed exactly to align different resolutions/projections.

https://github.com/EO-College/cubes-and-clouds/blob/main/lectures/2.3_data_access/exercises/23_data_access_resample.ipynb

@ValentinaHutter
Copy link
Collaborator

I now created a new PR to fix this: #314

I used the following code to test the process directly in openeo-processes-dask, but will not merge it in, because for these tests, we do not want to depend on an external stac collection:

import pytest
from openeo_pg_parser_networkx.pg_schema import TemporalInterval, BoundingBox

from openeo_processes_dask.process_implementations.cubes.resample import (
    resample_cube_spatial,
)
from openeo_processes_dask.process_implementations.cubes.load import load_stac


def test_resample_cube_spatial(
):
    url = "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a"

    spatial_extent = BoundingBox.model_validate({"west": 11.4, "east": 11.42, "south": 45.5, "north": 45.52})
    temporal_extent = TemporalInterval.model_validate(["2023-06-01", "2023-06-30"])
    bands = ["red","nir"]

    s2_cube = load_stac(url=url,
        spatial_extent=spatial_extent,
        temporal_extent=temporal_extent,
        bands=bands
        )
    url = "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2"
    bands = ["red","nir08"]
    l8_cube = load_stac(url=url,
        spatial_extent=spatial_extent,
        temporal_extent=temporal_extent,
        bands=bands)

    
    output_cube = resample_cube_spatial(
        data=l8_cube, target=s2_cube, method="average"
    )
    print(s2_cube[s2_cube.openeo.x_dim[0]].values == output_cube[output_cube.openeo.x_dim[0]].values)

    output_cube = resample_cube_spatial(
        data=s2_cube, target=l8_cube, method="average"
    )
    print(l8_cube[l8_cube.openeo.x_dim[0]].values == output_cube[output_cube.openeo.x_dim[0]].values)

    assert (l8_cube.shape == output_cube.shape)

I put this into a python file like tests/test_reample_s2.py and tested it with poetry run python -m pytest tests/test_resample_s2.py

In this test, the resample_cube_spatial should run through successfully and create the matching coordinates - however since the resample_cube_spatial does not crop the data spatially as discussed before, the result will not have the same number of coordinates as the target - so users need to take care of cropping the data themselves.

Another thing to mention, that I was not sure about, was, what to do if the resampled coordinates are exactly shifted by half the resolution compared to the target, so arrays would be for example:

a = [ 687450. 687480. 687510. 687540. 687570 ]
b = [ 687435. 687465. 687495. 687525. 687555. 687585. ]

where a[0] is as close to b[0] as a[0] to b[1] - so the shilf is exactly resolution/2.

@clausmichele
Copy link
Member Author

clausmichele commented Jan 17, 2025

Hi @ValentinaHutter. I'm trying to understand your implementation of the process. To me it appears that you are firstly resampling/reprojecting and then forcing the result to be on a different grid, which should be the same of the target, am I wrong? Because if it is like that, it's conceptually wrong, since the computed values correspond to different points and we can't just modify the associated coordinates. Please correct me if I'm wrong!

In this test, the resample_cube_spatial should run through successfully and create the matching coordinates - however since the resample_cube_spatial does not crop the data spatially as discussed before, the result will not have the same number of coordinates as the target - so users need to take care of cropping the data themselves.

That would be fine, it would be enough to add a filter_bbox step after resample_cube_spatial.

@ValentinaHutter
Copy link
Collaborator

What I was trying to do was to solve the problem mentioned in #300 (comment)
might be that I misunderstood something though.

So, basically the old implementation took the coordinates from the target, resampled the data and cropped it to the target. My first update #300, I did to stop the cropping, and what it did was that it took crs and resolution from the target and resampled the data based on this. This would result in coordinates like the ones in the example above:
a = [ 687450. 687480. 687510. 687540. 687570 ]
b = [ 687435. 687465. 687495. 687525. 687555. 687585. ]

So, what I did in the next PR #306 was to align the coordinates based on the target coordinates.

Writing this, I notice that what I did is kind of like an approximation to the exact values. So, I guess depending on the method this would need to be updated, right? So, the method I used is basically the "nearest" pixel to the target pixel.

Does this make sense to you?

@ValentinaHutter
Copy link
Collaborator

Because if it is like that, it's conceptually wrong, since the computed values correspond to different points and we can't just modify the associated coordinates.

I think, I see, where my misunderstanding came from - my first thought was that if I reproject my data with the resample_spatial implementation and the result is something like:

a = [ 687450. 687480. 687510. 687540. 687570 ]
b = [ 687451. 687481. 687511. 687541. 687571 ]

where a are the coordinates of the target and b are the coordinates of the data, after applying resample_spatial and I would just replace the data coordinates b with the closest ones from the target a - in this example they are shifted by a value of 1 - so I would consider them to be close to each other and assume it to be fine to replace them... but this is not the case for the example with:

a = [ 687450. 687480. 687510. 687540. 687570 ]
b = [ 687435. 687465. 687495. 687525. 687555. 687585. ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants