valid_percent in utils/get_array_statistics not correctly calculated, when using feature coverage array #774

MarcelCode · 2025-01-02T12:10:07Z

If I calculate the coverage array from a geometry and pass it to the get_array_statistics method, I would expect valid_percent to be calculated relative to the coverage array rather than the bounding box of the geometry.

In this example valid_pixels is calculated correctly, but valid_percent is wrong:

There are 2 valid pixels, because one pixel is nodata.
As the geometry covers three pixels, I would expect valid_percent to be 66.67 %; instead it is 50 % (the 2 valid pixels divided by the full array size).

Here is a complete, runnable example:

from rio_tiler.io import Reader

def calculate_statistics(filename: str, shape: dict) -> dict:
    with Reader(filename) as src:
        data = src.feature(shape)

        coverage_array = data.get_coverage_array(shape)

        return data.statistics(coverage=coverage_array)

if __name__ == "__main__":
    file = "https://rastless-tests.s3.eu-central-1.amazonaws.com/TUR_us-newyork_013032_EOMAP_20190424_153304_LSAT8_m0030_32bit.tif"

    aoi = {"type": "Polygon", "coordinates": [
        [[-74.046461218997848, 40.63946290226923], [-74.046464353686744, 40.638997869937697],
         [-74.045818607775814, 40.638991923211712], [-74.045817040431388, 40.639200058305974],
         [-74.046160288864613, 40.639204815671974], [-74.046161856209054, 40.63946290226923],
         [-74.046461218997848, 40.63946290226923]]]}


    statistics = calculate_statistics(file, aoi)

    valid_pixels = statistics["b1"]["valid_pixels"]
    valid_percent = statistics["b1"]["valid_percent"]

    print(f"Valid pixels: {valid_pixels}. Expected value: 2.0")
    print(f"Valid percent: {valid_percent} %. Expected value: 66.67 %")

From what I can see in the code, this line seems to be wrong, because it always calculates valid_percent from the full array size (get_array_statistics in utils.py, line 136):

valid_percent = round((valid_pixels / data[b].size) * 100, 2)

My suggestion is to change it to:

valid_percent = round((valid_pixels / np.count_nonzero(coverage)) * 100, 2)

If the user does not pass a coverage array, the calculation is still correct, because a coverage array is then created inside the method.
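To make the difference between the two formulas above concrete, here is a minimal NumPy-only sketch. It is not the rio-tiler implementation: the 2x2 array, the coverage values, and the helper names `valid_percent_bbox` and `valid_percent_coverage` are made up for illustration, chosen so that three pixels intersect the geometry and one of those three is nodata.

```python
import numpy as np

# Hypothetical 2x2 window standing in for the bounding box of the geometry.
# mask=True marks pixels that are not valid (nodata or outside the geometry).
band = np.ma.MaskedArray(
    data=np.array([[1.0, 2.0], [0.0, 0.0]]),
    mask=np.array([[False, False], [True, True]]),
)

# Assumed coverage fractions: three pixels intersect the geometry, one does not.
coverage = np.array([[1.0, 1.0], [1.0, 0.0]])


def valid_percent_bbox(band: np.ma.MaskedArray) -> float:
    """Current formula: valid pixels relative to the full array size."""
    valid_pixels = float(np.ma.count(band))
    return float(round((valid_pixels / band.size) * 100, 2))


def valid_percent_coverage(band: np.ma.MaskedArray, coverage: np.ndarray) -> float:
    """Proposed formula: valid pixels relative to the pixels the geometry touches."""
    valid_pixels = float(np.ma.count(band))
    return float(round((valid_pixels / np.count_nonzero(coverage)) * 100, 2))


print(valid_percent_bbox(band))                # 50.0
print(valid_percent_coverage(band, coverage))  # 66.67
```

Note that `np.count_nonzero(coverage)` counts every pixel with a non-zero coverage fraction as one full pixel, so a partially covered pixel weighs the same as a fully covered one in the denominator.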

I would be happy to create a PR if you agree.

vincentsarago · 2025-01-03T11:15:54Z

thanks @MarcelCode I think the change proposed makes sense 🙏

happy to review a PR

MarcelCode added a commit to MarcelCode/rio-tiler that referenced this issue Jan 5, 2025

Fix issue cogeotiff#774: Use coverage array for calculation of valid_…

2261812

…percent statistics

MarcelCode mentioned this issue Jan 5, 2025

Fix issue #774: Use coverage array for calculation of valid_percent #775

Merged

vincentsarago pushed a commit that referenced this issue Jan 6, 2025

Fix issue #774: Use coverage array for calculation of valid_percent s…

ac4492d

…tatistics (#775)
