Merge pull request #7 from SCAI-BIO/fix-negative-int-values-jsd
fix: handle negative int64 values in bincount
tiadams authored May 22, 2024
2 parents 4bd9895 + 13761d4 commit 8ae94f0
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions syndat/quality.py
@@ -66,6 +66,11 @@ def jsd(real: pandas.DataFrame, synthetic: pandas.DataFrame, aggregate_results:
     for col_synth in synthetic.columns:
         synthetic[col_synth] = synthetic[col_synth].astype(real[col_synth].dtype)
     if col_dtype_real == "int64" or col_dtype_real == "object":
+        # handle negative values for bincount -> shift all codes to positive (will yield same result in JSD)
+        min_value = min(real[col].min(), synthetic[col].min())
+        if min_value < 0:
+            real[col] = real[col] + abs(min_value)
+            synthetic[col] = synthetic[col] + abs(min_value)
         # categorical column
         real_binned = np.bincount(real[col])
         virtual_binned = np.bincount(synthetic[col])
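
The added lines rely on two facts: np.bincount rejects negative integers, and adding the same offset to both columns only relabels the categories, so the divergence between the two histograms is unchanged. The following is a minimal standalone sketch of that idea, not the code in syndat/quality.py. The helper names shifted_bincounts and js_divergence, the sample arrays, the hand-written base-2 divergence, and the use of minlength to align the two histograms are all assumptions made for illustration.

```python
import numpy as np


def shifted_bincounts(real, synthetic):
    """Count integer codes in both samples after a shared non-negative shift.

    Hypothetical helper for illustration; it does not modify its inputs.
    """
    real = np.asarray(real, dtype=np.int64)
    synthetic = np.asarray(synthetic, dtype=np.int64)
    # Use the minimum over BOTH samples so they are shifted by the same amount.
    lowest = min(int(real.min()), int(synthetic.min()))
    shift = -lowest if lowest < 0 else 0
    # Assumption of this sketch: pad both histograms to a common length.
    n_bins = max(int(real.max()), int(synthetic.max())) + shift + 1
    real_counts = np.bincount(real + shift, minlength=n_bins)
    synth_counts = np.bincount(synthetic + shift, minlength=n_bins)
    return real_counts, synth_counts


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (log base 2) between two count vectors."""
    p = p_counts / p_counts.sum()
    q = q_counts / q_counts.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


real = np.array([-2, -1, -1, 0, 3])
synthetic = np.array([-1, 0, 0, 3, 3])

# np.bincount raises ValueError when the input contains negative values.
try:
    np.bincount(real)
    raised = False
except ValueError:
    raised = True

real_counts, synth_counts = shifted_bincounts(real, synthetic)
jsd_shifted = js_divergence(real_counts, synth_counts)

# Relabelling both samples by any common offset leaves the divergence unchanged.
real_counts_10, synth_counts_10 = shifted_bincounts(real + 10, synthetic + 10)
jsd_relabelled = js_divergence(real_counts_10, synth_counts_10)
```

Unlike the committed change, this sketch works on copies, so the caller's arrays keep their original values after the call.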