capitalone · fdosani · Jan 9, 2025 · Jan 9, 2025
@@ -4,4 +4,6 @@
 - Mark Zhou
 - Ian Whitestone
 - Faisal Dosani
-- Lorenzo Mercado
+- Lorenzo Mercado
+- Jacob Dawang
+- Raymond Haffar
@@ -7,11 +7,19 @@
 ![PyPI - Downloads](https://img.shields.io/pypi/dm/datacompy)
 
 
-DataComPy is a package to compare two Pandas DataFrames. Originally started to
-be something of a replacement for SAS's ``PROC COMPARE`` for Pandas DataFrames
-with some more functionality than just ``Pandas.DataFrame.equals(Pandas.DataFrame)``
-(in that it prints out some stats, and lets you tweak how accurate matches have to be).
-Then extended to carry that functionality over to Spark Dataframes.
+DataComPy is a package to compare two DataFrames (or tables) from libraries such as Pandas,
+Spark, Polars, and even Snowflake. It was originally created as something of a replacement
+for SAS's ``PROC COMPARE`` for Pandas DataFrames, with more functionality than
+just ``Pandas.DataFrame.equals(Pandas.DataFrame)``: it prints out some stats
+and lets you tweak how accurate matches have to be. Supported types include:
+
+- Pandas
+- Polars
+- Spark
+- Snowflake (via Snowpark)
+- Dask (via Fugue)
+- DuckDB (via Fugue)
+
 
 ## Quick Installation
 

@@ -2,13 +2,7 @@ datacompy Roadmap
 -----------------
 
 At this current time ``datacompy`` is in a stable state. We are planning on continuing to
-add features and functionality as the community of users asks for them, but there are no 
+add features and functionality as the community of users asks for them, but there are no
 pressing issues which we are looking to add in immediately.
 
-There are some longer term issues which are open for people to work on, and some which are more of a nice to have.
-We are looking for contributors and also maintaners to help with the project.
-
-- Add in docs how to change the number of mismatches in report `#6 <https://github.com/capitalone/datacompy/issues/6>`_
-- Make duplicate handling better `#7 <https://github.com/capitalone/datacompy/issues/7>`_
-- Refactor Spark datacompy   `#13 <https://github.com/capitalone/datacompy/issues/13>`_
-- Drop Python 3.7 suport  `#173 <https://github.com/capitalone/datacompy/issues/173>`_
+Please check the issues section of the repository for the most up-to-date list of open items.
@@ -5,7 +5,7 @@ Overview
 --------
 
 The main goal of ``datacompy`` is to provide a human-readable output describing
-differences between two dataframes.  For example, if you have two dataframes
+differences between two dataframes. For example, if you have two dataframes
 containing data like:
 
 df1
@@ -289,4 +289,4 @@ There's a number of limitations with ``datacompy``:
 
        #Numpy testing
        npt.assert_array_equal(arr1, arr2)
-       npt.assert_almost_equal(obj1, obj2)
+       npt.assert_almost_equal(obj1, obj2)
@@ -3,12 +3,14 @@ name = "datacompy"
 description = "Dataframe comparison in Python"
 readme = "README.md"
 authors = [
+  { name="Faisal Dosani", email="[email protected]" },
   { name="Ian Robertson" },
   { name="Dan Coates" },
-  { name="Faisal Dosani", email="[email protected]" },
 ]
 maintainers = [
-  { name="Faisal Dosani", email="[email protected]" }
+  { name="Faisal Dosani", email="[email protected]" },
+  { name="Jacob Dawang", email="[email protected]" },
+  { name="Raymond Haffar", email="[email protected]" },
 ]
 license = {text = "Apache Software License"}
 dependencies = ["pandas<=2.2.3,>=0.25.0", "numpy<=2.2.0,>=1.22.0", "ordered-set<=4.1.0,>=4.0.2", "polars[pandas]<=1.17.1,>=0.20.4"]