Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySql type VARCHAR with *_bin collation incorrectly interpreted as VARBINARY Column #3387

Open
DrewMcArthur opened this issue Jul 27, 2024 · 16 comments · May be fixed by #3400
Open

MySql type VARCHAR with *_bin collation incorrectly interpreted as VARBINARY Column #3387

DrewMcArthur opened this issue Jul 27, 2024 · 16 comments · May be fixed by #3400
Labels

Comments

@DrewMcArthur
Copy link

Is your feature request related to a problem? Please describe.
I have a column in my MySql DB: Column: FileName, Collation: utf8mb4_0900_bin, Definition: varchar(255), which when I try to convert to a struct:

#[derive(sqlx::FromRow)]
#[sqlx(rename_all = "PascalCase")]
struct Row {
  file_name: String
}

throws an error:

error occurred while decoding column "FileName": mismatched types; Rust type
  | `alloc::string::String` (as SQL type `VARCHAR`) is not compatible with SQL
  | type `VARBINARY`

Describe the solution you'd like
This conversion should work, since the actual type is a VARCHAR, not VARBINARY

Describe alternatives you've considered
I could store these variables as Vec<u8>, or manually impl FromRow, but that's obviously not ideal.

Additional context
I assume the collation is what's causing sqlx to interpret the type as VARBINARY?

@DrewMcArthur DrewMcArthur added the enhancement New feature or request label Jul 27, 2024
@abonander
Copy link
Collaborator

The problem is that the only way we know the difference between Vec<u8> and String is by checking the BINARY flag: https://github.com/launchbadge/sqlx/blob/main/sqlx-mysql/src/protocol/text/column.rs#L203

This affects the macros because they use TypeInfo::compatible() to decide what Rust type to map to.

@abonander
Copy link
Collaborator

Alternatively, we could special-case for certain _bin collations. Ironically we'd be re-adding similar checks to those we just removed in #2652.

@DrewMcArthur
Copy link
Author

hm. so i see on line 170 of that file

        let is_binary = flags.contains(ColumnFlags::BINARY);

but I don't think this is necessarily correct - based on 12.8.5 of the mysql docs

Binary strings (as stored using the BINARY, VARBINARY, and BLOB data types) have a character set and collation named binary. ...
Nonbinary strings (as stored using the CHAR, VARCHAR, and TEXT data types) have a character set and collation other than binary.

So it seems like VARCHAR columns with a ..._bin collation are being treated as VARBINARY since that ColumnFlags::BINARY is set. I didn't find where that comes from after a quick look, but I think all that's needed is a bit more specificity there, in order to more closely match up with what the MySQL docs are saying in what I quoted above.

This is low priority since I'm able to just manually convert String::from_utf8(Vec<u8>) for now, but this sounds like a fun problem to mess around with, if you wouldn't mind my contribution & giving me some guidance along the way!

@DrewMcArthur DrewMcArthur changed the title Decode MySql type VARBINARY to String MySql type VARCHAR with *_bin collation incorrectly interpreted as VARBINARY Column Jul 29, 2024
@abonander
Copy link
Collaborator

So it seems like VARCHAR columns with a ..._bin collation are being treated as VARBINARY since that ColumnFlags::BINARY is set. I didn't find where that comes from after a quick look, but I think all that's needed is a bit more specificity there, in order to more closely match up with what the MySQL docs are saying in what I quoted above.

That's something I've really come to loathe about MySQL. The documentation generally runs the gamut from decent but incomplete, to misleading, to nonexistent, and nothing does what it says on the tin. We just recently addressed a similar issue that had to do with MYSQL_TYPE_ENUM not actually being the proper type designation for enums.

@abonander
Copy link
Collaborator

cc @alu (from #2652)

@DrewMcArthur
Copy link
Author

@alu from your comment on #3390,

It seems to work as expected if BINARY is also accepted by compatible

I think this is the source of the issue for me. When I SHOW EXTENDED FULL FIELDS FROM File;, the column in question looks like:

Field Type Collation ...
FileName varchar(255) utf8mb4_0900_bin ...

So I think the issue is that this collation returns the BINARY flag, that doesn't mean it isn't compatible with the Rust String type. If the collation is literally binary (per my comment above), then yes it isn't compatible, but if the collation is *_bin (like this column), it should be compatible.

billy1624 added a commit to SeaQL/sea-schema that referenced this issue Aug 2, 2024
billy1624 added a commit to SeaQL/sea-schema that referenced this issue Aug 2, 2024
@alu
Copy link
Contributor

alu commented Aug 2, 2024

@abonander @DrewMcArthur
Could these suffixes be considered interchangeable as well?

  • *_ai/*_as (Accent Insensitive/ Accent Sensitive)
  • *_ci/*_cs (Case Insensitive/Case Sensitive)

In other words, are they considered incompatible only if collation is binary?

@DrewMcArthur
Copy link
Author

@alu I don't know about interchangeable per se, but they should all be compatible with Rust's String type, yes.

@DrewMcArthur
Copy link
Author

DrewMcArthur commented Aug 2, 2024

here's another tidbit of helpful documentation: 13.3.4 The BLOB and TEXT Types

BLOB values are treated as binary strings (byte strings). They have the binary character set and collation, and comparison and sorting are based on the numeric values of the bytes in column values. TEXT values are treated as nonbinary strings (character strings). They have a character set other than binary, and values are sorted and compared based on the collation of the character set.

From this, I would think that the ColumnType::Blob wouldn't be compatible here, unless those are stand-ins for a separate (but, as the docs say, nearly identical) ColumnType::Text

If you use the BINARY attribute with a TEXT data type, the column is assigned the binary (_bin) collation of the column character set.

I assume "the BINARY attribute" refers to ColumnFlags::BINARY. So really, I think that compatible function should look more like this:

    fn compatible(ty: &MySqlTypeInfo) -> bool {
        // TODO: Support more collations being returned from SQL?
        matches!(
            ty.r#type,
            ColumnType::VarChar
                | ColumnType::Blob
                | ColumnType::TinyBlob
                | ColumnType::MediumBlob
                | ColumnType::LongBlob
                | ColumnType::String
                | ColumnType::VarString
                | ColumnType::Enum
        ) && !ty.collation == Collation::BINARY
    }

of course, this requires adding back a lot of the PR that abonander mentioned earlier, but I think removing collation from consideration conflated ColumnFlags::BINARY with Collation::BINARY. Since TEXT with a utf8_bin collation would have the ColumnFlags::BINARY flag, and would be treated as an incompatible BLOB, which isn't quite right

billy1624 added a commit to SeaQL/sea-schema that referenced this issue Aug 9, 2024
* Bump SQLx to 0.8

* Bump sea-query

* Fix clippy warnings

* Blocking Issues: upgrade to SQLx 0.8 on hold, just testing a workaround, not an ideal fix (launchbadge/sqlx#3387)

* GetMySqlValue

* Bump sea-query
@abonander abonander added bug db:mysql Related to MySQL macros and removed enhancement New feature or request labels Aug 16, 2024
@abonander abonander pinned this issue Aug 16, 2024
@yuyang-ok

This comment was marked as spam.

@alvico
Copy link

alvico commented Oct 17, 2024

I am also getting this error, is there any workaround?

@brianheineman
Copy link
Contributor

I am also getting this error, is there any workaround?

I was able to revert to version 0.7.4 to workaround the issue.

@alvico
Copy link

alvico commented Oct 18, 2024

sadly this is not an option for our stack atm

@DrewMcArthur
Copy link
Author

@alvico you might be able to use my branch. just replace the version in your Cargo.toml to point at the branch for my pull request attached to this issue, i believe it's #3400?

@alvico
Copy link

alvico commented Oct 23, 2024

sadly is not an option for us since getting it from your PR is making the compile complain that we need tokio-runtime feature and does not work even if I added it explicitly everywhere we have a toml file so, we'll need to wait for your release I guess.

@Jansora
Copy link

Jansora commented Nov 13, 2024

I am also getting this error, is there any workaround?

I was able to revert to version 0.7.4 to workaround the issue.

same error found in 0.8.2.
err:error occurred while decoding column "source_type": mismatched types; Rust type alloc::string::String (as SQL type VARCHAR) is not compatible with SQL type VARBINARY

down to 0.7.4 work well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants