-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ENUM type into DBAL 4 #6465
Comments
I think of that article every time somebody mentions enums. Out of the 8 reasons it lists, which ones exactly would you say are no longer valid/fixed and why? |
Well, 1. Data isn't being treated like dataThe critique that ENUM misplaces data by embedding it in schema metadata rather than storing it as standard data overlooks its purpose of enforcing data integrity. ENUM is beneficial for fields with a stable and limited set of values, ensuring that only valid, predefined options are entered. This is akin to other data constraints that prevent invalid data types or null values in specific fields, thus maintaining data quality and consistency without additional application-layer checks. 2. Changing the member list of ENUM columns is very expensiveThe concern that altering ENUM lists in MySQL is resource-intensive has been addressed in more recent versions. MySQL 8.0 and later offer improved capabilities for managing ALTER TABLE operations that involve ENUMs. These versions have been optimized to handle certain schema modifications more efficiently, potentially without requiring a full table rebuild, depending on the specific change and context. In MySQL 8.0.17 and newer, certain types of ALTER TABLE operations can be performed instantly — specifically, those that don't modify the storage requirements of the underlying ENUM values. For instance, adding new ENUM values that don't cause the number of potential values to exceed 255 (the limit for a single byte representation) can often be handled without a full table copy. However, operations that change ENUMs from a one-byte to a two-byte representation because the list exceeds 256 distinct values still require a more involved process. Moreover, the application of InnoDB’s online DDL capabilities allows for many 3.It's impossible to add additional attributes or related infoThe criticism that Moreover, if we store the value as a varchar (which is current approach in Doctrine) and not as a JOIN TABLE, this reason applies equally, so 4. Getting a list of distinct ENUM members is a pain.Well this is debatable. 5. ENUM columns may only offer limited or negligible effects on optimization.The argument that ENUM's performance optimization is negligible until a certain scale is reached overlooks the specific contexts and environments where ENUM's efficiencies become significant. Efficient Storage and Access: ENUM types are stored as integers internally, which can lead to more efficient data storage and faster access compared to strings, especially in tables with a large number of rows and frequent access patterns that involve these ENUM columns. This storage efficiency reduces the database size and can enhance performance by minimizing I/O operations, which is crucial for high-volume transaction systems. Reduction of Joins: While modern SQL databases handle joins efficiently, eliminating unnecessary joins can simplify query execution plans. This is particularly beneficial in read-heavy databases where reducing the number of joins can directly impact the response time and system load. ENUMs eliminate the need for joins to reference tables for each query, thereby simplifying the underlying SQL and potentially reducing the execution time for queries involving these ENUM columns. In conclusion, while ENUM may not always offer significant performance benefits over reference tables, in scenarios where the dataset and query patterns align with ENUM's strengths, it can provide both performance improvements and schema simplifications. Therefore, the decision to use ENUM should be driven by specific needs and empirical data rather than a one-size-fits-all approach. 6. You can't reuse the member-list of an ENUM column in other tables.The limitation of not being able to reuse ENUM member lists across tables is a valid point but underscores ENUM's intended use for specific, unchanging data sets. ENUM is optimal for attributes with a stable and consistent set of values, enhancing performance by reducing storage and ensuring data integrity without the complexities of join operations. For cases requiring shared and dynamic lists across tables, reference tables are more suitable. The choice between ENUM and reference tables should be based on the specific stability and performance needs of the application, balancing simplicity and flexibility. 7. ENUM columns have noteable gotchas.The concern that ENUM columns do not robustly enforce data integrity when invalid values are inserted—resulting in the insertion of an empty string (' ') or the numeric index being confused with the value—highlights a need for careful ENUM usage, yes. However just because a data type behaves a bit "strange", does not mean it behaves unexpectedly. As with other types, you have to know what you do when using this specific data type. Many other types have their own quirks. E.g. Strict SQL Mode: When strict SQL mode is enabled in MySQL, the database will reject invalid ENUM values outright instead of substituting them with an empty string or incorrect index. This behavior aligns closer to what you would expect from using a reference table with foreign keys, as it enforces data integrity by preventing invalid data entry at the database level. 8. ENUM has limited portability to other DBMS.Yes, true. There is
I think that it should be up to the developer to decide whether to use ENUMs or not. For me personally, the data inside my DB pose higher value to me than the application. Also, maybe my application is not the only one that will write to the database, and while I can make sure I always write in proper "allowed" values into my |
I would be open to adding an enum type to DBAL. If you want to work on this, please go ahead. But be warned that this won't be a trivial feature. If we add this type, we need to do it right. This means that you will need to build an extensive functional (!) and portable (!) test suite for this feature, covering:
This is an abstraction layer with support for multiple DBMS. While building this feature, you will debug issues with databases you've never worked with before. Be prepared to spin up a DB2 in Docker. ✌🏻
Absolutely not. The old type comments won't help you here. We must be able to introspect a table that has no comments. And for sure, we must handle the case that the deployed This means that the To me, using an enum in MySQL and using one in PHP are two related but different concerns. I'd like to avoid luring people into thinking that they need to switch the whole database to enum columns just because they use enums in PHP. Mapping a PHP enum to a VARCHAR or INT column is still a good solution. That is especially the case when you have to assume that the list of enum cases might change regularly. Also, mapping a MySQL ENUM to a PHP string can be a valid thing to do, especially if you need to operate on a database with a dynamic/unknown schema. Also, when you think about it: The ORM is already able to handle the PHP enum hydration, so for a first iteration, it might just be enough to have an EnumType that maps to PHP string. Regarding that article: It still is valid after all this time even though MySQL has gotten better. The article does not tell us to avoid MySQL ENUMs at all cost. It lists valid reasons for not using it. This means, after taking all of those points into account, we can still make a well-informed decision in favor of a MySQL ENUM.
Still valid! I've seen projects where enums are abused for things like "car manufacturer" or "department". And yes, that is data and changing data should not result in the need to altering your schema. If you need integrity checks on this, foreign key constraints got you covered.
Okay, now it might be expensive and it's not very transparent to the user under which circumstances this operation is expensive. We still need to take into consideration, how often the list of enum values might be changed in the future.
Still valid. 🤷🏻♂️
It is! But that's complexity we can bury inside our schema manager or platform classes.
Yes. If you only do it for storage optimization, use a
Valid! Change an enum that you've reused in five tables and you need to alter five tables. Yay.
Valid, but… 🤷🏻♂️
And here we are at the biggest concern that this type raises for a database abstraction layer. If we build a feature here, it must be portable! A new type that is exclusive to MySQL would be a no-go.
Yes. We're not here to lecture anyone. Using ENUMs in MySQL and friends can be fine. But it's important to me that it's not the natural equivalent to a PHP enum. |
side note: in doctrine ORM, backed PHP enums can already be used in entities (mentioned here in the doc). for the use case of using PHP enums in ORM entities, you don't need enums on database level. i comment this so people who look for enums in ORM don't get confused. supporting database enums on dbal level would have merit. it would allow to use PHP enums to define the database enum configuration, and would simplify the enum handling explained here in the doc. |
Doctrine DBAL is - as you pointed out yourself- not suited to support database level enums. It goes against a few architectural choices we made for So from my POV ENUM support as you outlined with comments is a no merge. |
@michnovka also, it feels that your wall of text comment is AI generated, so if it is, i find it disrespectful of you for our time to have us read so much text and expect us to reply to you in a structured way when you let a machine do the thinking for you. |
@derrabus I've given it some thought over night and did research into This idea came out of doctrine/migrations#1441 because Doctrine not recognizing I think I will focus on fixing the migration issue and overall improvement of https://github.com/Elao/PhpEnums which adds the Doctrine
Fixing migrations to compare actual SQL of old schema instead of generating an "expected" SQL based on deduced type will allow me to use custom ENUM type for my use-case and we will not bloat DBAL with MySQL specific code. @beberlei I did not generate the text with AI. I was trying to make a strong argument for ENUMs and did research, which I was compiling in Word, so unlike my other replies usually typed on phone it had some structure and grammar/syntax checks. I am sorry if it was too long, I wanted to make a comprehensive argument in one post. |
Thank you for your answer and for opening this discussion. I'm going to close the issue now. |
@michnovka, you mentioned using https://github.com/Elao/PhpEnums instead. But it also generate |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Feature Request
When DBAL came out first, there were no PHP enums. Now we have them, they are widely used and enum support in ORM has increased in past versions. However we still map them to varchar or other string types. Most databases support ENUM types, and the performance issues with ENUMs have been debunked (seriously, most of the criticism and hate for ENUM types stems from this article from 2011, and most of the claims are no longer valid). or fixed in the past years.
All supported SQL platforms support
ENUM
types (or alternative for SQLiteCHECK
) and for those that do not, a string could be leveraged same way it is now.I would like to add enum support natively into DBAL
Summary
This will be tricky mainly because unlike other types enums have not length, but sets of values. We will need to extend
Doctrine\DBAL\Schema\Column
and add?array $_enumOptions
field.Another issue is that the
EnumType
class should return different types inconvertToPHPValue
, though all will inherit fromBackedEnum
And most diffucult problem, which is one I stumbled on whilst working on doctrine/migrations#1441 , - from PHP point of view, every
EnumType
is a different class. If you haveColorEnumType
andTasteEnumType
they generateENUM('red','black')
andENUM('salty','sweet')
but DBAL has no idea how to map these to the rightEnumType
. It gets even worse if you have 2 PHP enum types, one forColorEnumType
andAnotherColorEnumType
both with just values red and black, at which point even if DBAL could guess the propery type based on theENUM
values, it will be confused here.The ONLY solution I see is to add back comments with typehint. At least for ENUMs. It makes sense here.
If we add comments back, I can even see a way to avoid touching the
Doctrine\DBAL\Schema\Column
and make it work.This is a complex task, I am willing to work on it, but before I start, I would like to have a consensus that this is the way we can go, and that adding comments back is something thats not a no-go. We can really use comment just for enums, so that the
EnumType
would know to look there, while it would be ignored for other types.The text was updated successfully, but these errors were encountered: