The "What" of MySQL ENUMs (as well as the "Why"?)
What is MySQL’s ENUM (or “enumeration”) data type, and does it have any practical use to you? Upon careful examination best practices would suggest its use is very limited and should probably be dismissed all together.
First, what is the purpose of this data type? The folks at MySQL wanted a quick way to provide a possible set of values for a field. This is usually accomplished with a “lookup” table and is a common part of denormalization. In an effort to speed up queries through reduced table joins, the ENUM type was created as a “quick” replacement. But as many developers have experienced, the quick fix is not usually the best. Additionally, giving people tools can lead to their abuse or a lack of knowing how to best use them.
If you’re going to use an ENUM, ideally its use should be limited to values that will never, EVER change. That’s quite a promise, since one never knows what the future has in store. Part of being a developer is to plan for the future in our designs and to make maintenance, modifications and enhancements as efficient as possible. The ENUM type does not lend itself to those goals.
Pros & Cons of ENUM
First, an examination of the benefits of using the field:
Faster than a table join and reduces complexity in the overall database design by moving a lookup table into another table.
Easier to see values when examining data in the database since they’re not foreign key ID’s.
And that’s about it for the benefits. On the other hand when it comes to examining disadvantages, you can’t reduce them to a short bullet point list.
MySQL Internal ENUM Handling – MySQL made ENUMs a lookup table within yet another table. Say that to yourself a couple of times and then ask if it sounds like a good idea. Internally, ENUMs values are assigned an index according to the order in which they were entered. What’s worse is that you can actually refer to these indexes in a query.
For example, if the ENUM values were “dog”, “cat”, “sheep” and “moose”, internally they would have the ID’s 1,2,3 and 4. So if you wanted to assign dog you “could” do it with:
UPDATE table SET enum_field = 2. Not that somebody would likely do that, but they could. And if they can, that’s not a good thing.
But the issue is really highlighted if your ENUMs are numeric. Imagine ENUM values of “0”,”1”,”2”,”3” and “4”. Value “0”would have an index of 1, “1”an index of 2, etc. Now imagine that you write a query:
UPDATE table SET enum_field = 1. The value is then “0” and that’s not good. You would have to remember to quote your numeric value to have the intended value correctly set.
Another related annoyance is how MySQL will sort the column. It does not sort by the values, but instead sorts by the internal index. So with the “dog, cat, . . .” example from above, if you sort they will be sorted exactly as they’re entered with dog first, cat second, etc.
ENUM Defaults – MySQL made it so that ENUMs always have a default of empty string that has an internal index of 0. So if you submit a query and try to set an ENUM field to a value that is not in the ENUM list, MySQL does a couple of bad things. First, it will not complain to you that the value you’ve submitted is not valid. And second, it will gladly write the row using the default value of the empty string. So a typo or bug in your program (or even if somebody changes one of the ENUM values in the table) could result in captured information being lost or not written to the database for quite some time before being discovered.
Portability – Portability is the notion that you can write code or use libraries that abstract the database layer. This allows you to move a system to another environment that may be using the MSSQL database. Since ENUMs are proprietary to MySQL, that is a problem. You’re going to have to do some database redesigning.
ENUM List Modification – What happens if you need to add, delete or modify an existing value in the ENUM list? First, if the table is large you’re going to have much difficulty since MySQL will need to rewrite the entire table to add the new value. Second, what if you make a mistake when altering the ENUM list and mistype an existing value or leave one out? MySQL will convert all existing rows with said value to use the default.
Associated Attributes or Variant Values – ENUMs can’t store associated data. Take for example storing marital state in an ENUM (“Mr.”, “Mrs.” or “Ms.”). What if you want additional details about each of those values like the full spelling versus the abbreviation? Or what if down the line you wanted to track when somebody’s marital status changed?
Yet another example would be a state or province table with a country ENUM. What if you wanted extra details about the countries such as capital, continent, square kilometers, etc. With either example if the values had instead been stored in a lookup table, another column could easily be added to store such attributes and variants.
When can ENUMs be Used?
There are some simple ENUM examples that others might suggest are okay to use. Say you have a field that should store “yes”or “no”. The database designer could set up a field with ENUM values of “yes” and “no” or “Y” and “N”. This is a bool (aka “Boolean Value”) and should simply be a TINYINT with values of 1 or 0. But don’t rely on the database if you want to make it “pretty” for presentation. If you must have it returned as ‘yes’ or ‘no’ then transform it in your query:
IF(bool_field = 1, ‘yes’, ‘no’).
What about something as simple as the gender designations male and female? That’s concrete and you could never imagine it changing, correct? However, issues similar to the marital status pop up such as an abbreviation or other forms like “man” or “woman” or the plurals of “men” and “women”. Yes those can be handled in code, but consistency is a good thing. You don’t want different parts of the program setting these values. Instead, a unified set of abbreviations and variants should be used throughout.
These “workable” examples are quite simple just like the “Mr.”, “Mrs.” or “Ms.” example and still they can suffer from problems. Given the advantages and disadvantages, it appears that using ENUMs is generally never a good idea.
Alternatives to MySQL ENUMs
If you’re disturbed by the number of tables in a database and want to avoid lookup tables, one alternative is to implement your own version of ENUMs with a single lookup table. Create a table “enum” with fields “id”, “namespace” and “value”. You may want to adjust those example names to avoid MySQL keyword collisions or make sure to use backticks (I prefer non-collision naming).
Note that you do not want to use “table” instead of “namespace” because you may have more than one table that would use the same enum. Within the table you would have the enum field of “marital_status” for example and then within the lookup table you would have the list of possible values. For the namespace name you might use the field name as the namespace. This does not address associated values or variants. If you start getting into that requirement, you should create a separate dedicated lookup table in such cases.
The bottom line is that despite MySQL’s ENUMs being workable in limited situations, they are not practical. And based on their many disadvantages, should be avoided all together since other more advantageous alternatives exist.
Written by Project Manager & Technical Analyst Brian Miller with minor editing by Digital Media Coordinator Casey.