I’m using MySQL 8.0 CE. I have an attribute where I want to store a TRUE or FALSE status and I want it to use as little space as possible.
After reading the answers in this question:
MySQL: Smallest datatype for one bit and the MySQL documentation Bit-Value Type – BIT and Integer Types (Exact Value), I understand that at storage level it is better is to use BIT(1) because BOOL is actually a TINYINT(1) and therefore uses full 1byte.
At storage level it is clear that BIT(1) is the best option but, at performance level when searching for true or false?
If I understand correctly BIT would store 1 or 0 while BOOL stores TRUE or FALSE.
That difference makes that when searching between both possibilities one of the types is better optimized for it?
Thanks.
3
Answers
BIT(1)
also requires minimum 1 byte, so you’re not saving any space compared toBOOL
/TINYINT
. Both take 1 byte.Speaking to MySQL developers, they usually wince when I bring up the
BIT
data type. It’s full of known bugs, and likely undiscovered bugs. The internal code is poorly understood. They told me to just useTINYINT
.By the way, MySQL doesn’t have a true
BOOL
type.BOOL
is just an alias forTINYINT(1)
, and there is no true or false value. The "false" value is literally the integer 0, and the "true" value is the integer 1. In other words, you canSUM()
a column that is supposedly boolean, and you get an integer sum equal to the number of rows where the column is "true." This is not compliant with standard SQL (it makes no sense to SUM() a boolean column), but it’s the wayBOOL
is implemented in MySQL.Consideing that in mysql boolean is the same as tinyint(1). That said, boolean always uses 1 byte per column but bit(n) will use as few bytes that are needed to hold the given number of bits. BIT save some space however i would use boolean because it makes things simpler at the moment you want to query a database. in terms you can have values other than 0 or 1 if you are not careful. To avoid this, you can use the aliases TRUE and FALSE.
(In addition to what Bill says…)
I have a Rule Of Thumb: "If a back-of-envelope estimate doesn’t suggest at least a 10% improvement, move on to some other optimization". Couple that with the fact that even if a single-bit bool would shrink the row length by 7 bits, that would probably be less than 1% savings. So, I move on.
OTOH, If you have lots of booleans, then consider the
SET
datatype, which can handle up to 64 ‘booleans’ in 8 bytes or less. But it’s rather clumsy. So is a similar thing with the 64 bits ofBIGINT UNSIGNED
, together with shifting and boolean operators (<<
,&
, etc.)If you need to
INDEX
one or several booleans, forget it. About the only case where indexing works is in a composite (multi-column) index where one of the columns is [effectively]TINYINT NOT NULL