MySQL Data Types
Once you have identified all of the tables and columns that the database will need, you should determine each column's data type. When creating the database, as you will do at the end of this chapter, MySQL requires that you define what sort of information each field will contain. There are three primary categories, as is true for almost every database application:
Dates and times
Within each of these, there are a number of variantssome of which are MySQL-specificyou can use. Choosing your column types correctly not only dictates what information can be stored and how but also affects the database's overall performance. Many of the types can take an optional Length attribute, limiting their size (in the corresponding tables, the square brackets, , indicate an optional parameter to be put in parentheses, while parentheses themselves indicate required arguments).
Table 4.1 lists the numeric types. The biggest distinction is between integer types and real numbers (which contain decimal points). After that it's a matter of the range of possible values (for integers) or what level of precision is necessary (for real numbers).
Table 4.1. Here are most of the available numeric column types for use with MySQL databases. For FLOAT, DOUBLE, and DECIMAL, the Length argument is the maximum total number of digits and the Decimals argument dictates the number of that total to be found after the decimal point. (As of MySQL 5.0.3, the size of DECIMAL column is based upon a formula.) MySQL Numeric Types
MySQL Numeric Types
Range of 128 to 127 or 0 to 255 unsigned.
Range of 32,768 to 32,767 or 0 to 65535 unsigned.
Range of 8,388,608 to 8,388,607 or 0 to 16,777,215 unsigned.
Range of 2,147,483,648 to 2,147,483,647 or 0 to 4,294,967,295 unsigned.
Range of 9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 or 0 to 18,446,744,073,709,551,615 unsigned.
A small number with a floating decimal point.
A large number with a floating decimal point.
Length + 1 or Length + 2 bytes
A DOUBLE with a fixed decimal point.
Table 4.2 lists the text types. Mostly these differ in size, but several allow for storing binary data instead of strings of characters. There are also two extensions of the text types that result in a different behaviorENUM and SETwhich allow you to define a series of acceptable values when creating the table. An ENUM field can have only one of a possible several thousand values, while SET allows for several of up to 64 possible values. There are two caveats with ENUM and SET: These types are not supported by other databases, and their usage undermines normalization.
Table 4.2. Here are the most common column types for storing text in a MySQL database. MySQL Text Types
MySQL Text Types
A fixed-length field from 0 to 255 characters long.
String length + 1 or 2 bytes
A fixed-length field from 0 to 255 characters long (65,535 characters long as of MySQL 5.0.3).
String length + 1 bytes
A string with a maximum length of 255 characters.
String length + 2 bytes
A string with a maximum length of 65,535 characters.
String length + 3 bytes
A string with a maximum length of 16,777,215 characters.
String length + 4 bytes
A string with a maximum length of 4,294,967,295 characters.
Similar to CHAR but stores binary data.
Data length + 1 bytes
Similar to VARCHAR but stores binary data.
Data length + 1 bytes
Stores binary data with a maximum length of 255 bytes.
Data length + 2 bytes
Stores binary data with a maximum length of 65,535 bytes.
Data length + 3 bytes
Stores binary data with a maximum length of 16,777,215 bytes.
Data length + 4 bytes
Stores binary data with a maximum length of 4,294,967,295 bytes.
1 or 2 bytes
Short for enumeration, which means that each column can have one of several possible values.
1, 2, 3, 4, or 8 bytes
Like ENUM except that each column can have more than one of several possible values.
The various date and time types (as listed in Table 4.3) have all sorts of unique behaviors, which are documented in the manual and will be discussed throughout the book. You'll primarily use these types without modification, so you need not worry too much about their intricacies.
Table 4.3. These are the available date and time column types for MySQL.
MySQL Date and Time Types
In the format of YYYY-MM-DD.
In the format of YYYY-MM-DD HH:MM:SS.
In the format of YYYYMMDDHHMMSS; acceptable range ends in the year 2037.
In the format of HH:MM:SS.
In the format of YYYY, with a range from 1901 to 2155. Creating a MySQL Database
To choose your data types:
Identify whether a column should be a text, number, or date type.
This is normally an easy and obvious step. If a column will only ever store numeric data, use a number type. If it may contain anything that's not numeric, you'll need to use text.
You will find that numbers such as dollar amounts could be text fields if you include their corresponding punctuation (dollar signs, commas, and hyphens), but you'll get better results if you store them as numbers and address the formatting elsewhere.
Choose the most appropriate subtype for each column.
For improved performance, keep in mind two considerations:
Fixed-length fields (such as CHAR) are generally faster than variable-length fields (such as VARCHAR), but they also take up more disk space. See the sidebar for more information.
The size of any field should be restricted to the smallest possible value, based upon what the largest possible input could be. For example, if the longest product name you might store is 20 characters long, then you would set the column as a VARCHAR(20).
There is a catch, though: if you insert a string five characters long into a CHAR(2) field, the final three characters will be truncated. This is true for any field: if you exceed the column's range, some data will be lost.
With numbers, you'll need to decide whether or not to store a decimal. This breaks your decision into the integer or real number areas. If mathematical precision is important with a real number, use the DECIMAL
type, which is more accurate than either FLOAT
Set the maximum length for text and number columns (Table 4.4
Rather than going over how I defined all 22 columns and why, I've listed the properties I came up with in Table 4.4. Different developers have different preferences, but the most important factor is to tailor each setting to the information at hand rather than using generic (and inefficient) TEXT
types at all times.
Table 4.4. Another aspect of database design is defining the optimal type for each field.
Accounting Data Types
Date Invoice Paid
Client Street Address
Contact Email Address
Expense Category ID
Expense Category ID
Many of the data types have synonymous names: INT and INTEGER, DEC and DECIMAL, etc.
The TIMESTAMP field is automatically set when an INSERT or UPDATE occurs, even if no value is specified for the field. Depending upon your version of MySQL, there are all sorts of other behaviors for TIMESTAMP columns.
The BLOB type is a variant on TEXT that allows for storing binary files in a table. I'll demonstrate this in action in Chapter 12, "Techniques for Programming."
An added benefit of the date and time data types is that MySQL will validate that these are real, possible values, when inserting records. Prior to MySQL 5.0.2, this just meant basic validation: no 13 or higher for months, no 32 or higher for days. Since MySQL 5.0.2, this also means the date must exist, so 2006-02-31 would be an invalid date.
The size required to store any variable-length string of text will also depend upon the character set being use. For example, accented characters or those in non-English languages may require more space.
There is some debate as to the superiority of these two similar types. Both store strings and can be set with a fixed maximum length. One primary difference is that anything stored as a CHAR will always be stored as a string the length of the column (using spaces to pad it). Conversely, VARCHAR strings will be only as long as the stored string.
The two implications of this are
VARCHAR columns tend to take up less disk space.
Unless you are using the InnoDB storage engine (see "Choosing a Storage Engine" later in this chapter), CHAR columns are faster to access than VARCHAR.
Granted, the speed and disk space differences between the two types may be imperceptible in most cases. And you'll also find that you'll define a column using one type but MySQL may automatically use the other type instead, strange as that may seem.
There is also a third, minor difference between these two, prior to MySQL 5.0.3: MySQL trims off extra spaces from CHAR columns when data is retrieved and from VARCHAR when it's inserted. VARCHAR columns no longer have this property as of MySQL 5.0.3, meaning that any extra spaces are kept.
As a rule of thumb, use VARCHAR unless you have a field that will always, or almost always, be the exact same length. This may be a state abbreviation or a product identifier (SD123, PA456, etc.).