---
title: "Fixing a SQL database to have only one type of collation, a Craft CMS story"
date: 2024-05-15T10:01:00-04:00
author: John Morton
canonical_url: "https://supergeekery.com/blog/fixing-a-sql-file-to-have-only-one-type-of-collation"
section: Blog
---
# Fixing a SQL database to have only one type of collation, a Craft CMS story

*May 15, 2024* by John Morton

![Spiderman database collation](https://static.supergeekery.com/site-assets/spiderman-database-collation.jpg)

I'm in the process of updating a site from Craft CMS 4 to Craft CMS 5. As I read through the detailed [upgrade guide](https://craftcms.com/docs/5.x/upgrade.html), I noticed some database related items that deserved close attention. The character set and collation of the database will change during the upgrade. 

Before I did the upgrade, I discovered an issue with the collation type issue in my existing database. I had a mixture of two different collations and I wanted to correct that before I went through with the Craft update. Find yourself in the same boat? Let's dive in.

> Post publication note: While the contents of this post will allow you to fix the problem of having a mixed collation type, I only realized later that Craft CMS comes with a built-in CLI command that will make this simple. Check out the `craft db/convert-charset` CLI command. See this page in the docs: https://craftcms.com/docs/5.x/reference/cli.html#db-convert-charset. If you're not using Craft though, read on for more.

## Beware and be prepared

Before we go any farther, if you're thinking of making these changes your database as you read this post, I encourage you to work *locally* and have a *database backup*. [There be squalls ahead.](https://www.youtube.com/watch?v=UsceDpdSmRY) 🏴‍☠️

### Tools

Here are the tools I'm using for this process:

1. DDEV - the recommended local development tool for Craft CMS
2. TablePlus - a handy GUI for working with databases which has a very close integration with DDEV
3. A code editor, like PHPStorm, VSCode, or Sublime Text, that understands SQL and can color code it for easy visual inspection

## What is your current collation?

In the [Performing the Upgrade section](https://craftcms.com/docs/5.x/upgrade.html#performing-the-upgrade) of the guide, **step 6** jumped out at me. During the upgrade process, we need to specify the database charset and collation. If you've always followed Craft's defaults, you simple add the following to your `.env` file.

```
CRAFT_DB_CHARSET="utf8mb3"
CRAFT_DB_COLLATION="utf8mb3_general_ci"
```

Perhaps you're nodding in agreement as your eyes glaze over. *That's ok! * 😵‍💫

I wasn't sure that the database I was working with had *always* followed Craft's recommendation. The database was from a site I had inherited a while ago, so I needed to confirm what the state of my data was. TablePlus can help with that.

### Checking your database

With your site running locally in DDEV, it's easy to open up your database in TablePlus. On the command line, simply enter:

```
ddev tableplus
```

You should see the table used in your DDEV site appear TablePlus. There are two columns that we need to pay attention to, **charset** and **collation**. 

I was happy to see the that every table's charset in the database from my Craft 4 site said `utf8mb3`. 

Unfortunately, the collation values for the tables were a mix of `ut8mb3_general_ci` and `utfmb3_unicode_ci`. 
![2024 05 14 09 25 04](https://static.supergeekery.com/site-assets/2024-05-14_09-25-04.png)
*TablePlus showing different collations for tables in the same database*
Why would the database have two types of collations? I don't know, but my goal is to get them in sync. The site I'm working on is about ten years old and has has multiple developers over time. It also doesn't matter how the database ended up in this state, I just wanted to fix it.

Since the underlying charset for my tables is the same, `utf8mb3`, the data itself should be in good shape. Problems I've seen in the past with the wrong charset for a table would be with extended characters, like letter with ascents. Luckily, I'm not dealing with that issue here.

The issue I had was that there were two different collations, which can cause [problems](https://tableplus.com/blog/2019/08/illegal-mix-of-collations-mysql.html). For example, MySQL can throw an [Illegal mix of collations](https://stackoverflow.com/questions/1008287/illegal-mix-of-collations-mysql-error) error. 😵‍💫😵‍💫

### What are we talking about here?

But first, what's a collation? Think of the word collate. When you collate data, you're sorting it. Here's how [Wikipedia defines collation](https://en.wikipedia.org/wiki/Collation) generally.

> A collation algorithm such as the Unicode collation algorithm defines an order through the process of comparing two given character strings and deciding which should come before the other. When an order has been defined in this way, a sorting algorithm can be used to put a list of any number of items into that order.

That makes some sense. Collation has to do with the sorting of data. So what's the difference between `ut8mb3_general_ci` and `utfmb3_unicode_ci`? The best answer I found was from [StackOverflow](https://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci), which basically says:

* `utf8mb3_unicode_ci` provides more accurate sorting and comparison for a wide range of characters according to the Unicode standard.
* `utf8mb3_general_ci` is faster but less accurate in terms of sorting and comparison, especially for characters with diacritics, i.e., accents.

Faster seems better, right? Also, since `utf8mb3_general_ci` has been the Craft CMS recommendation all along, I'm sold. I want to get everything converted to `utf8mb3_general_ci`. The key here is that we do sorting consistently.

Based on my reading (see reference links at the end of this post), changing the collation should not be an issue. *(You did make a database backup, right?)*
![2024 05 14 15 31 30](https://static.supergeekery.com/site-assets/2024-05-14_15-31-30.png)
*How to use the TablePlus interface to run SQL*
## Using TablePlus to build your SQL

In my case, I had tables and columns both that needed a collation change.

I used two sets of SQL which each generates a set of SQL statements (see the `AS stmt` in the code) that I will run in a final step to do the actual migration. In the code below, be sure to change `database_name` to whatever you DDEV database name is. In my case, that's `db`. You can look at your `.env` file to find the database name in your project.

### Step 1: SQL to generate the migration statements for tables:

```
SELECT 
    CONCAT('ALTER TABLE `', TABLE_NAME, '` CONVERT TO CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci;') AS stmt
FROM 
    INFORMATION_SCHEMA.TABLES 
WHERE 
    TABLE_SCHEMA = 'database_name' 
    AND TABLE_COLLATION = 'utf8mb3_unicode_ci';
```

### Step 2: SQL to generation the migration statements for columns:

```
SELECT 
    CONCAT('ALTER TABLE `', TABLE_NAME, '` CHANGE `', COLUMN_NAME, '` `', COLUMN_NAME, '` ', COLUMN_TYPE, ' CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci;') AS stmt
FROM 
    INFORMATION_SCHEMA.COLUMNS 
WHERE 
    TABLE_SCHEMA = 'database_name' 
    AND COLLATION_NAME = 'utf8mb3_unicode_ci';
```

### Step 3: The migration

I ran Step 1 and Step 2 separately, but you should be able to do it in one go if you prefer. 

You should see many lines of code generated when you run these commands. They will contain rows of "ALTER TABLE" statements.

Copy the generated ALTER TABLE statements and paste the code into a new blank document in your code editor somewhere in your project. Name the file whatever you like, but make sure the extension is `.sql`, i.e. `my-database-migration-code.sql`. The extension is important because it will enable color formatting of the code and help you spot any potential problems.

On my first attempt, I did not save the generated statement into a file and closely inspect it. I simply ran the ALTER TABLE statements in TablePlus. The migration was *mostly* successful, but a few of the SQL command generated errors. Seeing the formatted code in a code editor revealed the problem. Here's a screenshot of one of the problematic lines in my code. 
![2024 05 14 15 46 40](https://static.supergeekery.com/site-assets/2024-05-14_15-46-40.png)
*The generated migration code had some lines wrapped in open and closing quotes.*
Each error I saw corresponded to a line in my migration that was wrapped in quotes. I removed the quotes and inspected each line for proper syntax. Each line looked good when I removed the quotes. I saved the result in a new sql file in my project. Next, I restored my database from backup and then 
pasted my newly updated migration code into a new SQL query tab and pressed "Run all" again. 

## A success migration

The fix worked and the migration went through without an error. I wanted to check results. Opening the migrated table in TablePlus showed that all my collation values were now the same. Next, I went through my site to confirm things looked good and didn't find any problems. I opened at least one entry for each section in my site and saw everything rendered properly. 

This is the copy of the database which will *replace* the production database. 

The rest of the migration from Craft 4 to Craft 5 went without a hitch. Once you get a site migrated from 4 to 5 though, be sure to finish your data migration as mentioned in the [Database Character Set and Collation](https://craftcms.com/docs/5.x/upgrade.html) section of the upgrade guide.

You'll need to update your `.env` file with a new charset and collation and run `php craft db/convert-charset` to finish your database migration. 

Good luck!


## Reference links:

* https://craftcms.com/docs/5.x/upgrade.html
* https://en.wikipedia.org/wiki/Collation
* https://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci
* https://dba.stackexchange.com/questions/255548/mysql-performance-implications-of-changing-a-column-collation-from-latin1-swedi
* https://sqlserverfast.com/blog/hugo/2019/01/collation-change-script/
* https://dev.mysql.com/doc/refman/8.0/en/charset.html
* https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html
* https://tableplus.com/blog/2019/08/illegal-mix-of-collations-mysql.html
* https://medium.com/@nilesh.patil.d/utf8-general-ci-vs-utf8-unicode-ci-what-should-we-use-cc01b58c00cc

---

**Tags:** craftcms

## Related Posts

- [Run your own uptime service: Uptime Kuma with Traefik on Laravel Forge](https://supergeekery.com/blog/uptime-kuma-with-traefik-on-laravel-forge)
- [Make Javascript module creation easier with Vite and automated GitHub Pages and npm publishing](https://supergeekery.com/blog/make-javascript-module-creation-easier-with-vite-and-automated-github-pages-and-npm-publishing)
- [Removing Craft CMS plugins that you can&#039;t uninstall](https://supergeekery.com/blog/removing-craft-cms-plugins-that-you-cant-uninstall)
