Migrations

Overview

With MigrateDB all changes to the database are called migrations. Migrations can be either versioned or repeatable. Versioned migrations come in 2 forms: regular and baseline.

Versioned migrations have a version, a description and a checksum. The version must be unique. The description is purely informative for you to be able to remember what each migration does. The checksum is there to detect accidental changes. Versioned migrations are the most common type of migration. They are applied in order exactly once.

Repeatable migrations have a description and a checksum, but no version. Instead of being run just once, they are (re-)applied every time their checksum changes.

Within a single migration run, repeatable migrations are always applied last, after all pending versioned migrations have been executed. Repeatable migrations are applied in the order of their description.

By default both versioned and repeatable migrations can be written either in SQL or in Java and can consist of multiple statements.

MigrateDB automatically discovers migrations on the filesystem and on the Java classpath.

To keep track of which migrations have already been applied when and by whom, MigrateDB adds a schema history table to your schema.

Versioned Migrations

The most common type of migration is a versioned migration. Each versioned migration has a version, a * description* and a checksum. The version must be unique. The description is purely informative for you to be able to remember what each migration does. The checksum is there to detect accidental changes. Versioned migrations are applied in order exactly once.

Versioned migrations are typically used for:

  • Creating/altering/dropping tables/indexes/foreign keys/enums/UDTs/…
  • Reference data updates
  • User data corrections

Here is a small example:

CREATE TABLE car (
    id INT NOT NULL PRIMARY KEY,
    license_plate VARCHAR NOT NULL,
    color VARCHAR NOT NULL
);

ALTER TABLE owner ADD driver_license_id VARCHAR;

INSERT INTO brand (name) VALUES ('DeLorean');

Each versioned migration must be assigned a unique version. Any version is valid as long as it conforms to the usual dotted notation. For most cases a simple increasing integer should be all you need. However MigrateDB is quite flexible and all these versions are valid versioned migration versions:

  • 1
  • 001
  • 5.2
  • 1.2.3.4.5.6.7.8.9
  • 205.68
  • 20130115113556
  • 2013.1.15.11.35.56
  • 2013.01.15.11.35.56

Versioned migrations are applied in the order of their versions. Versions are sorted numerically as you would normally expect.

No Undo Migrations?

While the idea of undo migrations may seem nice, unfortunately it breaks down in practice. As soon as you have destructive changes (drop, delete, truncate, …), you start getting into trouble. And even if you don’t, you end up creating home-made alternatives for restoring backups, which need to be properly tested as well.

Undo migrations assume the whole migration succeeded and should now be undone. This does not help with failed versioned migrations on databases without DDL transactions. Why? A migration can fail at any point. If you have 10 statements, it is possible for the 1st, the 5th, the 7th or the 10th to fail. There is simply no way to know in advance. In contrast, undo migrations are written to undo an entire versioned migration and will not help under such conditions.

An alternative approach which we find preferable is to maintain backwards compatibility between the DB and all versions of the code currently deployed in production. This way a failed migration is not a disaster. The old version of the application is still compatible with the DB, so you can simply roll back the application code, investigate, and take corrective measures.

This should be complemented with a proper, well tested, backup and restore strategy. It is independent of the database structure, and once it is tested and proven to work, no migration script can break it. For optimal performance, and if your infrastructure supports this, we recommend using the snapshot technology of your underlying storage solution.

Repeatable Migrations

Repeatable migrations have a description and a checksum, but no version. Instead of being run just once, they are (re-)applied every time their checksum changes.

This is very useful for managing database objects whose definition can then simply be maintained in a single file in version control. They are typically used for

  • (Re-)creating views/procedures/functions/packages/…
  • Bulk reference data reinserts

Within a single migration run, repeatable migrations are always applied last, after all pending versioned migrations have been executed. Repeatable migrations are applied in the order of their description.

It is your responsibility to ensure the same repeatable migration can be applied multiple times. This usually involves making use of CREATE OR REPLACE clauses in your DDL statements.

Here is an example of what a repeatable migration looks like:

CREATE OR REPLACE VIEW blue_cars AS
    SELECT id, license_plate FROM cars WHERE color='blue';

SQL-based migrations

Migrations are most commonly written in SQL. This makes it easy to get started and leverage any existing scripts, tools and skills. It gives you access to the full set of capabilities of your database and eliminates the need to understand any intermediate translation layer.

SQL-based migrations are typically used for

  • DDL changes (CREATE/ALTER/DROP statements for TABLES,VIEWS,TRIGGERS,SEQUENCES,…)
  • Simple reference data changes (CRUD in reference data tables)
  • Simple bulk data changes (CRUD in regular data tables)

Naming

In order to be picked up by MigrateDB, SQL migrations must comply with the following naming pattern:

Versioned Migrations

Prefix  Separator       Suffix
                      
   V2__Add_new_table.sql
              
 Version    Description

Repeatable Migrations

Prefix Separator       Suffix
                     
    R__Add_new_table.sql
               
           Description

The file name consists of the following parts:

  • Prefix: V for versioned (configurable), B for baseline (configurable), and R for repeatable migrations (configurable)
  • Version: Version with dots or underscores separate as many parts as you like (Not for repeatable migrations)
  • Separator: __ (two underscores) (configurable)
  • Description: Underscores or spaces separate the words
  • Suffix: .sql (configurable)

Optionally versioned SQL migrations can also omit both the separator and the description.

The configuration option validateMigrationNaming determines how MigrateDB handles files that do not correspond with the naming pattern when carrying out a migration: if false then MigrateDB will simply ignore all such files, if true then MigrateDB will fail fast and list all files which need to be corrected.

Discovery

MigrateDB discovers SQL-based migrations from one or more directories referenced by the locations property.

  • Unprefixed locations or locations with the classpath: prefix target the Java classpath.
  • Locations with the filesystem: prefix search the file system.

New SQL-based migrations on the file system are discovered automatically through scanning at runtime. MigrateDB will automatically pick up any new SQL migrations as long as they conform to the configured naming convention. This scanning is recursive. All migrations in non-hidden directories below the specified ones are also picked up.

Migrations on the class path are discovered during build-time by the respective Maven or Gradle plugin.

Syntax

MigrateDB supports all regular SQL syntax elements including:

  • Single- or multi-line statements
  • Single- (–) or Multi-line (/* */) comments spanning complete lines
  • Database-specific SQL syntax extensions (PL/SQL, T-SQL, …) typically used to define stored procedures, packages, …

Additionally in the case of Oracle, MigrateDB also supports SQL*Plus commands.

Placeholder Replacement

In addition to regular SQL syntax, MigrateDB also supports placeholder replacement with configurable pre- and suffixes. By default it looks for Ant-style placeholders like ${myplaceholder}. This can be very useful to abstract differences between environments.

See Placeholders.

Java-based migrations

Java-based migrations are a great fit for all changes that can not easily be expressed using SQL.

These would typically be things like

  • BLOB & CLOB changes
  • Advanced bulk data changes (Recalculations, advanced format changes, …)

Naming

In order to be picked up by MigrateDB, Java-based Migrations must implement the JavaMigration interface. Most users however should inherit from the convenience class BaseJavaMigration instead as it encourages MigrateDB’s default naming convention, enabling MigrateDB to automatically extract the version and the description from the class name. To be able to do so, the class name must comply with the following naming pattern:

Versioned Migrations

Prefix  Separator
      
   V2__Add_new_table
              
 Version    Description

Repeatable Migrations

Prefix Separator
     
    R__Add_new_table
               
           Description

The class name consists of the following parts:

  • Prefix: V for versioned migrations, B for baseline migrations, R for repeatable migrations
  • Version: Underscores (automatically replaced by dots at runtime) separate as many parts as you like (Not for repeatable migrations)
  • Separator: __ (two underscores)
  • Description: Underscores (automatically replaced by spaces at runtime) separate the words

If you need more control over the class name, you can override the default convention by implementing the JavaMigration interface directly.

This will allow you to name your class as you wish. Version, description and migration category are provided by implementing the respective methods.

Discovery

MigrateDB discovers Java-based migrations on the Java classpath during build-time and places index files in the package referenced by the locations property.

Checksums and Validation

Unlike SQL migrations, Java migrations by default do not have a checksum and therefore do not participate in the change detection of MigrateDB’s validation. This can be remedied by implementing the getChecksum() method, which you can then use to provide your own checksum, which will then be stored and validated for changes.

Sample Class

package db.migration;

import migratedb.1.core.api.migration.BaseJavaMigration;
import migratedb.1.core.api.migration.Context;

import java.sql.PreparedStatement;

/**
 * Example of a Java-based migration.
 */
public class V1_2__Another_user extends BaseJavaMigration {
    public void migrate(Context context) throws Exception {
        try (PreparedStatement statement =
                 context
                     .getConnection()
                     .prepareStatement("INSERT INTO test_user (name) VALUES ('Obelix')")) {
            statement.execute();
        }
    }
}

Take care that your Java migration does not close the database connection, either explicitly or as a result of a try-with-resources statement.

Spring

If your application already uses Spring and you do not want to use JDBC directly you can easily use Spring JDBC’s JdbcTemplate instead:

package db.migration;

import migratedb.1.core.api.migration.BaseJavaMigration;
import migratedb.1.core.api.migration.Context;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.datasource.SingleConnectionDataSource;

/**
 * Example of a Java-based migration using Spring JDBC.
 */
public class V1_2__Another_user extends BaseJavaMigration {
    public void migrate(Context context) {
        new JdbcTemplate(new SingleConnectionDataSource(context.getConnection(), true))
            .execute("INSERT INTO test_user (name) VALUES ('Obelix')");
    }
}

Transactions

By default, MigrateDB always wraps the execution of an entire migration within a single transaction.

Alternatively you can also configure MigrateDB to wrap the entire execution of all migrations of a single migration run within a single transaction by setting the group property to true.

If MigrateDB detects that a specific statement cannot be run within a transaction due to technical limitations of your database, it won’t run that migration within a transaction. Instead it will be marked as non-transactional.

By default transactional and non-transactional statements cannot be mixed within a migration run. You can however allow this by setting the mixed property to true. Note that this is only applicable for PostgreSQL, Aurora PostgreSQL, SQL Server and SQLite which all have statements that do not run at all within a transaction. This is not to be confused with implicit transaction, as they occur in MySQL or Oracle, where even though a DDL statement was run within a transaction, the database will issue an implicit commit before and after its execution.

Manual override

If necessary, you can manually determine whether or not to execute a migration in a transaction. This is useful for databases like PostgreSQL and SQL Server where certain statements can only execute outside a transaction.

For Java migrations, the JavaMigration interface has a method canExecuteInTransaction. This determines whether the execution should take place inside a transaction. You can rely on BaseJavaMigration’s default behavior to return true or override canExecuteInTransaction to execute certain migrations outside a transaction by returning false.

For SQL migrations, you can specify the script configuration property executeInTransaction. See Script config.

Important Note

If your database cleanly supports DDL statements within a transaction, failed migrations will always be rolled back (unless they were marked as non-transactional).

If on the other hand your database does NOT cleanly supports DDL statements within a transaction (by for example issuing an implicit commit before and after every DDL statement), MigrateDB won’t be able to perform a clean rollback in case of failure and will instead mark the migration as failed, indicating that some manual cleanup may be required. You may also need to run repair to remove the failed migration entry from the schema history table.

Query Results

Migrations are primarily meant to be executed as part of release and deployment automation processes and there is rarely the need to visually inspect the result of SQL queries.

There are however some scenarios where such manual inspection makes sense, and therefore MigrateDB will display query results in the usual tabular form when a SELECT statement (or any other statement that returns results) is executed.

Toggling query results

To prevent MigrateDB from displaying query results, set the configuration option outputQueryResults to false.

Schema History Table

To keep track of which migrations have already been applied when and by whom, MigrateDB adds a special schema history table to your schema. You can think of this table as a complete audit trail of all changes performed against the schema. It also tracks migration checksums and whether or not the migrations were successful.

Read more about this in our getting started guide on how MigrateDB works.

Schema creation

By default, MigrateDB will attempt to create the schemas provided by the schemas and defaultSchema configuration options. This behavior can be toggled with the createSchemas configuration option.

This might be useful when you want complete control over how schemas are created.

The createSchemas option and the Schema History Table

MigrateDB requires a schema for the schema history table to reside in before running a migration. When createSchemas is false, it will be impossible for the schema history table to be created, unless a schema already exists for it to reside in.

So, given a configuration like this:

migratedb.createSchemas=false
migratedb.schemas=my_schema

The following can happen if createSchemas is false:

  • Run migrate
  • my_schema is not created by MigrateDB
  • Because my_schema is the default schema, MigrateDB attempts to create the schema history table in my_schema
  • my_schema does not exist, so the operation fails

Therefore, when toggling createSchemas to false, the following setup is recommended:

  • Set the default schema to migratedb_history_schema
    • Either by setting defaultSchema, or placing it first in the schemas configuration option
  • Set initSql to create migratedb_history_schema if it doesn’t exist
  • Place your other schemas in the schemas property

So, given a configuration like this:

migratedb.createSchemas=false
migratedb.initSql=CREATE SCHEMA IF NOT EXISTS migratedb_history_schema
migratedb.schemas=migratedb_history_schema,my_schema

The following will happen:

  • Run migrate
  • initSql is executed, so migratedb_history_schema is created
  • Because migratedb_history_schema is the default schema, MigrateDB attempts to create the schema history table in migratedb_history_schema
  • my_schema is not created by MigrateDB
  • Migrations run as normal
  • Migrations are free to control creation of my_schema

Migration States

State Description
Pending This migration has not been applied yet
Success This migration succeeded
Ignored This migration will not be considered when running migrate
   
Deleted This is a migration that has been marked as deleted by repair
   
Above Target This migration has not been applied yet and won’t be applied because target is set to a lower version
Baseline This migration has baselined this DB
Below Baseline This migration was not applied against this DB because the schema history table was baselined with a higher version
Missing This migration succeeded and could not be resolved
Failed (Missing) This migration failed and could not be resolved
Failed This migration failed
Failed (Future) This migration failed and its version is higher than the schema history table’s current version
   
Future This migration succeeded and its version is higher than the schema history table’s current version
Out of Order This migration succeeded but it was applied out of order. Rerunning the entire migration history might produce different results!
Outdated This is a repeatable migration that is outdated and should be re-applied
Superseded This is a repeatable migration that is outdated and has already been superseded by a newer one

Callbacks ➡️