Fearless PostgreSQL Database Refactoring

The PostgreSQL relational database engine has been offering very powerful features for building databases or data warehouses for a long time now. It can help you build databases and data-warehouses that unlock the value of your data. Features such as foreign keys, constraints, views, generated columns and custom types ensure that you maintain order and correctness in that data.

Sometimes however, the strictness and correctness that PostgreSQL provides you may also hinder you in your wishes to change or refactor the database. When, for example, you have a view that depend on some tables and you want to alter a column type in the table, PostgreSQL will tell you that you can’t do that. This is because PostgreSQL knows the view depends on that data and wants to ensure that it does not break.

The solution is then to follow these steps:

  • Drop the view depending on the table you want to change
  • Alter the table by changing the column data type
  • Recreate the view

Once you know this, it is not too problematic. When the database schema grows and evolves however, it can become more and more complex due to a growing number of dependencies on some core tables, views and materialized views. A growing number of cascading drops and recreate statements can be required to make otherwise trivial changes. When this happens, you can become more and more hasitant to make changes that improve the value of your data. PostgreSQL helps you by pointing out what dependencies are in the way, or you can check the dependencies yourself and define the required drop- and create statements around the change that you want to make, but it becomes very elaborate.

Because at 1Optic we often have to deal with these types of changes, we have developed a small tool that helps us with this. You can find it on GitHub:
https://github.com/hendrikx-itc/pg-tools

You can provide the tool with an SQL alter-statement and it will try to execute it on the database. When PostgreSQL returns an error that states there is a dependency in the way, the dependency is recorded. The necessary drop and create statements are created for that dependency and the new sequence of SQL statements is tried. If it succeeds, you are done! If it fails with another dependency, it keeps on recording dependencies and adding drop and create statements until it succeeds. The tool uses transactions to prevent you from ending up with a half modified and broken database.

The resulting SQL script can be stored in your database migrations or just used in a one type fashion.

Most important is that with a tool like this, you don’t have to be afraid to do trivial or less trivial refactorings on your database schema. And this encourages to get more and more value out of your data.

Go download the tool if you feel it might help you, modify it to your liking (it’s open source), and let us know what you think of it!