Packaging for your database schema, schema migration.

Speaker: Iwan Vosloo

Track: Other

Type: Talk

Room: Main Hall

Time: Oct 05 (Thu): 11:00

Duration: 0:45

Python "distribution packages" are built into the wheels you can distribute via PyPI and install using pip, and in so doing provide an excellent way to re-use code written by others. Each such package contains code and a declaration of which other packages (and which versions of those packages) it depends on. This allows you to install a version of a package you wish to use and be assured that all its dependencies will also be installed, at their correct versions.

Python distribution packages thus make it easy to dependably re-use code written by others. But what about re-using code that relies on a database, and on a particular database schema?

There are tools that allow you to manage a database schema and to migrate that schema to keep up with changing code. However, we are not aware of tools that cater for the possibility that different authors could package code in different distribution packages, each package with its own declaration of what slice of database schema the code it contains needs. In this scenario one would have to be able to create the final database schema for a project from the combined set of all the packages in that project, with each package providing its own slice of the final schema. This should be possible even though the original authors of the individual packages did not know, at the time of authoring, which other slices of database schema their package would eventually share a database with.

Such packages can also depend on each other on a database schema level: the database schema of one package can have a foreign key referring to the primary key belonging to the schema of another package. On a database level this is similar to one package importing code from another upon which it depends.
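To make the idea concrete, here is a minimal sketch of assembling one schema from per-package slices. It is not Reahl's API: the `PACKAGES` mapping, its `requires` and `ddl` keys, the package names, and the functions `creation_order` and `create_schema` are all assumptions made up for illustration. It uses only the standard library (`graphlib` needs Python 3.9 or later) and an in-memory SQLite database.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical packages: each one declares the packages it depends on
# and the slice of database schema its own code needs.
PACKAGES = {
    "billing": {
        "requires": ["accounts"],
        "ddl": (
            "CREATE TABLE invoice ("
            "id INTEGER PRIMARY KEY, "
            "account_id INTEGER NOT NULL REFERENCES account(id))"
        ),
    },
    "accounts": {
        "requires": [],
        "ddl": (
            "CREATE TABLE account ("
            "id INTEGER PRIMARY KEY, "
            "email TEXT NOT NULL)"
        ),
    },
}


def creation_order(packages):
    """Return package names so that each package follows its dependencies."""
    graph = {name: set(spec["requires"]) for name, spec in packages.items()}
    return list(TopologicalSorter(graph).static_order())


def create_schema(connection, packages):
    """Create every package's schema slice, dependencies first."""
    for name in creation_order(packages):
        connection.execute(packages[name]["ddl"])
    connection.commit()


connection = sqlite3.connect(":memory:")
create_schema(connection, PACKAGES)
```

Neither package needs to know the full set of packages in the project. The ordering comes purely from the declared dependencies, in the same way that an installer resolves code dependencies.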

Code changes, and so does the database schema slice that goes with it. Hence a requirement of this idea is to be able to take a database that holds existing data in a schema created for a given set of packages, and migrate its schema and data to an upgraded set of such interdependent packages.
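The following sketch illustrates one simple way such a migration could be driven. It is an assumption-laden illustration, not the approach presented in the talk: the `schema_version` bookkeeping table, the `migrations` lists, the package names and the `migrate` function are all hypothetical. It also simplifies by running all pending steps of one package before moving on to its dependents, whereas a real solution may need to interleave steps across packages.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical packages: dependencies plus ordered (version, SQL) migration steps.
PACKAGES = {
    "accounts": {
        "requires": [],
        "migrations": [
            (1, "CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"),
            (2, "ALTER TABLE account ADD COLUMN display_name TEXT"),
        ],
    },
    "billing": {
        "requires": ["accounts"],
        "migrations": [
            (1, "CREATE TABLE invoice (id INTEGER PRIMARY KEY, "
                "account_id INTEGER NOT NULL REFERENCES account(id))"),
        ],
    },
}


def migrate(connection, packages):
    """Apply each package's pending steps, dependencies first.

    Returns the list of (package, version) steps that were applied.
    """
    # Assumed bookkeeping table recording the schema version per package.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS schema_version ("
        "package TEXT PRIMARY KEY, version INTEGER NOT NULL)"
    )
    graph = {name: set(spec["requires"]) for name, spec in packages.items()}
    applied = []
    for name in TopologicalSorter(graph).static_order():
        row = connection.execute(
            "SELECT version FROM schema_version WHERE package = ?", (name,)
        ).fetchone()
        current = row[0] if row else 0
        for version, sql in packages[name]["migrations"]:
            if version > current:
                connection.execute(sql)
                connection.execute(
                    "INSERT OR REPLACE INTO schema_version (package, version) "
                    "VALUES (?, ?)",
                    (name, version),
                )
                applied.append((name, version))
    connection.commit()
    return applied
```

Calling `migrate(connection, PACKAGES)` on an empty database builds the whole schema. Calling it again after upgrading a package applies only the steps that the database has not yet seen, leaving existing data in place.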

As a side-line to our experiments with Reahl we have pioneered such database-aware distribution packages as an extension of setuptools. Our "components" not only include database schema metadata, but also configuration and internationalisation information specific to a component.

This talk focuses on the database side of things and presents an overview of the problems involved and how we solved them.
