
GemStone schema migration

These notes apply to the OpenSkills MMS and SkillsBase systems.

The "database" in a Smalltalk system is the collection of "model objects" that represent the persistent information. Each model object can be thought of as the equivalent of a row in a table in a relational database. For example, in the MMS there are account objects and payment objects, and in the SkillsBase there are skill and engagement objects.

The state and behavior of an object are defined by its class, much as the state held in a relational database table is defined using a DDL create table statement. The schema of a relational database is defined by the collection of DDL that creates all the tables (etc) and the schema of an object database is defined by the classes that make up the model.

As the MMS and SkillsBase are extended over time the data model will change and the classes that implement the model must change with it. A change may mean adding or removing an instance variable or renaming the class. In some cases information that was held in one object is moved out into many smaller objects.

Schema migration is the process of updating the model objects in line with the latest version of their respective classes.

What Classes?

All of the classes that form the object model in the OpenSkills systems are held in a single SymbolDictionary in GemStone. In the SkillsBase system the model SymbolDictionary is called "OSSkillsBaseModel".

The following code fragment returns the model classes for the SkillsBase, sorted by class name (note that not everything in the SymbolDictionary is a class):

(OSSkillsBaseModel select: [:aValue| aValue isClass])
		asSortedCollection: [:x :y | x name < y name]

When migrating from an old schema to a new schema we run the above script against the old SymbolDictionary to get the list of classes we are migrating from. If we want to retain the data objects for a class (and we nearly always do) we need to say what class in the new schema will take the place of the old class. We use the above code in a script that generates a code fragment which maps old classes to new classes by name:

|targetStream classes|
targetStream := WriteStream on: String new.
targetStream nextPutAll: '^(OrderedCollection new: 20)'; cr.
classes := (OSSkillsBaseModel select: [:aValue| aValue isClass])
		asSortedCollection: [:x :y | x name < y name].
classes do: [:aClass | 
	targetStream nextPutAll: '	add: #';
	nextPutAll: aClass name;
	nextPutAll: ' -> #';
	nextPutAll: aClass name;
	nextPutAll: ';';
	cr].
targetStream nextPutAll: '	yourself'; cr.
targetStream contents

The resulting generated code looks something like this example from the SkillsBase:

^(OrderedCollection new)
	add: #OSSBAbstractSkill -> #OSSBAbstractSkill;
	add: #OSSBAvailability -> #OSSBAvailability;
	add: #OSSBCommendation -> #OSSBCommendation;
	...
	yourself

This code just creates a collection of associations mapping the old class name to the same class name. It is a manual process to update the names on the right to be the new names where appropriate. In practice class names mostly stay the same.
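
As a sketch of that manual step, suppose one class had been renamed in the new schema. The name OSSBEndorsement below is hypothetical, invented purely for illustration; only the right-hand side of the affected association changes:

^(OrderedCollection new)
	add: #OSSBAbstractSkill -> #OSSBAbstractSkill;
	add: #OSSBAvailability -> #OSSBAvailability;
	add: #OSSBCommendation -> #OSSBEndorsement; "hypothetical rename: old name on the left, new name on the right"
	yourself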

Class Version

In GemStone every class is part of a class history. Objects being migrated can only migrate to another class in the same class history.

The code that does the migration accepts the collection of associations created above and adds the new class to the version history of the old class, making the new class the most recent in the version history. The method that does this is >>addNewVersion:, so it looks like:

	oldClass addNewVersion: newClass

So we have the names of the old and new classes, but we need the classes themselves to make the new class the most recent in the version history of the old class. When the migration code is run we log in to the GemStone account which contains the new classes, and we have a reference to the SymbolDictionary holding the old classes.

Migrate Instances

Now that we know which classes need to be migrated and which classes they are to be migrated to, it's just a matter of doing the migration.

The word "migration" here applies to what the instances of the classes must do, not what the classes must do. Every object knows its class. Before the migration all the model objects claim that their class is one of the old ones. The act of migrating an object is telling it that it is now an instance of a different class, the new class in the schema.

Here is the code that we use to do the migration:

	| sourceClass targetClass sourceClassHistory oldClasses |
	sourceClass := self sourceSymbolDictionary at: sourceClassName.
	targetClass := System myUserProfile symbolList objectNamed: targetClassName.
	sourceClass addNewVersion: targetClass.
	sourceClassHistory := sourceClass classHistory.
	oldClasses := sourceClassHistory copyFrom: 1
				to: sourceClassHistory size - 1.
	oldClasses 
		do: [:anOldClass | anOldClass migrateInstancesTo: targetClass].

Here the old class is called sourceClass and the new class is called the targetClass. Notice that the targetClass is added to the version history of sourceClass, and that *all* of the previous versions of the class are asked to migrate their instances to the targetClass. This causes each model object which is an instance of an old class to be told that it is now an instance of the new class, targetClass.

That's it. After this method has been called for each old class / new class association, the entire model (and thus all data in the database) has been migrated to the new schema.
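
The loop over the associations might be sketched as follows. This is an illustration only: #classNameMappings (answering the collection of associations generated earlier) and #migrateClassNamed:to: (wrapping the migration code shown above) are hypothetical selectors, and committing once at the end is an assumption about how the transaction is handled.

	"Hypothetical driver: migrate each old class / new class pair in turn."
	self classNameMappings do: [:anAssociation |
		"key is the old class name, value is the new class name"
		self
			migrateClassNamed: anAssociation key
			to: anAssociation value].
	"Make the migrated objects permanent in the repository."
	System commitTransaction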

... but we may have lost some data in the process. Read on.

Migrating instance variables unchanged

By default, when an object is migrated from one class to another, all instance variables are copied by name. So if the old class has instance variables a, b and c and the new class has the same instance variable names then no data will be lost. If an instance variable name has changed, however, we must take steps to ensure that we don't lose the data.

If the name of an instance variable has changed, or if an instance variable is being removed, we can use the following class side method:

instVarMappingTo: anotherClass 
	^self 
		instVarMapFrom: anotherClass
		to: self
		withExplicitMappings: (Array
			with: #oldName1 -> #newName1
			with: #oldName2 -> #newName2
			...)

The method returns an array of integers with one integer for each instance variable in the target class. Note that instance variable names are not used in an instVarMap, rather the index of the instance variable in the class definition is used (horrible).

Examples:

Note that in the above method we have added the ability to give explicit mappings using instance variable names. Much clearer. Mappings that are not explicitly defined in this way use the default, e.g. 1 -> 1, 2 -> 2 etc.
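
To make the index-based map more concrete, here is a minimal sketch in plain Smalltalk. It is not the GemStone implementation. It assumes that each element of the map is the position, in the old class, of the variable that supplies the corresponding variable of the new class, and that 0 means there is no source. The variable names a, b, c and d are made up for the illustration.

	| oldNames newNames map |
	"Instance variables of the old class, in definition order."
	oldNames := #(#a #b #c).
	"Instance variables of the new class: b has been removed and d added."
	newNames := #(#a #c #d).
	"For each new variable, find the index of the same-named old variable (0 if absent)."
	map := newNames collect: [:each | oldNames indexOf: each].
	map

Under these assumptions the map is #(1 3 0): a is filled from slot 1 of the old object, c from slot 3, and d has no counterpart in the old class.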

Changing instance variable values

