Archive for December, 2009

Collection owner not associated with session? Not quite.

Tuesday, December 29th, 2009

I hate when this happens. I upgraded to NHibernate 2.0 and then quickly afterward to 2.1.0 (you guessed it: because of LINQ). I had to change a couple of things to support it in my company’s application framework and it all seemed to work well – until I discovered that deleting any entity that has a one-to-many relation with cascade=”all-delete-orphan” stopped functioning. It died with a cryptic error message of “collection owner not associated with session”… If I changed to cascade=”all” it worked, but this is not the point, it wasn’t broken earlier. Of course, I tried looking all over the web and apart from a page in Spanish (which wouldn’t be helpful even if it was in English) came up blank. Tried moving to NHibernate 2.1.2 – which is not that simple since we’re using a slightly modified version of NHibernate (a reason more to suspect that the solution to this problem would be hard to find). So here’s a short post for anyone stumbling upon a similar problem.

In the end, I traced it to this behaviour: the collection owner is not found in the session because NHibernate tries to find it using ID = 0, while it’s original ID was 48. The logic is somewhat strange here, because the method receives the original collection owner (which is in the session), retrieves its ID (which was for some reason reset to 0) and then tries to find it using this wrong ID. Moreover, there’s a commented-out code that says “// TODO NH Different behavior” that would seem to do things properly (I checked it, it’s still standing in the NHibernate trunk as is). But the real reason why this happened is that blasted zero in the ID: further debugging (thankfully, there’s a full source for NHibernate available), revealed that it was reset because “use_identifier_rollback” was turned on in the configuration. Well… I probably set this to experiment with it and forgot. Turning it off solved the problem for me… Luckily, I didn’t really need this rollback functionality – as it’s not exactly what it seems to be: it doesn’t rollback identifiers when the transaction is rolled back, it rolls them back when entities are deleted! Why the second feature made more sense to implement than the first one is a mystery to me…

Replicating self-referencing tables and circular foreign keys with Microsoft Sync Framework

Friday, December 11th, 2009

Self-referencing tables – or, at least circular foreign key references between tables – are probably a common thing in all but the simplest database designs. Yet Microsoft Sync Framework doesn’t have a clear strategy on how to replicate such data. I found various suggestions on the net: order rows so that the parent records come before children – this is usable for self-referencing tables (although not endorsed by Microsoft because the framework doesn’t guarantee it will respect this order), but not nearly good enough for circular references – if you have two rows in two tables pointing at each other, ordering them cannot solve the problem. On an MSDN forum there was a suggestion to temporarily disable foreign key constraints: this I cannot take seriously because it opens my database to corruption, all it takes is one faulty write while the constraint is down and I have invalid data in the database (unless I lock the tables before synchronization, and I’m not sure how to do this from within the Sync Framework).

So, when all else fails, you have to sit and think: what would be the general principle for solving this, Sync Framework notwithstanding? Exactly – do it in two passes. The problem is present only when inserting rows, if the row contains a reference to another row that wasn’t yet created, we get a foreign key violation… Our strategy could be to insert all rows without setting foreign key field values, then do another pass to just connect the foreign keys. If we do this after all tables have finished their first pass (inserts, updates, deletes and all), we also support the circular references because required rows are present in all tables.

Ok, that was fairly easy to figure out (not much harder to implement either, but more on that later). We have another issue here that is not so obvious, deleting the rows… There may be other rows referencing the one we are deleting that haven’t yet been replicated. Since the Sync Framework applies the deletes first, we can be fairly certain that the referencing rows are yet to be replicated – they will either be deleted or updated to reference something else. So we can put a null value in all fields that reference our row. (Note that this will probably mark the other rows as modified and cause them to be replicated back – this is an issue I won’t go into in this post, but I’m quite certain there needs to be a global mechanism for disabling change tracking while we’re writing replicated data. I currently use a temporary “secret handshake” solution: I send a special value – the birth date of Humphrey Bogart – in the row’s creation/last update date fields that disables the change tracking trigger).

Ok, on to the code. I won’t give you a working example here, just sample lines with comments. You’ve probably figured out by now that it will be necessary to write SQL commands for the sync adapter by hand. I don’t know about you, but I’m no longer surprised by this: many of the tools and components we get in the .Net framework packages solve just the simplest problems and provide nice demos – if you need anything clever, you code it by hand. My solution was to create my own designer/code generator, and now I’m free to support any feature I need (also, I am able to do it much faster than Microsoft, for whatever reason: it took me a couple of days to add this feature… It may be that I’m standing on the shoulders of giants, but the giants could really have spared a couple of days to do this themselves).

For simplicity, I’ll show how to replicate a circular reference: there’s an Item table that has an ID, a Name, and a ParentID, referencing itself. For replication, I split the table into two SyncAdapters: Item, that inserts only ID and Name and has a special delete command to eliminate foreign references beforehand, and Item2ndPass, which has only the insert command – but the only thing insert command does is wiring up of ParentID’s, it does not insert anything. I’ve deleted all the usual command creation and parameter addition code, the point is only to show the SQL’s, since they hold the key to the solution.

[Serializable]
public partial class ItemSyncAdapter : Microsoft.Synchronization.Data.Server.SyncAdapter
{
	partial void OnInitialized();

	public ItemSyncAdapter()
	{
		this.InitializeCommands();
		this.InitializeAdapterProperties();
		this.OnInitialized();
	}

	private void InitializeCommands()
	{
		// InsertCommand
		// 1899-12-25 00:00:00.000 is a 'Humphrey Bogart' special value telling
		// the change tracking trigger to skip this row
		this.InsertCommand.CommandText =  @"SET IDENTITY_INSERT Item ON
INSERT INTO Item ([ID], [Name], [CreatedDate], [LastUpdatedDate]) VALUES (@ID, @Name,
@sync_last_received_anchor, '1899-12-25 00:00:00.000') SET @sync_row_count = @@rowcount
SET IDENTITY_INSERT Item OFF";

		// UpdateCommand
		this.UpdateCommand.CommandText = @"UPDATE Item SET [Name] = @Name,
CreatedDate='1899-12-25 00:00:00.000', LastUpdatedDate=@sync_last_received_anchor WHERE
([ID] = @ID) AND (@sync_force_write = 1 OR ([LastUpdatedDate] IS NULL OR [LastUpdatedDate]
<= @sync_last_received_anchor)) SET @sync_row_count = @@rowcount";		

		// DeleteCommand
		this.DeleteCommand.CommandText = @"UPDATE Item SET [ParentID] = NULL
WHERE [ParentID] = @ID DELETE FROM Item WHERE ([ID] = @ID) AND (@sync_force_write = 1 OR
([LastUpdatedDate] <= @sync_last_received_anchor OR [LastUpdatedDate] IS NULL))
SET @sync_row_count = @@rowcount";

		// SelectConflictUpdatedRowsCommand, SelectConflictDeletedRowsCommand
		// skipped because they are not relevant

		// SelectIncrementalInsertsCommand
		this.SelectIncrementalInsertsCommand.CommandText = @"SELECT  [ID],
[ParentID], [CreatedDate], [LastUpdatedDate] FROM Item WHERE ([CreatedDate] >
@sync_last_received_anchor AND [CreatedDate] <= @sync_new_received_anchor)";

		// SelectIncrementalUpdatesCommand
		this.SelectIncrementalUpdatesCommand.CommandText = @"SELECT  [ID],
[ParentID], [CreatedDate], [LastUpdatedDate] FROM Item WHERE ([LastUpdatedDate] >
@sync_last_received_anchor AND [LastUpdatedDate] <= @sync_new_received_anchor AND
[CreatedDate] <= @sync_last_received_anchor)";

		// SelectIncrementalDeletesCommand
		this.SelectIncrementalDeletesCommand.CommandText = @"SELECT FirstID
AS ID FROM sys_ReplicationTombstone WHERE NameOfTable = 'Item' AND DeletionDate >
@sync_last_received_anchor AND DeletionDate <= @sync_new_received_anchor";
	}

	private void InitializeAdapterProperties()
	{
		this.TableName = "Item";
	}

} // end ItemSyncAdapter
[Serializable]
public partial class Item2ndPassSyncAdapter : Microsoft.Synchronization.Data.Server.SyncAdapter
{
	partial void OnInitialized();

	public Item2ndPassSyncAdapter()
	{
		this.InitializeCommands();
		this.InitializeAdapterProperties();
		this.OnInitialized();
	}

	private void InitializeCommands()
	{
		// InsertCommand
		this.InsertCommand.CommandText =  @"UPDATE Item SET [ParentID] = @ParentID,
CreatedDate='1899-12-25 00:00:00.000', LastUpdatedDate=@sync_last_received_anchor WHERE ([ID] =
@ID) AND (@sync_force_write = 1 OR ([LastUpdatedDate] IS NULL OR [LastUpdatedDate] <=
@sync_last_received_anchor)) SET @sync_row_count = @@rowcount";

		// SelectIncrementalInsertsCommand
		this.SelectIncrementalInsertsCommand.CommandText = @"SELECT  [ID],
[ParentID] FROM Item WHERE ([CreatedDate] > @sync_last_received_anchor AND [CreatedDate] <=
@sync_new_received_anchor)";

	}

	private void InitializeAdapterProperties()
	{
		this.TableName = "Item2ndPass";
	}

} // end Item2ndPassSyncAdapter

In this case, it would be enough to setup the second-pass sync adapter to be executed after the first one. For circular references, I put all second-pass adapters at the end, after all first-pass adapters. Notice that the commands for selecting incremental inserts and updates read all columns – this is probably suboptimal because some fields will not be used, but it’s much more convenient to have all field values handy than to rework the whole code generator template for each minor adjustment.

UPDATE (24.11.2010): I’ve added a source file with an illustration for this solution. I haven’t tested it (although it does compile and may well work) but it could be useful in showing the overall picture. It was created by extracting one generated sync adapter from my application, hacking away most of our specific code and then making it compile.

http://www.8bit.rs/download/samples/ItemSyncAgentSample.cs

Note that the file contains a couple of things that stray away from standard implementation, like using one table for all tombstone records, using the sample sql express client sync provider etc. Just ignore these. One thing, though, may be of interest (I may even do a blog post about it one day): the insert command knows how to restore autoincrement ID after replication, so that different autoincrement ranges can exist in different replicated databases (no GUIDs are used for primary keys) and identity insert is possible. This is necessary because SQL server automatically sets the current autoincrement seed to the largest value inserted in the autoincrement column. Keep in mind that this (as well as the whole class, for that matter) may not be the best way to do things – but I’ve been using it for some time now and haven’t had any problems.

Entries (RSS) and Comments (RSS).