
New site, new technology, old blog

Just a short notice: we have migrated our site to a different technology (from WordPress to Joomla K2) and a new design. I think I managed to reconstruct all blog posts and comments, so everything should work as before. I'm not so sure about the feed URLs: I added a redirection for the main feed, but the category and comment feeds are probably broken, so please update to the new ones. If you find something wrong with the blog, please contact us at office (at) 8bit.rs.

Building customizable applications in .Net

Most modern applications today have at least some degree of modularity and customizability. By customizability I mean its simplest form: having one “vanilla” application with standard features and many derived versions adapted, with minimal modifications, to each customer’s needs. It is common practice to use dependency injection (DI), and with it we can influence the behaviour of an application by replacing one component with another. But DI alone does not provide everything needed to make an application truly customizable: each aspect of an application needs to support modularity and customizability in its own way. For example, how do you maintain a customized relational database?
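To make the DI part concrete, here is a minimal sketch (all names here are invented for illustration, not taken from our product): the vanilla application codes against an interface, and a customized build swaps in a different implementation at registration time.

public interface ITaxCalculator
{
    decimal Calculate(decimal amount);
}

// The "vanilla" implementation, shipped with the standard product.
public class StandardTaxCalculator : ITaxCalculator
{
    public decimal Calculate(decimal amount) { return amount * 0.20m; }
}

// A customer-specific implementation, shipped only with that customer's build.
public class CustomerXTaxCalculator : ITaxCalculator
{
    public decimal Calculate(decimal amount) { return amount * 0.18m; }
}

// Whatever container you use, the customization boils down to one
// registration decision per customer - the rest of the application
// only ever sees ITaxCalculator. With Castle Windsor, for example:
// container.Register(Component.For<ITaxCalculator>()
//                             .ImplementedBy<CustomerXTaxCalculator>());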

This post is intended to be the first in a series about the issues and solutions we found while developing our own customizable application – a kind of introduction to the subject, so that I can later write about concrete solutions to concrete problems as they pop up. Here I’ll give an overview of the general architecture, the problems, and possible solutions.

Our product is a .Net WinForms line-of-business application. I think WinForms is still the best environment for LOB apps as the others are still not mature enough (like WPF) or simply not suitable (like the web). I would say it’s also good for customizability because it doesn’t impose an overly complex architecture: customizability will make it complex by itself.

As I said, DI fits naturally at the heart of such a system. DI can be used for coarse-grained configuration, in the sense that it can transparently replace big components with other components implementing the same features. It’s generally suitable for assembling interfaces and possibly bigger parts of the business logic. It’s not very suitable for a data access layer: even if you used a superfast DI container that adds no noticeable overhead when used with thousands of records, that would be just part of the solution. A bigger question is how you customize queries or the database schema. So, for data access we need a different solution, and I believe this part of the puzzle is the most slippery because it’s so heterogeneous. First, there’s usually a relational database, which is not modular at all: how do you maintain different versions of the same database, each with part of its structure custom-built for a specific client? (It would, I suppose, be a huge relief if an object-oriented database were used, but that is rarely feasible in LOB.) Then there are the SQL queries, which you cannot reuse/override/customize unless you parse the SQL. Then the data access classes, and so on.

Get content from WPF DataGridCell in one line of code (hack)

How do you get the text displayed in a WPF DataGridCell? It should be simple, but incredibly it doesn’t seem to be: all the solutions given on the ‘net contain at least a page of code (I suppose the grid designers didn’t think anyone would want to get the value from a grid cell). Yet when you quick-view a DataGridCell in the debugger, it routinely shows the required value in the “value” column. It does this by calling a GetPlainText() method which, unfortunately, isn’t public. We can hack around that with reflection – and, absurdly, this solution seems more elegant than any other I’ve seen.

// Any DataGridCell you already have a reference to.
DataGridCell cell = something;

// GetPlainText() is internal, so we invoke it through reflection.
var value = (string)typeof(DataGridCell)
    .GetMethod("GetPlainText",
        System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance)
    .Invoke(cell, null);

CruiseControl.Net Missing Xml node (sourceControls) for required member (ThoughtWorks.CruiseControl.Core.Sourcecontrol.MultiSourceControl.SourceControls)

It’s a silly error but the solution is not very obvious or logical… I modified my CruiseControl.Net configuration to include multiple source control nodes, but it started complaining that the XML was malformed. The error was something like

[CCNet Server] ERROR CruiseControl.NET [(null)] - Exception: 
  Unable to instantiate CruiseControl projects from configuration document.
  Configuration document is likely missing Xml nodes required for properly populating CruiseControl configuration.
  Missing Xml node (sourceControls) for required member (ThoughtWorks.CruiseControl.Core.Sourcecontrol.MultiSourceControl.SourceControls).
  Xml: <sourcecontrol><sourcecontrols><hg><executable>C:\Program Files\TortoiseHg\hg.exe</executable> […]

It complains about a missing sourceControls node for the required member SourceControls – but it’s right there in the XML. The problem is the capital “C”: first, XML is case-sensitive; second, CruiseControl’s config tags are inconsistent in that sourcecontrol is all lowercase while sourceControls is camel-cased. That’s what confused me – hopefully this post will help someone else.
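For reference, the corrected casing looks something like this (a trimmed-down sketch: the hg executable path is taken from the error message above, the rest is minimal):

<sourcecontrol type="multi">
  <sourceControls>
    <hg>
      <executable>C:\Program Files\TortoiseHg\hg.exe</executable>
    </hg>
  </sourceControls>
</sourcecontrol>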

Workaround for HQL SELECT TOP 1 in a subquery

HQL doesn’t seem to support clauses like “SELECT TOP N…”, which can cause headaches when, for example, you need to get the newest record from a table. One way to resolve this would be something like “SELECT * FROM X WHERE ID IN (SELECT ID FROM X WHERE Date IN (SELECT MAX(Date) FROM X))” – a doubly nested query that looks complicated even in this simple example and gets out of control when the conditions need to be more complex.

What is the alternative? Use EXISTS – as in “a newer record doesn’t exist”. It still looks a bit ugly, but at least it’s manageable. The above query would then look like this: “SELECT * FROM X AS X1 WHERE NOT EXISTS (SELECT * FROM X AS X2 WHERE X2.Date > X1.Date)”.
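In NHibernate terms this would look something like the following (a sketch, assuming an open ISession called session and a mapped entity X with a Date property – entity and property names are illustrative):

// Fetch the row(s) for which no newer row exists.
var newest = session
    .CreateQuery("from X x1 where not exists " +
                 "(from X x2 where x2.Date > x1.Date)")
    .List<X>();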

Note that this works only for “SELECT TOP 1” – for a greater N, there doesn’t seem to be a pure-HQL solution at all.

How to create a temporary table that can be seen by different SqlCommands on the same connection?

The unexpected answer (which I learned the hard way) is: it depends on whether your command has parameters or not. If you execute a parameterless SqlCommand, the SQL gets executed directly, the same way as if you had entered it into the query analyzer. If you add a parameter, things change: a call to the sp_executesql stored procedure gets inserted into the executed SQL. The difference is in scope. If you create a temporary table from within sp_executesql, its scope will be the stored procedure call, and it will be dropped once the procedure finishes – in that case, you cannot use other commands to access it. If you execute a parameterless command, the temporary table will be connection-scoped and will stay alive for other commands to access. Those other commands can have parameters, because their sp_executesql call runs in a child scope, which has access to the parent scope’s temporary table.

As to why they did it this way, I can't say I understand.
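Still, here is a minimal sketch of the working pattern (the table, column and parameter names are made up for the example, and a valid connectionString is assumed):

// Requires: using System.Data.SqlClient;
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    // Parameterless command: the SQL runs directly on the connection,
    // so the temporary table is connection-scoped and survives this command.
    using (var create = new SqlCommand(
        "CREATE TABLE #Results (Id INT)", connection))
    {
        create.ExecuteNonQuery();
    }

    // Parameterized command: wrapped in sp_executesql, which runs in a
    // child scope - it can still see the connection-scoped #Results.
    using (var insert = new SqlCommand(
        "INSERT INTO #Results (Id) VALUES (@id)", connection))
    {
        insert.Parameters.AddWithValue("@id", 1);
        insert.ExecuteNonQuery();
    }
}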

Lessons learned: migrating a complicated repository from subversion to mercurial

Migrating from SVN to Mercurial is a simple process only if the SVN repository has a straight-and-square structure – that is, there are trunk, branches and tags folders in the root and nothing else, neither in its present state nor at any point in its history. If you used your SVN repository in a way that was convenient in SVN but isn’t in Mercurial – for example, you created branches in various subdirectories – it still shouldn’t be too hard to migrate. But if you, like me, decided late in the game to create the mentioned folders in SVN and then moved and renamed your folders, you will need to invest serious time if you don’t want to lose parts of your history. You need to plan your migration very thoroughly and do a lot of test runs.

The reason for this is that Mercurial’s ConvertExtension is somewhat of a low-level tool (in other words, although reliable, it is not too bright). Browsing the internet you may get the impression that it’s an automated conversion system: it isn’t. It does a fully automated migration only for straight SVN repositories; for the rest, it’s more like something to use in your migration script. It seems to do its primary job – converting revisions from one repository format to another – quite well, but the rest of the tool is not so intelligent and needs help. So, lesson number one: if you have a complex repository, don’t take the migration lightly.

A small disclaimer is in order: this post is not intended to be a complete step-by-step guide to migration. Rather, it’s something to fill in the blanks left by what little is available on the internet. I’ve done a complex migration and I want to do a brain dump for my future reference, or for whomever else it may concern.

Between the two alternatives I perceived as most promising, hgsubversion and the convert extension, I chose the latter. Hgsubversion was claimed by some to be the better tool for this job, but it was somewhat troublesome: it had a memory leak and broke easily in the middle of the conversion (note that this was a couple of months ago: things may have changed in the meantime). The solution, they say, is to run hg pull repeatedly until it finishes. I wanted to do an hg clone with a filemap, but when the import broke I was in trouble because hg pull doesn’t accept filemaps. (It could be that the filemap was cached somewhere inside the new repository and my worries were unfounded – I don’t really know; I may try that in the future.) One other way around it would be to do a straight clone of SVN – no branches or anything – into an intermediate Mercurial repository and then split that into separate final repositories. In that case, hgsubversion could be a viable solution, maybe even better than the convert extension. I had more success with the convert extension, so that is what we’ll talk about here.
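For orientation, the basic shape of a convert run with a filemap looks like this (the paths and names are illustrative) – the filemap is what lets you carve a single project out of a larger repository:

hg convert --filemap filemap.txt http://server/svn/bigrepo projecta-hg

where filemap.txt contains directives like:

# keep only ProjectA and make it the root of the new repository
include ProjectA
rename ProjectA .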

Moving Up from SVN: Experiences in Migrating to Git and Mercurial

(Note: this text is the result of a couple of months of research… Only when I had finished the migration and got to the point where things were running smoothly did I get around to finishing this post. Some of this information may be a bit outdated, but from what I’ve seen, things haven’t moved much in the meantime. Anyway, I’ve added comments in the text where the situation might have changed.)

Now that I think about it, there was one crucial reason to abandon SVN and upgrade to something more powerful: merging. My company now has a lot of parallel development (several customized instances of an application), and SVN’s support for this is not as good as it should be – that is, not as good as the competition’s. In SVN you can separate a development branch, keep it parallel to the main one, and merge-reintegrate it back. SVN will remember which revisions it merged so that it doesn’t try to merge them again (which would otherwise produce merge conflicts). But that seems to be the limit of its capabilities: with a slightly more complicated structure, a merge produces so many conflicts that you might as well do everything manually. Contrast that with a more recent VCS like Git or Mercurial, where the essential feature is the ability to manipulate changesets – to stack them one upon another or restructure their connections. If you can merge a set of changes from one branch to another, then continue developing both of them, then do a back-merge (even though you shouldn’t), and the system still doesn’t produce an enormous number of conflicts, it simply gives you more power.

Of course, there are additional advantages that both Git and Mercurial have over SVN, but in our case those were just a bonus, not a necessity. I believe many developers think likewise: SVN is good enough at what it does; the problem is when you need something it wasn’t designed to do.

So which one to choose? It seems that Git, Mercurial and Bazaar are the most popular. People say that of the three, Git is the most powerful because it was written by Linux kernel developers who do complicated merges every day. Ok, so it seemed natural to choose that one.

Git

The first impression I got is that Linux kernel developers are guys who talk among themselves in hexadecimal and to the rest of the world in mnemonics. Git had some documentation, but not nearly enough explanation of how things work. The error messages the tools produced were cryptic – and not only that, they were formatted to be viewed in the console, so when displayed by GUI tools they tend to get messed up. Ok, so Linux kernel developers think GUIs are lame… While we’re at it, the error messages are also misleading: I spent a lot of time trying to diagnose what was wrong with my repository because the clone operation kept producing a warning about the remote head pointing to a non-existent revision or branch (and yeah, it was a fatal warning – a new term, at least for me – because the command didn’t do anything, it only gave a warning). It turned out in the end that I was using the wrong URL for the repository – that is, I was trying to clone a nonexistent repository. I agree, in a nonexistent repository the nonexistent head points to a nonexistent revision, but it’s a bit of a roundabout way to say it, isn’t it?

But these are things that can – and most probably will – be addressed as the tools mature. Since there are so many Git users, it’s surely reliable enough. I don’t mind some rough edges; I’ve been there with SVN and it was worth it (it was much better to cope with occasional bugs in SVN than to start with CVS and migrate to SVN later).

Ok, so how do we get Git to work on a Windows server? The answer that seemed complete enough hinted that I should simulate a Linux environment: install the shell, something like an SSH daemon, and everything else. In other words, it’s not supposed to run on Windows, but it can. Ok, I tried – there are several variations on the theme, and each one had a glossed-over part that needed to be researched (something like “at this point you need to import the created certificate into Putty” – well, Putty doesn’t want to read this file format). And it didn’t help that Git itself doesn’t always tell you the real reason something doesn’t work, as I already mentioned. Moreover, it was for some reason allergic to empty repositories – unless I committed something locally, it wouldn’t work. And the repository easily got screwed up while experimenting – that is, it got into a state where I didn’t know what to do with it and it was easier to create a new one.

At this point it was clear that the client tooling also leaves a lot to be desired. There was TortoiseGit, which locked up occasionally on clone/pull (and never displayed the progress of the operation, so you never really knew if it was doing anything); there was Git GUI, which was a bit more stable; and there was Git Bash, which was the most reliable. (One interesting side note: at one point I managed to get three different error messages for doing the same clone operation with these three tools.) One thing, though: the Bash in Git Bash is probably the best part of the package – I had almost forgotten the comfort of working in a Unix shell. The Windows command prompt is light years behind it.

I did get the server to work, though, after a week of running around in circles and repeating the same stuff with slight variations. In the end I was able to create a new repository, clone it, make changes, commit, push, pull… Everything worked until I restarted the server. Or so it seemed – when I tried to clone the same repository afterwards, it again started producing weird error messages. I don’t know what else changed except for the restart (which shouldn’t have affected it – and probably didn’t, but what else? It’s an isolated environment). If I didn’t have the cloned repository with the revision history, I would have doubted I ever actually succeeded. Ok, so it’s also fragile.

Then I tried to find an alternative. There are a couple of PHP scripts that wrap Git server functionality, but they don’t seem to work on Windows (I tried them in Apache). There’s Bonobo Git Server, which is written in .Net – well, I never took it seriously: how is a .Net Git server going to work when their own true-to-the-original-idea SSH configuration doesn’t? But it does work. It also needs a bit of tinkering (you have to Google-translate a page in Chinese to get the info on how to really do it, WebDAV etc.), but the installation is amazingly painless: it takes a couple of hours, which is nothing compared to the week wasted on the previous experiment.

So, on to the next step: migration. I migrated a test folder from SVN with no trouble. Tried a bigger one, something under a thousand revisions – well, it finished in a couple of days; I suppose that’s tolerable. Finally, I tried to convert the central project – and when, after two weeks of non-stop importing, it had managed only 2000 of the repository’s 5000 revisions, I gave up. Back to Google: why is it so slow, and how do you speed it up? It turns out that the Git client is developed on Windows XP, and there it should probably work well. As it did: it got all 5000 revisions in a couple of hours. Ok, now this I didn’t like. How can a serious tool not work on newer Windows versions? They said it’s slow because of UAC (introduced in Windows Vista), which slows everything down. Well, it’s not like Vista was released yesterday; if this problem has existed for years, should I ever expect a solution to appear? More research hinted that Linux kernel programmers think Windows users are lame. So – Git was slow on any Windows newer than XP. TortoiseGit seems to execute the same Git shell commands, so it’s the same. I found Git Extensions in the meantime, which is supposed to be independent – but it couldn’t even handle HTTP repositories.

In the meantime, I tried cloning the big repository I had converted – big as in 200 megabytes big – and it was, as expected, slow. But I don’t really know which side was to blame here – it seems like the Bonobo server choked on the large repository, since it nailed the CPU at 100% and produced around 40 bytes per second to the client (possibly just some kind of sync messages and no data at all). Ok – Bonobo is open source and was built around GitSharp (or something like that), a Git library written in .Net. What if I updated the library myself and recompiled the server? Well – GitSharp is discontinued at version 0.3. They’ve all gone to some new library written in C.

Ok, that was enough. After three weeks completely wasted, I gave up on Git.

(Update: Bonobo was since upgraded to version 1.1. I looked at the change log hoping to see a note about moving away from GitSharp, but it didn’t seem to happen. So as far as I know, this performance issue may still be present – nevertheless Bonobo seems the most promising solution for a Windows Git server).

Mercurial

So, should I look at Mercurial? The Mercurial supporters seem a bit shy – that is, compared to the Git fanatics who shout GIT! GIT! GIT! GIT! at every possible occasion, occasionally one appears who says “or, you could try Mercurial”.

Well, the first look revealed the difference: I could download everything I needed from a single web page. The basic Mercurial installation – right there, server support included (as a script that is installed in a web server). TortoiseHg – right below it; it even has an ad-hoc server built in! Python extensions – do I need those? So I thought: no, this is too easy. Let’s try something unheard of in Git – RhodeCode. It’s a server implementation that is in fact a full-featured web application. It seems very nice but, due to some incompatibility in a crypto library, very hard to install on Windows. It took a lot of workarounds: I ended up installing Visual C++ Express 2008 (it has to be 2008, 2010 doesn’t cut it) and another kind of simulated shell environment (MinGW32) to try to get the installer to compile it from source, but it was impossible – that is, impossible on Windows 2008 R2, 64-bit. The RhodeCode developers say they’re working on making it more compatible with Windows (for one thing, changing the crypto library), and I found that I believe them, so I’ll be coming back to it. (In the meantime, they’ve released a couple of versions with Windows-specific bugfixes, so it might be worth checking out again.)

In the vanilla Mercurial installation there’s a script called hgweb.cgi that contains the full functionality needed to get a server running. A bit of tinkering is needed to make it run inside IIS – and there are a couple of slightly outdated tutorials on how to do this. I found that the best combination is to download Mercurial for Python – so, no standalone Mercurial or TortoiseHg on the server. This download says in its name which version of Python it was built for, and that version of Python is the second thing needed. Once both are installed, it is sufficient to put hgweb.cgi in a virtual directory in IIS, add an hgweb.config and a web.config file, configure IIS (basic authentication and what have you), and set permissions on the folders – including the repositories. It took less than one day, research included, to get it up and running.
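For illustration, a minimal hgweb.config could look something like this (the repository path is an example, and allow_push/push_ssl depend on how you set up authentication):

[paths]
/ = C:\Repositories\*

[web]
allow_push = *
push_ssl = false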

The client tooling seems better than SVN’s. TortoiseSVN (in fact, TortoiseCVS) was a breakthrough idea – a VCS integrated into Windows, allowing you to version-control any folder on your disk. Well, TortoiseHg went one step further and actually improved the user experience. It has its bugs – more than SVN – but also a lot more features. The whole thing is written in Python and seems to have a good API, because a lot of plugins have been written for it, and TortoiseHg includes the most important ones. At this point I had to install TortoiseHg on the server, because that’s the only way to get the Subversion converter plug-in. The other way would be to install the Python module, but that cannot be done: first of all, they’ll tell you to install Subversion for Python (which is quite simple, there’s a pre-packaged downloadable setup), but when you do and then get an error from the convert extension, you’ll find out that you don’t need that package but something called SWIG. But SWIG doesn’t have anything to do with Subversion – you have to download the Subversion package from the Subversion site, which moved to Apache leaving the Python part behind, and the best you can do is find a source somewhere and compile it, but nobody says how that’s done.

On to converting the repositories. For one thing, its speed is normal on any Windows – as fast as Git on XP, maybe even faster. So I was encouraged to do the thing I never even got around to thinking about with Git: splitting the repositories into individual projects. It did take a week – the details will be the subject of a future post – but in the end it produced much less pain than Git.

Conclusion

Looking at it now, I think Mercurial is unfairly underrated, and this seems to be due to the loud chanting of the Git proponents. They say Git is blazingly fast – well, if you averaged its performance across all operating systems, I think you’d find it’s either very slow or that Windows users don’t use it at all. On Windows XP, Mercurial is at least as fast as Git, and on newer Windows versions Git is usable only for small repositories. Git is badly documented in many places (this is getting better, but Mercurial is way ahead – suffice it to say that I understood some Git concepts only when I read the Mercurial documentation). Git’s tooling, generally speaking, sucks – it was designed to be used in a Unix shell and nowhere else – and on Windows it is total crap. Mercurial’s tooling on Windows is, after this experience with Git, impressive. My conclusion is that with Git I would have to require special skills when employing programmers – “C# and Bash knowledge required” – and how sane is that? Ok, I’m joking, but it’s not far from the truth: there has to be at least one Git specialist on the team for when ordinary developers get stuck. With Mercurial, the usual SVN-like skill level should be enough for everyone, because it’s not that easy to get stuck. After all this, I’m inclined to think that the story about Git being so great should be taken with a grain of salt, since everything I’ve seen from it so far tells exactly the opposite.
