I’ve been wondering for some time about metrics to evaluate the relative architectural cleanliness of various database implementations. To that end, I wrote a simple program that eats Visual Studio 7 project files and analyzes the source files. Here are the results:
|Metric (per function)||Nfs Engine||Vulcan||Firebird 2||MySQL Server|
|Average Code Lines||11.80||21.20||37.12||26.90|
|Average Internal Comments||0.94||6.10||11.92||2.59|
|Average Internal WhiteSpace||2.12||5.16||6.92||2.21|
The analysis program doesn’t try to follow conditional compilation, so everything is included whether active or not.
The Netfrastructure engine is roughly equivalent in functionality to Firebird. The Netfrastructure numbers, however, are for the database engine only, excluding the Java Virtual Machine and template engine. Since the trigger and procedure language in Netfrastructure is Java, this isn’t a strict apples-to-apples comparison. On the other hand, the Netfrastructure engine includes the remote server, which Vulcan does not.
The Vulcan numbers are taken from the engine provider’s current code base. A small number of modules that, due to conditional compilation, couldn’t make it through the analysis program were omitted. Post-processed modules were also omitted. Since Vulcan contains quite a bit of archival, disabled Firebird code, its numbers are slightly bloated.
The Firebird numbers are taken from the “engine” msvc7 project. Since Firebird 2 doesn’t use custom development steps, the preprocessed modules aren’t included (the project hasn’t been built, so the corresponding post-processed modules are not included either). I don’t actually know what is in the Firebird 2 engine build, but I assume it doesn’t include DSQL and possibly other common stuff.
The MySQL numbers are from their Windows source kit. I believe that they also use static libraries for cross-component modules, so I suspect this is less than the full server. But it does give a rough feel.
I think the two most interesting sets of numbers are the average number of arguments per function and the average number of code lines per function (code lines exclude comments and white space). It is most interesting that in each case, Vulcan falls roughly halfway between Netfrastructure and Firebird 2. The average number of arguments is a good metric of the quality of a design. Bad (or in this case eroded) designs have to pass everything but the kitchen sink, and sometimes that too. The Firebird 2 numbers are particularly scary because many additional parameters are passed covertly through thread data. The average code lines per function is a good metric of modularity: the degree to which common code is cleanly factored out.
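To make the "code lines exclude comments and white space" rule concrete, here is a minimal sketch of a line classifier for C/C++-style source. It is not the ProjectAnalyst code; the function name `classify_lines` and the sample are made up for illustration. It assumes that a line holding both code and a comment counts as code, and it does not recognise comment markers inside string literals. Like the analysis program, it ignores conditional compilation, so code inside `#if 0` is still counted.

```python
def classify_lines(source: str) -> dict:
    """Count code, comment, and blank lines in C/C++-style source text.

    Assumptions of this sketch: a line with both code and a comment is
    counted as code, and comment markers inside string literals are not
    recognised. Preprocessor lines are treated as ordinary code.
    """
    counts = {"code": 0, "comment": 0, "blank": 0}
    in_block = False  # True while inside a /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if not in_block and not line:
            counts["blank"] += 1
            continue
        has_code = False
        i = 0
        while i < len(line):
            if in_block:
                end = line.find("*/", i)
                if end == -1:
                    i = len(line)      # comment continues on the next line
                else:
                    in_block = False
                    i = end + 2
            elif line.startswith("//", i):
                break                  # rest of the line is a comment
            elif line.startswith("/*", i):
                in_block = True
                i += 2
            else:
                if not line[i].isspace():
                    has_code = True
                i += 1
        counts["code" if has_code else "comment"] += 1
    return counts


SAMPLE = """\
int add(int a, int b)
{
    // sum the two arguments

    /* a block comment
       spanning two lines */
    return a + b;  /* trailing */
#if 0
    return 0;
#endif
}
"""

counts = classify_lines(SAMPLE)
print(counts)
```

Dividing each count by the number of functions analyzed would give per-function averages of the kind shown in the table.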
The comment-related metrics are substantially misleading since they are computed relative to the number of functions rather than code lines: fewer code lines will always mean fewer comments. Even so, it is clear that Firebird has something to teach me and MySQL about internal commenting.
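One way to remove the per-function bias is to divide the average internal comments by the average code lines. Both averages share the same function count, so the ratio equals total comment lines over total code lines. The sketch below does that arithmetic on the figures from the table; the helper name `comments_per_code_line` is made up for illustration.

```python
# Per-function averages copied from the table above.
CODE_LINES = {
    "Nfs Engine": 11.80,
    "Vulcan": 21.20,
    "Firebird 2": 37.12,
    "MySQL Server": 26.90,
}
INTERNAL_COMMENTS = {
    "Nfs Engine": 0.94,
    "Vulcan": 6.10,
    "Firebird 2": 11.92,
    "MySQL Server": 2.59,
}


def comments_per_code_line(code_lines, comments):
    """Return comment lines per code line for each project."""
    return {name: comments[name] / code_lines[name] for name in code_lines}


density = comments_per_code_line(CODE_LINES, INTERNAL_COMMENTS)

# Print projects from most to least densely commented.
for name, value in sorted(density.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:13s} {value:.3f}")
```

On these figures Firebird 2 and Vulcan come out near 0.3 comment lines per code line, while MySQL and the Netfrastructure engine stay below 0.1, so the gap in internal commenting survives the normalization.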
Both the ProjectAnalyst and ProjectsSummary projects are checked into the Vulcan tree under src. If you want to play with them or analyze Firebird 1.0 or 1.5, I’d like to see the results. You may also want to add more metrics. ProjectAnalyst generates XML (sans header) summary files; ProjectsSummary turns a set of XML summaries into an HTML table.
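For anyone who wants to experiment without building the tools, the summary step can be sketched roughly as follows. The XML layout here (a `project` element with `metric` children carrying `name` and `value` attributes) is purely hypothetical, since the actual ProjectAnalyst output format is not described above, and `summaries_to_html` is a made-up name rather than anything from ProjectsSummary. The inputs have no XML declaration, matching "sans header".

```python
import xml.etree.ElementTree as ET
from html import escape


def summaries_to_html(summaries):
    """Turn header-less XML summary strings into one HTML table.

    Hypothetical input layout (an assumption, not the real format):
        <project name="..."><metric name="..." value="..."/></project>
    """
    projects = [ET.fromstring(text) for text in summaries]

    # Collect metric names in first-seen order across all projects.
    metric_names = []
    for project in projects:
        for metric in project.findall("metric"):
            name = metric.get("name", "")
            if name not in metric_names:
                metric_names.append(name)

    # Header row: one column per project.
    header = "".join(f"<th>{escape(p.get('name', ''))}</th>" for p in projects)
    rows = [f"<tr><th></th>{header}</tr>"]

    # One row per metric; a missing value becomes an empty cell.
    for name in metric_names:
        cells = []
        for p in projects:
            values = {m.get("name", ""): m.get("value", "") for m in p.findall("metric")}
            cells.append(f"<td>{escape(values.get(name, ''))}</td>")
        rows.append(f"<tr><th>{escape(name)}</th>{''.join(cells)}</tr>")

    return "<table>\n" + "\n".join(rows) + "\n</table>"


VULCAN = '<project name="Vulcan"><metric name="Average Code Lines" value="21.20"/></project>'
FIREBIRD = '<project name="Firebird 2"><metric name="Average Code Lines" value="37.12"/></project>'

html_table = summaries_to_html([VULCAN, FIREBIRD])
print(html_table)
```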
-- Jim Starkey Netfrastructure, Inc. 978 526-1376