We've been using SQL 2005's XML functionality on a project because the people entering data wanted the luxury of freeform data in XML. This has turned out to be pretty good, although we could have taken it even further. These databases allow you to insert and replace Xml nodes right in the document. Of course, this is nowhere near as fast as using SQL, but the flexibility is pretty great. Anyone who has had to deal with changing schemas on a DB should be familiar with this. If your data is never locked down in the first place, it turns into a nightmare.
I've been reading for a while that DB2 supports Native XML better than SQL 2005, so I decided to give both a try this weekend and see how they compared.
DB2 Express-C v9.1
SQL Server Express 2005 SP 2
- Single row with 6mb of XML, queried via XQuery
- Same XML file shredded into ~6000 rows in a table with an XML column, queried via XQuery
- Traditional primary key test.
- No additional tuning except creating XML indices.
- All queries done through ADO.NET for consistency.
- Deployable on Linux, AIX, Solaris, Windows, etc.
- Consistently the best performance at doing Xquery across shredded XML.
- XQuery with index across shredded XML better than or equal to typical sql primary key!
- Consistent query speed even without an XML index.
- Inserts are very fast.
- XML indices can be set up for individual XPaths, targeting exactly what you need.
- XQUERY is a direct command that can be used, rather than querying via SQL functions.
- Index on 6,000 rows of XML did speed up XQuery across them.
- Still works well with ADO.NET out of the box.
- I was able to crash the entire database... yes, I got a C++ runtime error dialog for db2syscs.exe, when I tried to delete a couple columns from an existing table. Nothing fancy, and the DB was completely hosed at that point and I couldn't even stop or restart the thing using their tools. I had to go into Task Mgr and kill all DB2 processes by hand and restart. Obviously, this is inexcusable for a shipping database product.
- DB2 never ever shuts down gracefully for me. It always complains there are still connections even when there aren't. Back to task manager to kill it!
- XQuery on 6mb XML blob was never very impressive, speed-wise.
- Setting up XML index this seemed to have no effect on my 6mb XML file in a single row speed.
- Documentation is a near-unusuable collection of one page descriptions with no samples.
- Java-based database management tools suck hard. It's barely usable and has a huge memory footprint to boot. Oracle's Raptor tool is better, though, that too being in Java, it is only slightly better.
SQL Server 2005 Pros:
- Better command cache and query cache. Doing the same query twice results in queries taking microseconds rather than milliseconds.
- Fastest at XQuery within one 6 megabyte row of XML data.
- Far better documentation.
- Far better tool for database administration (SQL Server Management Studio Express)
SQL Server 2005 Cons:
- Poor performance at xquery, using exists() across shredded XML.
- Windows only.
- Ability to push XML straight from DB to XSLT, if you're into that sort of thing.
- Did I mention flexibility?
- Typically slower than traditional primary keys (Except on DB2).
- Never got SqlParameters feeding into Xqueries for neither Sql Server nor DB2.
So after all that messing around, I decided to do a little control study with MySQL, which has no XQuery or native XML support. In order to weed out the hype of "Native XML", I thought this necessary.
Obviously it's tough to navigate the large 6MB blob of XML on MySql, so I decided to put it up against the shredded XML using "WHERE xmldata LIKE 'lots of percent sighs'. And actually...
- MySQL outperforms SQL Server when working on shredded XML. (by an order of magnitude)
- MySQL underperforms DB2 when working with shredded XML. (by an order of magnitude)
Sad, but true. Microsoft's XQuery hype gets killed by using "WHERE ... LIKE" in MySQL on shredded data.
Based on this, here's a guide for while DB to use:
- Non-shredded, huge XML blobs: SQL Server 2005.
- Shredded, XML in rows: DB 2.
- Pure read performance: MySQL.
- Best in show: DB 2.
I wish I didn't have to write that DB 2 was the best in show here given the crashes and trouble I had with it, but: in general, if you need to use XML in your database, DB2 is probably the best choice.