Sunday, April 15, 2012

Trying to be cute harms your startup

Yet another MongoDB user tosses it under the bus.  This Hacker News thread has more info.

Their article starts off with "This week marks the one year anniversary of Kiip running MongoDB in production"

It's April, 2012.  I just have to wonder, did the guys at Kiip not read all of the criticism of MongoDB in April, 2011?  Urban Airship had already said they were moving back to Postgres.  If that's not early enough, how about April, 2010?  I tried using Mongo at the beginning of 2010 and, in January, 2010, exchanged private emails with the Mongo dev team about virtually all of the concerns listed in Kiip's blog post.  Who at Kiip missed the memo that Mongo uses system paging for persistence, doesn't have durability by default and has a global write lock?

But I guess what we can say here is that yet another startup tried to be cute, then ended up spending their time dealing with the cute new tech that broke.  When I say "try to be cute" -- I mean that startups choose this tech because they think it attracts recruits and attention, or saves them time to market because it's "schemaless".

Yet, has anyone heard of Kiip other than people reading their post about Mongo?  I haven't.  That seems like a sucky kind of failure.  They spent their time wrangling Mongo instead of building a product that got their name out there because of the product itself.

From time to time on HN or elsewhere, people ask "what tech should I use".  The answer in my mind is "Java, MySQL, Apache/nginx".  So here goes:

Just use Java, MySQL, Apache, for everything web and server related*

(I'll accept the following substitutions:  C# (for you Windows folks), PostgreSQL/SQL Server/Oracle for MySQL, nginx for Apache.  If you have to be all dynamic and stuff, CPython 2.x.)

It's so boring, I know.  As many of my faithful readers know, I've historically been the one out there trying out all of the new hotness.  How can I be the one proposing to use Java Beans?  XML configuration files for Spring?  Barf.  It makes your startup less cool because you're using such boring technology.  Right?  RIIIIIGHT?

You know what's cooler than Scala, Clojure, node.js and MongoDB?  A billion dollars.

On the languages:  LinkedIn, Google, Netflix use Java.  Instagram, YouTube, Slide, Dropbox use CPython.

Databases:  AdWords, to this day AFAIK, still runs on MySQL.  Facebook, MySQL.  Slide, MySQL.  YouTube, MySQL.  Stop saying MySQL doesn't scale.  Scaling is hard no matter what you use.  MySQL is a well-known quantity, good and bad.  Just use it.  Or use PostgreSQL, which is what Instagram did.

Web servers:  Apache and nginx serve 99.9% of everything unless it comes from Microsoft.  So just use them instead of trying to be cute.

Beyond just web apps, given my experience with Java over the past year, where I chose it for a non-web-server project at work, I'd default to using it for any server-related thing I needed to write.  Only in extreme circumstances would I go with C++ or C -- probably limited to writing a front-end web server (e.g. nginx) or a database.  Even if I were writing an MMO, I'd use Java for the entire server stack.  I know my game friends will laugh at me for that, but I'll stand by that claim.  It's so boring and yet so functional and fast.

On the topic of "schemaless"

I'm tired of this development-trope of "schemaless" databases like Mongo.  I'll let you in on a secret:  There's no such thing as "schemaless."  

You have a schema somewhere, whether you like it or not.  Your code defines it, or your database defines it.  If you store to disk with protobufs, your IDL defines it.  The other day on HN and on Twitter, I postulated that on a long enough timeline, the probability that your data needs to be accessed by more than one application goes to 1.  If you think an API alone can deal with this, consider a monolithic application that you now want to split into separate services in a new framework/language.  Suddenly, you're doing a ton of code refactoring where a database schema and views could have solved it easily.
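To make the "your code defines it" point concrete, here is a minimal Python sketch.  The document shapes, field names and helper functions are all made up for illustration; they are not from Kiip's post or from any real application.  It just shows that once documents are stored without a declared schema, the schema ends up living in every piece of reading code.

```python
import json

# Two documents written by different versions of a hypothetical app.
# Nothing in the storage layer says which shape is "correct".
raw_docs = [
    '{"name": "Ada Lovelace", "email": "ada@example.com"}',
    '{"first_name": "Alan", "last_name": "Turing", "emails": ["alan@example.com"]}',
]


def display_name(doc):
    """The reader has to know every shape a document has ever had."""
    if "name" in doc:
        return doc["name"]
    return doc["first_name"] + " " + doc["last_name"]


def primary_email(doc):
    """Same story here: the schema is encoded in branching logic."""
    if "email" in doc:
        return doc["email"]
    return doc["emails"][0]


users = [json.loads(raw) for raw in raw_docs]
names = [display_name(u) for u in users]
emails = [primary_email(u) for u in users]
print(names, emails)
```

Every additional application that reads these documents needs its own copy of that branching logic, which is exactly the refactoring burden described above.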

RDBMS were developed over the last 40 years to handle this exact scenario.  Views, triggers, stored procs, constraints:  that's what they're all for.  If you bring the same seriousness to developing a database as you do to developing code, this can all be clean, efficient and manageable.  Don't like doing it yourself?  Hire a DBA who has a clue.  But this whole tired schemaless thing is ridiculous.  If you think schemaless is good, build your application just storing JSON to individual files on disk and let me know how it goes.  It's essentially the same.
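As a rough sketch of what constraints and views buy you, the following uses Python's built-in sqlite3 module purely so it runs anywhere; the same idea applies to MySQL or PostgreSQL.  The table, the view name `users_v1` and the crude email check are assumptions invented for this example, not a recommended production design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    -- crude illustrative check: at least one character on each side of '@'
    email      TEXT NOT NULL UNIQUE CHECK (email LIKE '%_@_%')
);

-- A hypothetical older application expects a single "name" column.
-- The view gives it that shape without touching the application code.
CREATE VIEW users_v1 AS
    SELECT id, first_name || ' ' || last_name AS name, email
    FROM users;
""")

insert_sql = "INSERT INTO users (first_name, last_name, email) VALUES (?, ?, ?)"
conn.execute(insert_sql, ("Alan", "Turing", "alan@example.com"))

# The database itself rejects bad data, whichever application wrote it.
try:
    conn.execute(insert_sql, ("Ada", "Lovelace", "not-an-email"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

legacy_rows = conn.execute("SELECT name, email FROM users_v1").fetchall()
print(rejected, legacy_rows)
```

The constraint is enforced once, in one place, and the view lets old and new readers coexist while the underlying table changes.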