Sunday, July 27, 2014

Giving up on Blogger

You can find me here from now on. It's a Tumblr blog for now.

Why, you ask? Because Blogger has fallen into disrepair. Google does not care about it, so I won't tolerate it anymore.

Here's what my draft of my upcoming post looked like when I opened it back up a few hours after leaving today:

This is just one instance in a long history of editorial eff-ups this software has had in the 8 or so years I've used it.

Later Google, hello Tumblr.

Sunday, July 13, 2014

Immutable data performance

"Immutable" is a recent discussion point around the water cooler, and it's not my imagination when I say "recent":

I think this comes from the functional programming rage -- especially in Haskell -- where, since all of your data is immutable, every function is composable and concatenatable since there are no side effects. Neat. It reminds me how people did this years and years ago to make image manipulation much faster with tools like Shake.

But since that's gotten so much hype over the last couple years, people are now thinking immutable data is a good idea in all languages in all situations. It's not. Immutability can hurt performance significantly.

Take this really really simple example in Python:

I tried this with PyPy as well. It made no difference. The "immutable" version takes about 2x as long as the mutable version (9.682s vs. 5.312s on my machine in the most recent test).
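For illustration only, here is a minimal sketch of the kind of comparison I mean. It is not the benchmark behind the numbers above. The names `MutablePoint`, `FrozenPoint`, `walk_mutable` and `walk_frozen` are hypothetical, and the timings it prints will vary by machine and interpreter.

```python
import timeit
from typing import NamedTuple


class MutablePoint:
    """Mutable style: one object, updated in place."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def shift(self, dx, dy):
        self.x += dx
        self.y += dy


class FrozenPoint(NamedTuple):
    """Immutable style: every change builds a brand new object."""

    x: int
    y: int

    def shift(self, dx, dy):
        return FrozenPoint(self.x + dx, self.y + dy)


def walk_mutable(steps):
    p = MutablePoint(0, 0)
    for _ in range(steps):
        p.shift(1, 2)  # same object every step
    return p.x, p.y


def walk_frozen(steps):
    p = FrozenPoint(0, 0)
    for _ in range(steps):
        p = p.shift(1, 2)  # fresh allocation every step
    return p.x, p.y


if __name__ == "__main__":
    # Prints wall-clock seconds for each style; no particular ratio is implied.
    for fn in (walk_mutable, walk_frozen):
        print(fn.__name__, timeit.timeit(lambda: fn(100_000), number=10))
```

Both functions compute the same answer. The only difference is whether each step reuses the existing object or allocates a new one, which is the cost in question.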

Same thing in Java.

This comes out to be 7.317 vs. 12.736

Long story short, I don't know much about Haskell, but without an optimizer that can concatenate similar functions, immutable data is going to mean allocation. And allocation can kill your performance. I recently heard a work story of someone having a performance problem due to overuse of list comprehensions in Python. I don't know enough to chalk that story up to this trend, but it fits.

FWIW, there's a reason that languages with mutable data let you say so in the signature: C# has "out" and "ref", and C++ has non-const reference parameters. You can have good performance and good self-documentation as to what your function is going to do with that data, without having to move to a language that only works on data in immutable fashion.

Sunday, June 01, 2014

Why should people get paged at night *ever*?

Someone over at Etsy posted a nice "Sleep Driven Development" article and it brought to mind my personal jihad against pager alerts.

There are a handful of major, well-known tech employers that adhere to "DevOps" or "NoOps" practices and have all engineers on pagers. If you ask engineers leaving these companies what they didn't like about working there, being on-call comes up very often. An additional trend I've noticed is a strong relationship between being on-call and short tenures at companies.

My first foray into being on pager duty was the email system at Groupon. When I got there, this was using a third-party email sender and a really ill-designed process for getting 100% customized email through to our users based on relevance. A lot of that process had come from consulting companies (note to self: never have consultants work on systems that ultimately make you responsible on pager duty). Anyway, pretty much every night I got a call from Indianapolis that something broke. Every night. I had to move out of my room so as to not wake my wife. I still have PTSD when I get a call from a 317 area code.

My goal at that job became to build a system that was so reliable that no one would ever get paged, ever. I can still picture the SPOFs in that system and how I wanted to get rid of them (I left for another opportunity before getting the chance). I really hope that today, the people working on that system never get paged and it "just works".

Anyway, therein began a process of my trying to destroy pager duty forever. Once a company gets to a certain size, there's no reason that anyone should even have to carry a pager. The system should be so redundant that failovers are completely automated until Pingdom fails. Then someone gets called.

Yes, there are times when a company is small that you can't manage redundancy like this. But once you hit the threshold, spend the money for this. There's no reason not to. Most of the time what we're talking about here is a website. Hypertext over port 80 for god sakes. We're not talking about the primary heat exchanger on a nuclear reactor in a tsunami zone. Spending money on redundancy will not break the bank. Spend as much as you have to on technology to make failovers seamless to the point where someone can come in in the morning and see a list of what failed and needs fixing.
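To make the idea concrete, here is a rough sketch of "fail over automatically, keep a list for the morning, and only call a human when everything is down". The `replicas` list and the `page` callback are hypothetical stand-ins of my own, not any real monitoring or paging API.

```python
def call_with_failover(replicas, request, page):
    """Try each replica in order; page a human only if every one fails.

    `replicas` is a list of (name, callable) pairs and `page` is a callback
    that receives the failure list. Both are assumptions for illustration.
    Returns (response, failures) so the failures can be reviewed later.
    """
    failures = []
    for name, handler in replicas:
        try:
            return handler(request), failures
        except Exception as exc:  # any error means "move to the next replica"
            failures.append((name, str(exc)))
    page(failures)  # nothing left to fail over to
    return None, failures


def broken(_request):
    raise ConnectionError("primary down")


def healthy(request):
    return f"ok: {request}"


pages = []
response, morning_report = call_with_failover(
    [("primary", broken), ("secondary", healthy)], "GET /", pages.append
)
```

Here the request still succeeds, `morning_report` holds the primary's failure for someone to look at during working hours, and `pages` stays empty because a replica answered.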

Ultimately, if you don't spend the time, effort and money to make your systems redundant, all you're going to do is burn your engineers out until they leave. You ask them to do "devops" or "noops", then don't spend the money or time to make it so they won't need a pager.

Though, another telling type of engineer is the one who designed and built the system that's now paging him all the time and causing him to leave. Therein should be a red flag for hiring, am I right?

Friday, May 23, 2014

"Premature optimization" doesn't mean what you think it means

Junior developers often seem to want to discuss "premature optimization" with me when I bring up things such as scalability and performance. For those not familiar, here's the entire context of Knuth's quote.

He was talking about gotos! People were using goto statements to improve efficiency and he felt that it was possible to get nearly the same efficiency without using them too early when coding.

This is entirely different than what Hacker News, Reddit et al. have mangled this into.

What people believe he meant is that efficiency is not an important concern early in the lifespan of a program. What he actually meant is that major tradeoffs for readability using gotos for efficiency often end up not being the win you thought they'd be. Specifically. Goto statements.

There are basics of building successful scalable systems that people miss by taking Knuth's quote at face value.

  • He's saying that anti-patterns of development are not worth the tradeoff for efficiency. He's not saying that time spent considering efficiency has no value.
  • Performance as a requirement. It is one. This is the most common thing I've seen people overlook. How is it that systems backing major websites are designed and built without any discussion of how many requests per second that system will serve right away, in a year, or in 3 years? How is it that a latency target is not considered? That's simply gross negligence, and Knuth was not advocating it.
  • A system that doesn't scale early won't scale later. This is probably the most overlooked thing I've seen. If you design a system that is not fast when you are throwing one one-hundredth of the traffic at it, how do you expect it to behave when you scale it up? A system should be incredibly fast with no traffic, and if it isn't, you won't scale. I've seen this, too, defended with "premature optimization".
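Writing performance down as a requirement can be as small as this. The sketch below is only an illustration: the helper names, the nearest-rank percentile, and the 50% headroom default are my own assumptions, and any numbers fed into it are made up rather than measurements.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile of latency samples in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]


def meets_requirement(samples_ms, target_ms, pct=99):
    """True when the chosen percentile is within the latency target."""
    return percentile(samples_ms, pct) <= target_ms


def servers_needed(peak_rps, rps_per_server, headroom=0.5):
    """Servers required to serve peak_rps while keeping spare headroom."""
    usable = rps_per_server * (1 - headroom)
    return math.ceil(peak_rps / usable)
```

The point is that "how many requests per second" and "what latency target" become numbers you can check on day one, then re-run with the traffic you expect in a year or in 3 years.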

Most of the time I think the "premature optimization" quote is used to defend badly designed systems that happen to follow a pattern someone would like to use instead of taking performance as a requirement into consideration. Maybe they want to use node.js, and the initial performance shows it's terrible. This is more often what I see defended by "premature optimization": not following through on the basic requirements of a scalable system in order to preserve a developer's choice of language or architecture.

Sunday, May 04, 2014

On Scaling Code and Static Typing

Seemingly legit question on proggit just now:

"Out of curiosity, is static typing really that large an advantage? Yeah, I get that run-time errors could be worse than compile-time, but isn't that something that oughtn't to get past testing?"

The answer is yes, static typing is a huge advantage.

When you start out in software development by writing hobby projects, hobby websites, or code for school, the code bases are not very large. It's easy to gather a belief that anything under the sun can be done with your favorite language because it's so easy to get started and work with by yourself or with a few people.

But I've postulated for several years now that there's some number -- call it n -- that, upon reaching that number of lines of code, modules, or whatever metric you want, your organization cannot effectively scale its code. N depends on a lot of things. Tooling available, structure of the code, experience of the devs, the language itself, the institutional knowledge that needs to be passed on through code (because of high churn), and a bunch of other factors we probably can't even name.

It's impossible to say what n really is as a hard number applicable to all organizations. Maybe it's 50,000 lines of code for a new grad or maybe it's 50,000,000 if you have a super experienced, senior team that's been working on that code for 25 years.

But what is possible to claim is that n is larger by default for some languages than others. A larger number simply means that an engineer does not require Total Code Awareness to do work, thanks to abstractions offered by the language and its tooling.

Static typing makes a huge impact towards that end. It:

  • Allows for more sophisticated IDEs, code browsers, static analysis and so on.
  • Allows abstractions to be more thoroughly vetted before code is executed.
  • Better documents the intent of the original programmer.

From the get-go, these can make a huge impact on what n can ever eventually be. Add onto that the best practices that have been established for already-scaled languages and n is even higher.
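As a small illustration of the "documents intent" and "vetted before execution" points, here is a sketch using Python's optional type hints. Python itself stays dynamically typed, so this only approximates what a statically typed language gives you, and it assumes an external checker such as mypy is run over the code. The `Order` class and `total_revenue_cents` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Order:
    order_id: int
    total_cents: int


def total_revenue_cents(orders: List[Order]) -> int:
    """The signature states the intent: Orders in, integer cents out."""
    return sum(order.total_cents for order in orders)


# A call like total_revenue_cents(["12.50"]) is the kind of mistake a type
# checker is expected to reject before the code runs. Without one, it only
# surfaces at runtime (or in a test written specifically to catch it).
revenue = total_revenue_cents([Order(1, 500), Order(2, 250)])
```

Someone who has never seen this module can tell what goes in and what comes out without reading the body, which is the kind of thing that lets n grow.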

Yet, all companies, with all languages, will face having to raise n at some point no matter what language they've chosen. This can be for non-code reasons due to churn, due to personnel talent, customer demands or whatever. (Assuming of course their company does need to scale, which would be a shame if not).

However, those that have implemented in a language with a low initial threshold for n will ultimately have to work more to increase that threshold. Each additional change requires more effort, often with no standard for doing so, because not many have gone past that threshold.

An example that I've now seen at two companies that chose to use dynamic languages is using test running and test writing to scale. Scaling the organization becomes an effort in mandating code testing and then working to raise the number of tests that can be run. The tests are essential because without Total Code Awareness, it's the only way someone can make a change and have confidence they didn't break something else. This is not true for a statically typed language. Tests are one of the ways people can have that confidence. The compiler and static analysis are other ways.

As a result, these companies have had tens of thousands of tests that take 45 minutes to an hour to run. The tests become extremely brittle because, without that brittleness, no one has confidence that things won't break. Ultimately, to raise n, the company has traded off what a compiler can help with by forcing people to write lots and lots of code and run that code every time a change is made. This will take longer than something like incremental compilation ever would.

Raising n is why we've seen larger organizations run away from mainstream dynamic languages for code that needs to scale. Python is all but dead at Google. Facebook has added their own type checking to PHP at this point. Twitter scrapped Ruby for Java and Scala. And so on.

So there's your choice. You choose a threshold when you pick a language. Going above that threshold means a lot of work in the future. Choosing a language with a high initial threshold can help a lot down the road. It's up to you to determine what n works for you. Personally, I prefer static typing for all of these reasons, because I don't like working on small projects, and I believe it increases n dramatically.

Monday, April 28, 2014

Don't be afraid of code

Two scenarios of fear to discuss today.

#1. The Legacy App.

It was designed 10 years ago. It grew organically. No one wants to touch it out of fear. The code languishes as everyone searches for a way to work around the beast. Things don't get fixed. Silver bullet syndrome takes over. People would rather leave the company than fix it.

#2. The Low Level Solution.

Things are slow. The most direct solution would be to just write a little C or C++. But that is not clever. It's not smart. C is dangerous. It's insecure. Surely the Right Way would be to use a more strongly typed language or a clever hack distributed across machines horizontally. We prefer it should be unused-language-X, which is Safe and Correct, not C, which is Unsafe and Incorrect. It would never be to write C, which is only used by hardcore kernel and game programmers. Web programmers can't use C. That's too hardcore.

These fears are bunk. These are illusions that prevent real work from getting done. It's time for managers and programmers to own up to them and get a handle on them.

* You need to fix the problem directly.

There's just no sense in indirect fixes. Indirect fixes are hacks. Or, worse, ineffective at solving the problem. If you have a legacy app that's a problem, you need to schedule the refactor. If the best fix is tackling a problem in C, then that's the best fix. Working around all of your problems is ultimately just fooling yourself.

* Be like a brain surgeon.

You know how brain surgeons know they're doing the right thing in there? They keep the patient awake and talk to them. They don't know if they're doing the right thing unless they pull on the nerve and see what it does.

Code is like that. It's an inexact science. No one can have total code awareness once it gets to a certain size. It's impossible. You need to use the tools (if your language has them) or simply be okay with breaking some eggs to make changes.

And you can't be afraid to break things. Big changes mean problems, and you have to accept them. This is where management has a large part in determining an outcome.

* Stop with the silver bullet

Beware the conversations like "OMG did you see that new open source ... THIS NEW LANGUAGE HAS... Twitter is doing it this way because..."


Your problems are your own. Twitter is Twitter. Facebook is Facebook. Your company is different unless you're making a direct clone of those.

And adopting any solution -- internal or external -- incurs overhead. One must recognize what the tradeoff is, especially if that solution is open source and unlikely to be popular amongst other adopters (and hence fade away, like many solutions have done over the years).

* Please just use C and C++ when warranted

They're really not that hard. You can write very simple C++ using Boost and have it be a billion times better than some randomness you've twisted into a pretzel to avoid writing in C++. Again, every abstraction you add to avoid writing in these languages incurs its own overhead down the road (see the silver bullet point).

Do not be afraid*

* - Is written 365 times in the Bible? What's up with that.

Thursday, April 03, 2014

Why Node is the Future of the Web Tier

Everyone I know hates Javascript. Including people who do it professionally.

I hate Javascript. I long for the day where it's been completely destroyed in favor of something else, I don't even care what. Typescript and Dart both look really promising, though I question whether either will ultimately make a dent in the dominance of Javascript.

Node is a gigantic hack. A browser Javascript engine pulled into the server layer? Single threaded?

Node is slower than most alternatives. Even the most rudimentary JVM-based framework will blow it away.

And Node is the future? Yeah, it is. I told my coworkers this the other day in our internal IRC and they couldn't believe it. I thought I should explain my position a little bit more clearly in a blog post.

The reason is a people reason

The server-side web tier is quickly becoming the place where no one specializes. At our company, we have "Front End" and "Back End" positions to hire for. What does that mean?

  • Front End: Javascripty, CSSy and HTMLly stuff.
  • Back End: Servery, Databasey stuff.

As the front end becomes more sophisticated and contains more logic, the Back End folks are no longer interested in writing a simple DAL/CRUD web-tier for Front End people to call into. That kind of work is a solved problem, and if the real interesting application logic lives in Javascript, it's no fun. Rather, they're more interested in working on scalable internal services and systems that the lightweight web tier can call back into and work on.

This was our problem when I was back at Groupon. No back-end systems people wanted to take on new work in our Rails stack. The API layer, and related search and database services, sure. But not the web tier. So when it came time to do a rewrite, who would own this and what tech would they use?

The answer is, the front end people need to own this web tier. It cuts down on iteration time and makes for clear ownership. Then back-end people will focus their efforts on scalable systems in Java that the web tier calls out to.

Small companies have been cutting this loop for a long time with Node, but now you see major companies making this transition. Wal-mart. Groupon. Front end tooling relies on Node: Twitter relies on Node for Bower. Microsoft is supporting Node for Windows and has several projects that use it. Everyone everywhere uses it for unit testing their JS.

And it's trivial to get started -- something that seems to lead to adoption in the modern age. NPM is really easy to use and getting a basic Node site set up is easy. There's a lot of hosting as well.

Anyway, I hope this shows why Node and Javascript will continue to eat the future, even though everyone hates it and it's a gigantic hack. Don't ever forget the wisdom of Stroustrup: "There are only two kinds of languages: the ones people complain about and the ones nobody uses."