Dynamic Typing > Static Typing?

2016-01-19

I stumbled upon this amazing video called "The Unreasonable Effectiveness of Dynamic Typing for Practical Programs".

It's about 50 minutes long, and unfortunately Vimeo has no fast-playback option like YouTube does, so I'd suggest you download it, view it in VLC, and press Ctrl+ or Cmd+ to speed it up.

In any case, he makes a very compelling argument that static typing isn't really all that. His argument is LOOK AT THE DATA!!! Everyone has opinions. People believe static typing catches bugs. People believe static typing helps document code. People believe static typing makes IDEs work better and therefore saves time, etc. BUT ... those are all just beliefs not backed up by any data. He points out that, so far, all the studies suggest they're not true. He says that 5 years ago he was a static typing believer, but after looking up the research and doing some of his own he's convinced static typing is a net negative.

You should really watch the talk, but I'll try to sum it up here.

I hope he doesn't mind that I repost some of his slides here. Hopefully they'll encourage you to watch the talk.

This first slide is from a research paper where the researcher wrote his own language, made both a statically typed and a dynamically typed version of it, then got a bunch of people to solve programming problems in it. The result was that the people using the dynamic version of the language got stuff done much quicker.

What was most interesting was that he tracked how much time was spent debugging type errors, in other words errors that the statically typed language would have caught. What he found was that it took less time to find those errors than it did to write the type-safe code in the first place.

The guy giving the talk, Robert Smallshire, did his own research where he scanned GitHub (1.7 million repos, 3.6 million issues) to get some data. What he found was that there were very few type-error-based issues for dynamic languages.

So, for example, take Python. Out of 670,000 issues only 3 percent were type errors (errors a statically typed language would have caught).

For all the dynamic languages he checked, only 2% of the errors were type errors.

His point there is that all the boilerplate you write to make a statically typed language happy is only catching 2% of your bugs. You must write tests to catch the other 98% of bugs regardless of whether you use a statically typed language. And, given the studies, writing that statically typed boilerplate takes way more time than it would have saved you.
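To make the distinction concrete, here's a tiny sketch in Python. This is my own illustration, not something from the talk, and the function names are made up. The first bug is a type error, the kind a static checker would flag. The second is a plain logic bug that is perfectly well typed, so only a test catches it.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def buggy_median(values):
    """Logic bug: forgets to sort first. The types are fine, the answer is wrong."""
    return values[len(values) // 2]


def is_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


# Type error: strings where numbers were expected.
# A static type checker would have complained about this call.
type_bug_found = is_type_error(average, ["1", "2", "3"])

# Logic error: the call type-checks fine, so only comparing against
# the expected value (the median of 3, 1, 2 is 2) reveals the bug.
logic_bug_found = buggy_median([3, 1, 2]) != 2
```

Both flags end up true, but only the first bug is one the type system could have found for you. The second one needs a test either way.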

Another study compared reliability across languages and found no significant differences. In other words, neither static nor dynamic languages did better at reliability.

But dynamic languages took significantly less effort (less time) to create programs in.

Part of that was reflected in size of code. Dynamic languages need less code.

One side point he brought up was that tons of projects written in a statically typed language end up embedding a second, dynamic language inside. He basically said that, given all of the stuff above, you should just have used the dynamic language in the first place. Then you'd get all of the above benefits AND you'd be able to use the same language as your "embedded" scripting language.

I think all of that was basically what I took away from the talk.

Someone asked about IDEs and auto completion. That's when he mentioned he was originally a static language fan and loved C# and ReSharper and IntelliSense and all that stuff, but the evidence doesn't support the idea that you're actually being helped as much as you think you are. If auto completion were all that, you'd expect development in static languages to be faster than in dynamic languages, but that doesn't seem to be the case. He gave one off-the-cuff idea: maybe the auto completion is mostly helping you write all the unneeded boilerplate a static language demands.

He points out, for example, that when he's in Python he misses the auto completion and yet he's still more productive in Python than in C#. He understands and agrees that it feels intuitive that auto completion would be a net positive, and before he started researching he fully believed that was the case, but basically the evidence doesn't support the belief that the IDE is making you go faster.

Another point he made is that writing static types is often gross and unmaintainable, whereas writing unit tests is not. He gave an example of some C++ he once wrote that he was so proud of. It was code to make it possible to pass an arbitrary number of arguments of arbitrary type to a factory function, something that's one line in most dynamic languages.
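For reference, here's roughly what that one-liner looks like in Python. This is my own sketch, not his code; `make` and `Point` are names I made up for illustration.

```python
def make(cls, *args, **kwargs):
    """Hypothetical generic factory: forward any arguments straight to cls."""
    return cls(*args, **kwargs)


class Point:
    """Made-up example class with a mix of positional and keyword arguments."""

    def __init__(self, x, y, label="origin"):
        self.x = x
        self.y = y
        self.label = label


# The same factory works for any class and any mix of argument types.
p = make(Point, 1, 2.5, label="a")
d = make(dict, a=1, b="two")
c = make(complex, 1, 2)
```

The body of `make` really is a single line, and nothing about it needs to know the argument count or types ahead of time.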

Static types are also anti-modular. You have some library that exports say a Person (name, age ..). Any code that uses that data needs to see the definition for Person. They're now tightly coupled. I'm probably not explaining this point well. Watch the video around 48:20.
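Here's how I understand that point, sketched in Python. This is my interpretation, not code from the talk, and `describe` and the sample data are made up. The function below never refers to Person at all. It only needs something with a `name` and an `age`, so it isn't coupled to the library's definition.

```python
from collections import namedtuple
from types import SimpleNamespace


def describe(person):
    """Works with anything that has .name and .age. No Person definition needed."""
    return f"{person.name} is {person.age}"


# Stands in for the Person type some library might export (assumed shape).
Person = namedtuple("Person", ["name", "age"])

# A completely unrelated object, e.g. built from parsed JSON, with extra fields.
record = SimpleNamespace(name="Bob", age=41, email="bob@example.com")

from_library = describe(Person("Alice", 30))
from_elsewhere = describe(record)
```

In a statically typed language `describe` would typically have to name Person (or some shared interface) in its signature, which is the coupling he's talking about.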

In the end someone asked "what about perf?". He basically said yes, static languages are generally faster than dynamic languages, although of course lots of work has been done to narrow that gap, V8 being one example. What he really wanted to say wasn't that you should always choose a dynamic language. Rather, you should make the choice with real data and weigh the true pros and cons instead of going on gut beliefs which might not actually be true.


Update:

Someone on HN posted this link to other research which has mixed results?

The HN comments are here.

It was disappointing to me that 80% or more of the comments responded only to the title and clearly didn't actually read the summary. It appeared as if almost no one actually watched the video. Most people just clung to their beliefs clearly ignoring any evidence to the contrary.
