Dynamic Typing > Static Typing?

I stumbled upon this amazing video called “The Unreasonable Effectiveness of Dynamic Typing for Practical Programs”.

It’s about 50 minutes long and, unfortunately, Vimeo has no fast playback option like YouTube, so I’d suggest you download it, view it in VLC, and press Ctrl+ or Cmd+ to speed it up.

In any case he makes a very compelling argument that static typing isn’t really all that. His argument is LOOK AT THE DATA!!! Everyone has opinions. People believe static typing catches bugs. People believe static typing helps document code. People believe static typing makes IDEs work better and therefore save time, etc. BUT … those are all just beliefs not backed up by any data. He points out that so far all the studies suggest that’s not true. He claims 5 years ago he was a static type believer but after looking up the research and doing some of his own he’s convinced static typing is a net negative.

You should really watch the talk, but I’ll try to sum it up here. I hope he doesn’t mind that I repost some of his slides. Hopefully they’ll encourage you to watch the talk.

This first slide is from a research paper where the researcher wrote his own language and made both a statically typed and a dynamically typed version, then got a bunch of people to solve programming problems in it. The results were that the people using the dynamic version of the language got stuff done much quicker.

What was most interesting was that he tracked how much time was spent debugging type errors. In other words errors that the statically typed language would have caught. What he found was it took less time to find those errors than it did to write the type safe code in the first place.

The guy giving the talk, Robert Smallshire, did his own research where he scanned GitHub, 1.7 million repos and 3.6 million issues, to get some data. What he found was that there were very few type-error-based issues for dynamic languages.

So, for example, take Python. Out of 670,000 issues, only 3 percent were type errors (errors a statically typed language would have caught).

For all the dynamic languages he checked, only 2% of the errors were type errors.
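
To make “type error” concrete, here’s a minimal Python sketch of the kind of bug I assume falls into that bucket. The function and values are made up for illustration, and I don’t know the exact classification rules used in the talk.

```python
def add_tax(price):
    # Intended for numbers: price plus 20% tax.
    return price + price * 0.2


try:
    # A string sneaks in, e.g. straight from a form field.
    add_tax("10")
    failed = False
except TypeError:
    # str * float is not defined, so Python raises TypeError at runtime.
    failed = True

print(failed)
```

A statically typed language would reject that call before the program ever runs. In Python it only shows up when that line actually executes.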

His point there is that all that static boilerplate you write to make a statically typed language happy is only catching 2% of your bugs. You must write tests to catch the other 98% of bugs regardless of whether you use a statically typed language. And, given the studies, the time you spend writing statically typed boilerplate code is way more than the time you would have saved.

Some other study compared reliability across languages and found no significant differences. In other words neither static nor dynamic languages did better at reliability.

But dynamic languages took significantly less effort to create (less time)

Part of that was reflected in size of code. Dynamic languages need less code.

One kind of side point he brought up was tons of projects that are written in a statically typed language end up adding a second dynamic language inside. He basically said that, given all of the above, you should just have used the dynamic language in the first place. You’d get all of those benefits AND you’d be able to use the same language as your “embedded” scripting language.

I think all of that was basically what I took away from the talk.

Someone asked what about IDEs and auto completion. That’s when he mentioned he was originally a static language fan and loved C# and ReSharper and IntelliSense and all that stuff, but the evidence doesn’t support that you’re actually being helped as much as you think you are. If the auto completion was all that you’d expect development in static languages be faster than dynamic languages but that doesn’t seem to be the case. He gave one off-the-cuff idea that maybe in fact the auto completion is mostly helping you write all the unneeded boilerplate a static language demands.

He points out, for example, that when he’s in Python he misses the auto completion and yet he’s still more productive in Python than in C#. He understands and agrees that it feels intuitive that auto completion would be a net positive, and before he started researching he fully believed that was the case, but basically the evidence doesn’t support the belief that the IDE is making you go faster.

Another point he made is that writing static types is often gross and unmaintainable whereas writing unit tests is not. He gave this example of some C++ he wrote once that he was so proud of. It was code to make it possible to pass an arbitrary number of arguments of arbitrary type to a factory function. Something that’s one line in most dynamic languages.
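
For reference, here’s roughly what I assume that one-liner looks like in Python. This is my own sketch, not his code; `make`, `Circle` and `Label` are made-up names.

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius


class Label:
    def __init__(self, text, *, bold=False):
        self.text = text
        self.bold = bold


def make(cls, *args, **kwargs):
    # Forward any number of positional and keyword arguments, of any type.
    return cls(*args, **kwargs)


circle = make(Circle, 2.5)
label = make(Label, "hello", bold=True)
print(circle.radius, label.text, label.bold)
```

The body of `make` is the one line. Nothing about the argument count or types has to be declared anywhere.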

Static types are also anti-modular. You have some library that exports say a Person (name, age ..). Any code that uses that data needs to see the definition for Person. They’re now tightly coupled. I’m probably not explaining this point well. Watch the video around 48:20.
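
Here’s my reading of that point as a small Python sketch. It’s not taken from the talk, and `describe` and the `Person` class are hypothetical. The function only depends on the attributes it reads, so it never needs to see the library’s definition of Person.

```python
from types import SimpleNamespace


def describe(person):
    # Depends only on .name and .age, not on any particular class.
    return f"{person.name} ({person.age})"


class Person:
    # Stand-in for the class a library might export.
    def __init__(self, name, age):
        self.name = name
        self.age = age


print(describe(Person("Ada", 36)))
print(describe(SimpleNamespace(name="Grace", age=45)))
```

In a nominally typed language, `describe` would typically have to name the library’s Person type in its signature, which is the coupling being described.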

In the end someone asked “what about perf?”. He basically said yes, static languages are generally faster than dynamic languages, although of course lots of work has been done to narrow that gap, V8 being one example. What he really wanted to say wasn’t that you should always choose a dynamic language. Rather, you should make the choice with real data and weigh the true pros and cons against your gut beliefs, which might not actually be true.


Update:

Someone on HN posted this link to other research which has mixed results?

The HN comments are here.

It was disappointing to me that 80% or more of the comments responded only to the title and clearly didn’t actually read the summary. It appeared as if almost no one actually watched the video. Most people just clung to their beliefs clearly ignoring any evidence to the contrary.

  • njy

    “The results were that the people using the dynamic version of the language got stuff done much QUICKER” (emphasis mine)

    And? I mean, you write code faster, good, wow.

    What about reading code, refactoring code, maintaining code, big codebases to go through and so on?

    Writing a 500 loc code isn’t hard. You know what’s hard? Writing, using, collaborating to, maintaining and evolving a 100k loc code. (semi-cit)

    That is one of the reasons why, in my opinion, something like Typescript is emerging and getting traction: dynamic when you want, static (or kind of) when you want.

  • kirindave

    I cannot succinctly express how deeply flawed the methodology and comparisons of this article are, other than to claim it’s there. I’m going to have to unpack this elsewhere.

    But. Prechelt’s work (https://page.mi.fu-berlin.de/prechelt/Biblio/jccpprt_computer2000.pdf) compares the language landscape 20 years ago, and even then drew fire for not actually using the modern variants of static typing as its basis, instead focusing on the antiquated typing of Java and C++ (of 1999) and deliberately avoided allowing C++ templating. It was a stacked deck and out of date.
    This sentiment you’re expressing has been on repeat for 10 years and almost exclusively is told by people who only have exposure to very bad type systems that are left as slavish copies of legacy type systems, “for ease of learning.” Modern type systems like OCaml and Haskell are not so incomprehensible as dynamic typing proponents would want you to believe and often times have fewer type checks (in the form of things like nil guards, chained Boolean operators using nil punning, etc) than dynamic code.
    What’s more, WRITING code speed is the most vacuous and saccharine of all metrics. Dashing out code promptly feels good, but is almost never the actual problem when it comes to launching products to production.

  • Craig Perry

    The most effective method of maintaining and growing a large complex code base that I know of is to reduce its size. Everything you mentioned is easier in a 50kloc code base than in a 100kloc. Even in mature, optimally refactored implementations (if such a thing exists!), your tests will still run in half the time and so on.

    Dynamic languages are one very effective way to reduce the size of a code base. Not the only way, but even taking approaches like creating a DSL can also be used in a dynamic language, magnifying the saving.

    There is a limit though. I don’t know where the limit sits exactly, but I don’t propose we all begin writing in a notoriously terse dynamic language like J!

  • Jesus Bejarano

    That’s theory, there is not really evidence that static typing output more quality, maintainability and so on of a product. Also things like typescript are illusion of the “robustness” of static type systems, they still let a lot of responsibility to the human so there is not guarantee for the same “benefits”.

  • njy

    > That’s theory
    Nope, that’s practice, in a lot of peoples’ experience.

    But let me be clear: *theoretically* a person can create absolutely perfect, performant, bug free and readable code, in any language and in any editor.

    Heck, *theoretically* you don’t even *need* syntax highlighting, code completion and all that stuff: it is not *strictly needed* .

    Still, I think I can assume those sure are a welcome addition that can *ease* the process, right?

    Now, more *realistically* : if a team of 20 different people work on a big project that also involves refactoring, calling other people’s code and whatnot, I think we can safely say that knowing – for example – the type of parameters when you are writing a piece of code is quite handy, right? I know, there are the docs, but it’s not I don’t know what a specific method does, it’s just I don’t remember and I don’t want to remember the position of each friggin’ param.

    > Also things like typescript are illusion of the “robustness” of
    > static type systems, they still let a lot of responsibility to the
    > human so there is not guarantee for the same “benefits”.
    Absolutely! Just like seatbelts: we should not use them because they are just an illusion of safety, because you really have no guarantee, right? I mean, there’s still a lot of responsibility in the hands of the driver not to run off a cliff 🙂

  • We all feel this. But we only have feelings. I feel more productive when I have an IDE that auto completes. Not because I use the auto completion but rather because it helps me find docs. If I type

       List strings = ???
       strings.
    

    The IDE will help me find all the things I can do with strings.

    The question is rather is my perception that this is helpful actually a net positive or is it just a feeling.

    The same is true with maintainable code. Is strict typing actually preventing enough errors to make up for the verbosity and other issues? Is the implicit documentation actually making people more productive? I don’t know. This talk is claiming we all feel it’s true but there’s no proof, only feeling. You might disagree with his findings but I think the important takeaway is we need more than feelings. We need measurements.

    There are giant projects in dynamic languages. Facebook comes to mind as a giant project with thousands of programmers all written in dynamic languages.

  • This is true but if you watch the video the guy does have experience with modern strictly typed languages. He uses F# as an example of an amazing modern strictly typed language and then goes on to show how it fails to actually be typesafe. I think his point was that you still need tests.

  • rak

    As a Python programmer, I’m curious if the omission of KeyError as a potential type error is due to some analysis of how explicit dicts are used in actual public Python code, and to what degree practices vary in other here’s-your-hash-map-and-your-vector-now-go-forth-into-the-world languages like JavaScript.
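
A small Python sketch of the case that question is about, using a made-up record and key. Whether the talk’s data would count this as a type error is exactly what is being asked here, so treat the framing as an assumption.

```python
user = {"name": "Ada", "email": "ada@example.com"}


def contact(record):
    # Misspelled key: nothing complains until this line runs.
    return record["e-mail"]


try:
    contact(user)
    missing = None
except KeyError as exc:
    # The exception carries the key that was not found.
    missing = exc.args[0]

print(missing)
```

With a declared record type, a static checker would typically flag the analogous misspelled field name before the program runs.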

  • njy

    I agree that the thing about perception vs numbers is interesting but, while trying to avoid guts and going with a more scientific approach is right, doing so measuring only “the writing speed of a small piece of code” and not taking into account everything else like maintaining it, multiple people writing the same codebase, using other people’s code, refactoring, etc seems a little odd with this hypothetically more scientific approach.

    Btw, 10 years ago the whole thing about higher verbosity for statically typed languages could have been right but today, (using C# just as an example) with type inference, the dynamic thing, anonymous types, lambdas, auto properties and so on, does it really have all this friction?

    > There are giant projects in dynamic languages.
    > Facebook comes to mind as a giant project with
    > thousands of programmers all written in dynamic
    > languages.
    I agree, but it is also true that, for example, Geoff Crammond’s Grand Prix 5 was huge, and almost entirely written in assembler. So there’s that.

  • It’s trivially easy to set up code completion or ‘go to definition’ in Python in vim. I presume it’s as easy for other editors and other dynamic languages. The fact that the completion is flaky in the tiny 0.1% of cases where someone is doing something dynamic with the methods on a class makes bugger all difference.

    I spent fifteen years being a hardcore C, C++ and C# zealot. I was wrong. I had formed opinions through ignorance.

    There is no evidence to suggest large projects are any harder in dynamic languages. Personally, I now suspect they would be easier.

  • What you say is absolutely true, but most people are still using Java or C variants, so it makes sense that the discussion, in practice, should often focus on “Would switching to Python be better for me than my current C#?” I agree though, that if we could extend that discussion to include more esoteric languages, that would be awesome.

  • One of the main strengths of dynamic languages is that they make it easier to read, refactor and maintain code.

  • Just for a concrete example, the equivalent operation in Python is to invoke ‘help’ on a class or instance (and *everything* is an instance, including functions, modules, etc), and the return is the docs for that, which includes handwritten docs, but also enumerates methods with arguments, properties, inherited stuff, etc. Any IDE can invoke ‘help’, to produce tooltips or whatever.
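
A minimal sketch of doing that lookup from code rather than at the prompt. It uses the standard `inspect` module instead of calling `help` directly, and `Greeter` is a made-up class for illustration.

```python
import inspect


class Greeter:
    """Builds greetings for a fixed name."""

    def __init__(self, name):
        self.name = name

    def greet(self, punctuation="!"):
        """Return a greeting string."""
        return f"Hello, {self.name}{punctuation}"


# Public methods, the call signature, and the handwritten docstring,
# all read from the live object at runtime.
methods = [n for n in dir(Greeter) if not n.startswith("_")]
signature = str(inspect.signature(Greeter.greet))
doc = inspect.getdoc(Greeter.greet)

print(methods, signature, doc)
```

Typing `help(Greeter)` at an interactive prompt shows the same kind of information as one formatted page.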

  • njy

    “One of the main strengths of dynamic languages is that they make it easier to read, refactor and maintain code”

    https://s-media-cache-ak0.pinimg.com/originals/4d/32/f1/4d32f142871c29466f303c2c80f24ed4.gif

    Ok, I’m out 🙂

  • kirindave

    Personally, I think C# is a lot further along than Python and I don’t see the value add there. Nearly all the thinking you’d normally do in C# is the same thinking you’d do in Python except for explicit type tags on values.
    And unlike Python, C# has excellent facilities for handling errors in asynchronous code (python is terrible at this) along with real higher order function support (something Python has stubbornly refused to truly add on the grounds that it’s too complex, despite the heavy adoption of functors and CPS in javascript questioning this assertion thoroughly).

  • kirindave

    The “F# typing troll” is well known. There is not a single new argument presented in this video.

    True type safety is actually almost never what we want. In fact, it’s a red herring because external interactions outside your program can never truly be captured by the type system (even aggressively pure languages like Haskell can only say IO 'a, and even THESE provide trapdoors like “unsafePerformIO”).

    The question shouldn’t be, “How fast can I write code.” There is no race to write code, there is a race to build and ship a product, then the marathon to sustain it. Even if there was solid data that you can “write code” faster with a dynamic language, that wouldn’t settle the dispute because there’s a lot more static typing enables, particularly in the functional domain.

  • kirindave

    The entire thesis of “harder” or “easier” is a red herring. These qualities are themselves subjective, domain specific, and everyone here keeps pretending that it’s some sort of binary and easily discernable rubric.
    I have a 40kloc Clojure codebase and I yearn for static typing in so many places. I was your opposite and now I think Haskell, OCaml, Nim and F# are effing great. But I don’t use them in *every* domain nor is language my first or even most important decision when starting a project.

  • Stefan Hanenberg

    (sorry for the long text)

    “People believe static typing catches bugs. People believe static typing helps document code. People believe static typing makes IDEs work better and therefore save time, etc. BUT … those are all just beliefs not backed up by any data.”

    In the meantime, there is a number of studies out there that actually give evidence for these claims.

    1. “People believe static typing catches bugs”:
    Yes, the study by Hanenberg et al. has shown that for type errors, there is an extraordinarily large difference between statically typed and dynamically typed languages (http://dx.doi.org/10.1007/s10664-013-9289-1): fixing them in statically typed languages is MUCH faster (no wonder…since the type system points to the source of the error).

    2. “People believe static typing helps document code”:
    Yes, the study by Endrikat et al. has shown (at least in the given experimental setting) that the documentation effect of static type systems that make use of type annotations is even stronger than the effect of the documentation that was given (http://dx.doi.org/10.1145/2568225.2568299).

    3. “People believe static typing makes IDEs work better and therefore save time”
    After a series of experiments, we conducted an experiment with IDE support (which is slightly unfair, because we know that things such as code completion for dynamically typed languages is not that strong in most IDEs). We ran a repetition of a previous experiment with IDE support and the experiment by Petersen et al. shows that the positive effect of type systems with respect to development time in the presence of an IDE (we used Eclipse) is even stronger (http://dx.doi.org/10.1145/2597008.2597152).

    4. “… If the auto completion was all that you’d expect development in static languages be faster than dynamic languages but that doesn’t seem to be the case.”
    At least we ran one experiment that compared the type system effect with the code completion effect using VisualStudio: the experiment by Fischer and Hanenberg got the result that the positive effect of static type systems is much stronger than the code completion effect (http://dx.doi.org/10.1145/2816707.2816720).

    Well, there are some more experiments that we ran in the past, but the most important result of them is: in almost all cases we showed the positive effect of static type systems and hardly any situation where dynamic type systems had a positive effect.

    The first figure in your text is from my study in 2010 – and it should be noted that this study did NOT keep correctness as a constant factor in the experiment (which I consider now as a major flaw in the experiment, http://dx.doi.org/10.1145/1869459.1869462). And I think in general we should be very careful when interpreting results of programming experiments that did not keep correctness as a constant factor.

    I think that it is problematic if a single study is used in the argumentation – in the meantime there is a series of experiments about static vs. dynamic. And we got results. And so far, we were not able to show a benefit for dynamically typed languages. Maybe they exist, maybe not. But so far, there are no experiments available that actually show such a benefit (which makes me sceptical that the often cited benefit of dynamically typed languages does actually exist).

    The other experiment used in the text, by Prechelt, is based on self-reported time (and does not control any other factor such as IDE, libraries, etc.). I like Prechelt’s studies a lot in general, but I have the tendency to say that we should just use the study as a first, small indicator.

    I am personally quite sceptical with respect to repository studies: even if most repos hardly contain type errors, this does not mean that type errors do not play a major role. It only says that at the moment when people push into the repo such errors only play a minor role (maybe because they fixed them before, maybe because these errors did not show up at all).

  • Isaac

    Not sure about that bit about F# being a structurally typed language. It has *some* features for structural typing but it’s certainly not a basic component of the language. It sits on .NET, it’s a nominally typed language. You can’t treat two different types as one even if they share the same fields.

    I also thought that the bit about static languages = you must explicitly state types (compared to dynamic types) was a strange straw man argument because this is then contradicted soon afterwards anyway! *Some* static type languages use explicit typing. Most more modern static languages at least to some extent don’t.

  • Storm

    Dynamic typing is a damn nice tool. But I want to turn it off when I know I don’t want it. I like my “Use Strict”. Accidentally defining a dyslexic version of a variable, causes strange behaviors, when you don’t realize that UserPreviousColorRGB and UserPreviousRGBColor are both in use, and due to the proximity of most of the code it works fine, except in that one case where it is used far from where it is set to begin with. And then it is all manner of odd as why some color from 4 changes back suddenly occurs when you undo the drawing of a circle. Then you try to reproduce it.. and no it doesn’t want to reproduce, because it was some odd edge case, that is a monster to find, eventually you question the sanity of the bug reporters. Job security is achieved.
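
A rough Python analogue of that misspelled-name bug, with made-up class and attribute names. Using `__slots__` as the opt-in strictness is my own substitution here, not the “Use Strict” mechanism named above.

```python
class Preferences:
    def __init__(self):
        self.user_previous_rgb_color = (0, 0, 0)


prefs = Preferences()
# Transposed name: silently creates a second, unrelated attribute.
prefs.user_previous_color_rgb = (255, 0, 0)


class StrictPreferences:
    # Only the names listed here may be assigned on instances.
    __slots__ = ("user_previous_rgb_color",)

    def __init__(self):
        self.user_previous_rgb_color = (0, 0, 0)


strict = StrictPreferences()
try:
    strict.user_previous_color_rgb = (255, 0, 0)
    rejected = False
except AttributeError:
    # The misspelling now fails at the assignment instead of much later.
    rejected = True

print(prefs.user_previous_rgb_color, rejected)
```

The first class keeps the original value and quietly gains a stray attribute. The second one fails right at the typo, although still only at runtime.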

  • HaakonKL

    IDEs for Common Lisp, which is dynamically typed and actually really easy to be productive in, already have things like telling you what arguments a function expects.

    So the tools that you want aren’t exclusive to statically typed compile-then-run languages.

  • Lavrenty Beloletov

    > This first slide is from a research paper where the researcher wrote his own language and made both a statically typed and dynamically typed version then got a bunch of people to solve programming problems in it.

    I suppose there is a word “small” missing there. Those are synthetic tests and, of course, static typing would suck on them.

    > he scanned github, 1.7 million repos, 3.6 million issues to get some data. What he found was that there were very few type error based issues for dynamic languages.

    You don’t say! Has the guy scanned the time spent by the developers to debug all those type errors and make unit tests pass before the first version was pushed?

    > Some other study compared reliability across languages and found no significant differences.

    I compared reliability myself and found significant differences in some areas.

    > But dynamic languages took significantly less effort to create (less time)

    That’s true. While the code to be created is less than 300 LOC.

    > One kind of side point he brought up was tons of projects that are written in a statically typed language end up adding a second dynamic language inside.

    …which is completely typed implicitly by the host language type system. If he means that he implements something like “eval(“small_lang()”)” then he builds yet another IT cancer and he is just a newborn dynamic typing fanboy. See “God methods”.

    Anyway, where are the numbers?

    > Another point he made is that writing static types is often gross and unmaintainable whereas writing unit tests is not.

    Except that unit tests do not prove anything. Again, where are the numbers?

    > writing static types is often gross and unmaintainable

    Fanboyism.

    > V8

    Not really related but V8 in fact tries to statically type the program. And falls back to simple dynamic interpreting if it can not do that. Which usually means, I suppose, that the source program is incorrect.