Static code analysis tools, the future of code reviews, and our investment in DeepSource

By Alessio Fanelli (@FanaHOVA) and Nnamdi Okike (@nokike)

Code review

A key investment theme at 645 Ventures is the "Engineering Value Chain Revolution", which argues that every company is turning into a software company, and that the tooling used by developers will change drastically over the next few years. One of the core parts of software development that is in need of change is code reviews. As the size of engineering teams grows within organizations, there’s an increased need for tools that allow individual contributors to move fast, while at the same time maintaining code quality and maintainability.

In the 2019 Stack Overflow Developer Survey Report, ~77% of developers reported that they do code reviews, with 56% of them spending 4 or more hours a week on reviews, and a whopping 14% spending more than 10 hours a week. If you assume that every hour of code review leads to another 30 minutes spent making the resulting changes, you can quickly see how this process takes up a prohibitive amount of engineering time.
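To make that back-of-the-envelope math concrete, here is a minimal sketch. The 30-minute follow-up figure is the assumption stated above; the 40-hour work week and the function names are our own illustrative assumptions, not survey data.

```python
# Assumption from the text: every review hour leads to another 0.5 hours of changes.
FOLLOW_UP_HOURS_PER_REVIEW_HOUR = 0.5
# Assumption for illustration only: a 40-hour work week.
WORK_WEEK_HOURS = 40


def weekly_review_cost(review_hours: float) -> float:
    """Hours per week spent reviewing plus making the follow-up changes."""
    return review_hours * (1 + FOLLOW_UP_HOURS_PER_REVIEW_HOUR)


def share_of_week(review_hours: float) -> float:
    """Fraction of the assumed work week consumed by the review process."""
    return weekly_review_cost(review_hours) / WORK_WEEK_HOURS


# The two thresholds reported in the survey: 4 and 10 hours of review per week.
for hours in (4, 10):
    total = weekly_review_cost(hours)
    print(f"{hours}h of review -> {total:.1f}h total ({share_of_week(hours):.1%} of the week)")
```

Under these assumptions, a developer at the 4-hour mark spends 6 hours a week on the review loop, and one at the 10-hour mark spends 15 hours, well over a third of the week.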

The need to improve this process has spawned multiple companies in the static code analysis category, such as Codacy, Code Climate, and SonarQube. These products pioneered the market by allowing developers to implement continuous quality checks into their development flow. The tools help you track how many outstanding issues are in the code, how many new ones you might be adding, as well as the overall health of the codebase.

Alessio, our Director of Engineering, has used several of the existing static code analysis tools. While they were all somewhat useful, he was never fully satisfied by the experience:

  • High noise-to-signal ratio: code reviews are only helpful if you’re able to learn something from them and take action on the code to improve it. These tools work the opposite way; they optimize for coverage, not for usefulness. They flag obscure anti-patterns, a lot of style suggestions, etc. The result is a lot of noise that makes it hard for developers to prioritize, which in turn leads to loss of trust in the tool.

  • Lack of prioritization: the way issues are presented in existing tools doesn’t help developers prioritize what to do. If most of the issues you first run into aren’t worth fixing, you probably won’t dig deeper. Given how long some of these tools have been around, you’d expect issues to be prioritized by, for example, the percentage of developers who actually fix them versus ignore them.

  • No virtuous cycles: because of the previous issues, interacting with these tools has never been fun. There’s nothing that makes me really want to return to the tool; it’s not saving me time, nor is it materially improving my code quality. Most of them are thin wrappers around RuboCop, ESLint, and the like.
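The kind of prioritization we wished for in the second point could look something like the sketch below. It is only an illustration of ranking by fix rate: the `IssueStats` type, the rule names, and the counts are all hypothetical, and this is not how any of the tools mentioned here actually work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssueStats:
    """Hypothetical per-rule history: how often developers fixed vs ignored it."""
    rule: str
    fixed: int
    ignored: int

    @property
    def fix_rate(self) -> float:
        seen = self.fixed + self.ignored
        # A rule nobody has seen yet gets the lowest priority.
        return self.fixed / seen if seen else 0.0


def prioritize(stats: list[IssueStats]) -> list[IssueStats]:
    """Order rules so the ones developers actually fix come first."""
    return sorted(stats, key=lambda s: s.fix_rate, reverse=True)


# Made-up numbers, for illustration only.
history = [
    IssueStats("markdown-indent", fixed=2, ignored=98),
    IssueStats("sql-injection", fixed=45, ignored=5),
    IssueStats("unused-variable", fixed=60, ignored=40),
]

ranked = [s.rule for s in prioritize(history)]
print(ranked)
```

With this ordering, the rule that developers almost always ignore sinks to the bottom instead of filling the first few pages of results.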

Let’s take a look at a real-life example, the analysis of Blazer, an open source BI tool built by Andrew Kane (one of my favorite open source devs!) at Instacart. We’ll use a fork of it that our team has contributed to, so that it won’t change in the future:

  • Codacy: https://app.codacy.com/manual/alessio_2/blazer/issues/index
    • 788 total issues, 585 of which are style issues. It created 96 separate issues for incorrect indent in a Markdown file, which are all false positives. The first few pages are all repeats of that same issue.
    • 79 security issues are flagged, most of which are “Function Call Object Injection Sink”, and most of those are triggered by strings that are hardcoded in the code, not by user inputs, making them non-issues.
  • Code Climate: https://codeclimate.com/github/FanaHOVA/blazer/issues
    • 841 total issues, 544 of which are style issues and 297 of which are about duplicate code.
    • The top issues suggest refactoring jQuery and Vue.js down to 250 lines of code because their files are too long. That's a code golf task I wouldn't want to take on.

The developer experience on DeepSource is superior to the others, and really allows engineers to focus on what matters rather than being inundated by false positives. This is one of our core beliefs around code reviews: humans' time is best spent working on high level architectural reviews rather than the nitty gritty of a method implementation.

We first learned about DeepSource through Uber's Ludwig blog post, and Alessio was later introduced to Sanket by Ivan Shcheklein, founder of Iterative.ai and DVC, who was also a user of the product. Sanket and Jai both have expertise in shipping quality code, having contributed to some of the largest Go and Python projects out there, as well as helping review code for Mozilla’s projects marketplace. Both Alessio and Nnamdi met with Sanket and Jai at the AWS Loft, talking about the shortfalls of current products, some of the learnings from Google’s publishing on their internal tools, and what could be done to really unlock the potential of this market, like their Autofix feature. We could tell early on that they had thought this market through and were prepared minds on it.

There are a few things that got us excited about the future of DeepSource and code reviews:

  • Make the tool work for the engineer: existing solutions inundate you with possible issues, but don’t really do anything to help you fix them. DeepSource is looking at the problem the other way around: since a lot of the issues to be fixed are repetitive, the software should be able to fix them for you. This is what they call “Autofix”, which is now released in developer preview. With one click, DeepSource will create a pull request that fixes the issue across all files in your codebase.

[Image: DeepSource Autofix screen]

  • <5% false positives: Google’s internal target for their tool is <10% false positives. DeepSource’s target is to keep that at 5% or below. That can be achieved by putting sane defaults in place (as an example, turn off style issues by default) and giving engineers the ability to selectively mute issues either by file or across the codebase. By lowering false positives, you build trust with developers, who will in turn be more willing to interact with the platform. Sanket wrote a very good blog post about how DeepSource makes this possible.

  • Every notification also helps you learn more: each issue in DeepSource comes with a description of why it’s bad to do things this way, and how you should do them instead. Take this issue as an example; other than telling me that I’m using a Date object the wrong way, it explains why it’s wrong (i.e. Date objects don’t carry information about timezones) and it shows how I should fix it (replace Date.current with Time.zone.today). Thoughtful features like this can help junior developers get up to speed quickly while at the same time reducing the amount of time senior developers have to spend on pointing out these smaller issues, giving them time to focus on more impactful ones like code architecture.
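The timezone pitfall in that last example is not specific to Ruby. As a rough Python analogy (not DeepSource's own example; `APP_ZONE` and `today_in_app_zone` are names we made up for illustration), a date computed without regard to the application's time zone can be off by a day from the date the application actually cares about:

```python
from datetime import date, datetime, timedelta, timezone

# Hypothetical application time zone, fixed at UTC+9 for illustration.
APP_ZONE = timezone(timedelta(hours=9))


def today_in_app_zone(now_utc: datetime) -> date:
    """Return the calendar date in the application's zone for a given instant."""
    return now_utc.astimezone(APP_ZONE).date()


# A fixed instant late in the evening, UTC, so the example is reproducible.
instant = datetime(2020, 6, 16, 23, 30, tzinfo=timezone.utc)

utc_date = instant.date()              # date as seen by a server running in UTC
app_date = today_in_app_zone(instant)  # date as seen in the application's zone

print(utc_date, app_date)
```

The two dates differ by a day for the same instant, which is the kind of bug an explanation attached to the issue helps a junior developer understand rather than just patch.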

We’ve been using the tool internally for a few months now to monitor our Voyager platform, and have had a great experience with it. We decided to lead DeepSource’s $2.6M seed round out of Y Combinator, with participation from YC, as well as other great investors and operators like FundersClub, Pioneer Fund, Liquid 2 Ventures, Christopher Golda (Rogue Capital), Timothy Chen (Essence fund), Ivan Kirigin (Tango VC), Ed Roman (Hack VC), Jakub Jurovych (Founder, Deepnote), Mike Viscuso (Co-founder, Carbon Black), Venture Souq, Tokyo Black, Bradley Buda (Founder, Census), John Kinsella (Ex VP Engineering, Qualys), and Soso Sazesh (Founder, Growth Pilots).

As of June 16th 2020, developers have fixed over 3.7 million issues reported by DeepSource. Since the launch of Autofix in February 2020, almost 10% of these issues were automatically fixed. We’re looking forward to working with Sanket, Jai, and the rest of the DeepSource team to revolutionize the way code reviews are done, allowing developers to ship better code faster.