Code Reviews - easy in theory, difficult in practice

Written by John Mikael Lindbakk - 2020-05-01

The other day some of my coworkers and myself had a discussion over what it meant to do a code review. It turned out, to my surprise, that many of us had wildly different opinions on what it meant to have a pull request approved and what can be expected from the code reviewer. Some believed that code reviews are to check for bugs, performance issues and so forth. Others indicated that it is to check that we have made the correct thing. That the code does what the customer wants it to do. Others thought it was to check for code quality. This discussion kept swirling in my head for a couple of days, and is the reason this post even exists.

My goal with this post is to share my thoughts and opinions on the various aspects of code reviews, and maybe drop a few tips & tricks. I hope to explain what I consider a code review to be, the various roles involved in the process and how to tackle common problems. Admittedly there will be gaps, while writing this post I always found more things to talk about and if I had kept going this would turn into a book - which I didn’t feel like writing.

Code reviews come in multiple forms, but the one most commonly found are code reviews initiated by pull/change requests. Throughout this post, whenever I say code review I am referring to it in the context of a pull request. There are other forms with different goals and purposes, but those are outside the scope of this post.

The post is also assuming that the code reviews are done in some kind of online tool, like Github, GitLab or Bitbucket.

Before continuing we should define a few words which I will be using throughout this post

Author: The author is the person who wrote the code that is under review.
Reviewer: The reviewer is the person who provides feedback on a code review.
Comment: A text based comment on the online tool (i.e a comment in Github).

What is a code review?

As we saw with the introduction to this very post, different people have different views on what the goal of a code review is. Some people think that it is about finding technical issues, like bugs and so forth, others view it as a tool to verify the functional requirements of the code. Point is that there’s several ways of looking at it, and I encourage you to have a think about what it means to you before continuing reading this post.

Since we accept that different views exist, and that those views are valid, means that each team should define what code reviews mean to them. When the team has decided what a code review is, certain bounds of what is acceptable to comment upon and what is not will be automatically set based on that definition. For example teams which primarily focus on spotting bugs and mistakes in their code reviews might find comments on style and naming to be unnecessary. Other teams which have their focus on clean code might welcome them. Having an explicit definition helps people to figure out whether a comment is helpful or not.

As you are reading this, you are probably aware that I also have opinions on what a code review is. An opinion I am happy to share:

A code review is…

a process whereupon acceptance will put the code into the most holy, the code-base. Therefore developers can and should criticize all aspects of the code which does not abide by the teams’ chosen coding standard.
a place to praise when one sees good work or learns something new.
a place to share knowledge.
a place of mutual respect. Everyone wants to achieve the same goal - good, clean, beautiful code. Personal attacks have no place in code reviews.
a place where it is okay to criticize code without it becoming personal.

A code review is not…

a process for finding technical problems like, but not limited to, bugs, security and performance issues.
responsible for verifying the functional requirements.
a place for large architectural discussions.

Wait, we’re not supposed to find bugs in a code review!?

This is probably what some people might view as the most controversial part of this post, which is why I feel that I have to justify it. Software development has a lot of separate processes which creates a cohesive whole. Automated testing is a good example of such processes where unit tests verifies the internal logic of the application, integration tests verify the bits where the internal logic must interface with the rest of the world and system tests verify things on an even higher level. If we only do one of these practices we leave gaps in our testing, but combined they all fill the gaps left by the others.

While it is less automated, code reviews is just another process. The question is which need it should fulfill. Which part is not already covered by another process? To me it is code quality. We might have linters and static checks which can find some of the issues, but we have no tool which does a great job to sort out code quality. We already have processes for code correctness in the forms of automated tests. We already have processes for functional correctness in the form of manual tests, acceptance tests, alpha/beta releases and so forth. Finding issues and checking that we’ve made the right thing is already better covered with other processes.

Another reason why we should not expect people to find bugs and functional issues is that it is an unfair burden to put on the reviewers. A reviewer has, most likely, not been involved with the case all that much and has not been a part of the discussions. A reviewer will always be at a disadvantage when it comes to verify the correctness of a piece of functionality or code and will be prone to accept the wrong things or even suggest wrong solutions based on wrong assumptions.

If we trust our tests we know that the code is working before even reaching the reviewer. If we trust the checks for our functionality we know that we’ve made the right thing before the code reaches production. By focusing on code quality and best practices we fill a need in our development process which is not filled.

Does this mean that one cannot comment on bugs? Absolutely not! Those kinds of comments are welcomed and it is unprofessional not to mention them. This is not a limitation of what can or cannot be said. It is a limitation of what is expected from the reviewer.

That said other people disagree with this view, which is fine. I have myself worked in teams where the whole point of code reviews is to discover issues which can affect production. To me, however, such a focus is an indication that one doesn’t trust the other processes in the pipeline. So rather than focusing on making really robust tests we offload parts of the responsibility to the code review. This can work, but it's a fragile replacement, though people have made it work.

This is a good article from a company which would, probably, disagree with me:

https://blog.newrelic.com/engineering/code-review-guidelines/

The job of an author

The author is in a difficult position. They have to put in a lot of hours to code something, then willingly have it criticized by other developers.

Here are the general rules the authors should follow:

The author should have pride in the work, but not let it get in the way of criticism.
The author welcomes feedback.
The author is not afraid to ignore non-blocking comments when the author deems them trivial, but the author is also encouraged to complete them.
The author should never assume intended malice when reading comments. Such a mindset can only lead to toxicity and friction.
The author should argue back if he/she disagrees with a reviewers view. Debates will lead to better code.
The author should never blindly take a reviewer's suggestion, especially if it is only because the author wants to complete the pull request. Instead the author must consider the proposition and apply it only when the author agrees.
The author should communicate to the reviewers cases surrounding the code. I.e if there is any customer who is waiting for the feature, if there is a deadline to meet and so forth.

Prepare the pull request

The author should always respect the reviewer's time. Which is why the author should always make sure that the work delivery is complete and considered ready for production. A pull request is not the place to ask “is this okay?”, such questions should be asked earlier in the development cycle. Before the review starts minor mistakes like unnecessary spaces, empty lines, debug statements, temporary named variables and so on should be cleaned up. A good trick to avoid minor mistakes is to make sure all your code is committed and synced with the master branch. Look at the “create pull request” button and walk away. It doesn’t need to be for a long time, just for a coffee or something. When you come back, go through all of the changes, line by line, and see if you can spot any mistakes. If it all looks good initiate the review.

A pull request should be easy to read and the author should always strive to make the reviewers job as simple as possible. This is partly achieved by writing clear, clean and understandable code, however there are things that are inherently complicated. For complicated or larger pull requests the author should leave comments explaining the implementation, with the goal to make the solution understandable to the reviewers..

Whenever there is something unexpected or weird in the implementation, the author should leave comments here as well. These things can occur when one has to work with legacy code, or maybe to work around a bug or limitation in third party libraries. Such comments must make clear why these workarounds and oddities are required. The number of these comments should be limited, as having many of them might be an indication that there is something wrong with the implementation or system as a whole.

Make sure the pull request isn’t too big

A common mistake for beginners is to initiate huge code reviews, but there are several senior developers guilty of this as well (myself included). Big pull requests are difficult to accurately review as there are just so many changes. Instead functionality should be split up into smaller and more manageable chunks and spread out over several reviews.

Before starting implementing a feature, think through how big this will be and consider how one might split it up in a logical way. Combining feature toggles and mocking we can implement pars, without sacrificing stability. If splitting up the functionality is possible, then one can set up the branches like this:

Main dev branch | - feature branch 1 -> One part of the implementation | - feature branch 2 -> The second part of the implementation | - etc…

By doing it this way we have split up the implementation into manageable parts which can be reviewed separately.

Sometimes it is difficult to split up functionality like this, as one needs the full implementation at once. Generally we want to avoid this, but in the terms of code reviews we can make this more manageable as well. In such scenarios we might do something like this:

Main dev branch | - feature branch -> will eventually hold the implementation of the entire feature |- branch 1 -> holds parts of the feature |- branch 2 -> holds parts of the feature

This approach we are adding another layer to our branches. Basically we are then developing the first part of the feature in “branch 1”, which is a branch based on “feature branch”. When the first part is done we initiate a code review to the feature branch. There might be missing code and classes for the whole feature to be complete, but this can easily be mocked so that the branch builds and the tests run. The important part is that we get “branch 1” approved and merged to “feature branch”.

When “branch 1” has been merged, we can continue developing the next part of the implementation on “branch 2”. This process continues until the whole feature has been completed and merged into the “feature branch”.

By this point the whole implementation has been approved and we know the code is sensible. Some teams might want a look at the final implementation though, if that is desired then we may initiate a code review between “feature branch” and “main dev branch”, though generally this is not necessary.

The job of a reviewer

As reviewers we must respect the author's work and time spent doing that work. A reviewer must never assume incompetence or laziness when reading code. Such assumptions only lead to hostility and personal attacks.

Writing feedback is hard

The reviewer must balance the desire for perfection and the needs of the customer, while considering the emotions of the author. This is a difficult task, as people take criticism in different ways and to different degrees.

A comment should answer the 3 W’s (what, why, and how):

What is wrong? The reviewer should make it clear what the issue is as the author needs this information to actually know what to fix.
Why is it wrong? Not all problems are obvious. For situations where the problem isn’t obvious the reviewer must make clear why it is wrong, and if possible justify why it is wrong by pointing to industry best practices and agreed upon standards.
How can it be fixed? Not all problems have obvious solutions. By offering a proof-of-concept style solution can help the other reviewer and author to understand the initial problem. It can also lead to further healthy discussion as there is now a solution to build upon.

Note that not all comments must contain all three of the what, why and hows. For trivial stuff we might choose to only leave why. However if one leaves only comments for trivial stuff it might be an indication that one is only looking at the shallow parts of the implementation.

There are also a few other rules which a reviewer should follow:

Do the review in a timely manner. A review should not be stuck in limbo for days and weeks. If one truly does not have time, let the author know with a suggestion to when it can be looked at. If that is too late the author may have someone else review it.
A comment should indicate whether it is a blocking change or not. Blocking changes cannot be merged. Non-blocking changes are optional which the author may or may not decide to do.
Pick your battles. Teams must work together, and being too stubborn can lead to friction and hostility.
Try to learn and share knowledge. One can learn from reading code, and doing code reviews is an excellent way to do so. It is also an excellent way to share knowledge. For example if you know of a way to make a function 10 lines shorter without sacrificing readability and performance, use that opportunity to teach. Do not however demand these changes unless the original code is particularly ugly.
Be careful about the language that is used. Note that text based communication can sound very harsh to people. Do not say that something is wrong, instead show an example that you would consider better.
Read the code in both the online tool and editor, side by side. By having the code in the editor as well makes it easier to play with the code. Test out theories and proof of concepts. It will make it easier to understand the mindset of the author and how the code looks like it is.

Consider the customer

Customers are often waiting for features or fixes, and as reviewers we should be aware and respect such needs. One thing is to strive for absolute perfection, but we should not sacrifice our customers to achieve it.

A code review should not be in the way of customers getting what they want, therefore reviewers must have a pragmatic approach to the pull request. Let’s say that we are reviewing some code. This code has been written in such a way that it is not in line with how we usually do things, but it works. Fixing it might take hours, or even days. The customer however, is waiting for these changes. A good compromise in this situation would be to accept the pull request, but only if the author cleans up the work at a later, agreed upon, point. This way the customer gets what they need and eventually the reviewers get what they want. Everyone wins.

Compromise when possible

The relationship between author and reviewer is a delicate one. The sides have different values and goals. The author wants to provide value for their customer, show results to their manager and remove work of their kanban board. A reviewer has the goal of maintaining and raising the quality of work which is produced. These values can often collide as the quest for quality gets in the way of “getting things done”. As in the previous section we must find a compromise between the two.

As a reviewer we must be aware that perfection does not exist. We must also acknowledge that the author is a person with feelings. Not all battles are worth fighting, and too many comments can alienate the author completely. Reviewers must compromise between what is worth pointing out to keep the pace, at least for the minor things.

A reviewer should also be open to compromise, like we saw in the previous section where we merged a flawed solution. Sometimes it is better to have eventual excellency rather than immediate.

Code reviews must not be weaponized

Being a reviewer is a position of authority. We should be careful how we apply this role and not abuse it. As reviewers we should…

Be timely with our reviews. It should not take days or weeks to get something looked at.
Be professional and justified in our comments. The author should be able to see the reasoning behind comments.
Put our personal differences aside. It does not matter whether the author is the least likeable person in the universe as it is not about the author, it is about the code.

Never leave a review without a resolution

Most code reviews systems has only two states:

Approved
Needs work

Much like Schrödinger famous thought experiment on quantum position we don’t know what the state a code review is in before we take a look at it. Unlike Schrödinger’s thought experiment the superposition won’t resolve itself when we look at it, we must set it ourselves. Leaving a code review without resolving it only helps to generate uncertainty, especially if there is comment written. Either a code review is acceptable, or it is not. There is no in-between.

Whenever a reviewer has looked through the code, he/she must take a stance. If there are things which must be fixed before merging, the reviewer should mark the pull request as “Needs work”, effectively preventing the pull request from moving forward, which is responsible if there are things which must be fixed.

If everything is fine, or more often the things one has commented on is not that critical (for example like code style, naming and so forth), the reviewer may leave comments, but still mark the pull request as “approved”. This essentially renders all the comments optional which the author can choose to fix if they so desire, but it is not a requirement enforced by the reviewer.

The only time it is acceptable to leave a review without setting a state is when the reviewer feels that they do not have enough information to complete the review.. Often this comes in the form of hard to understand code, confusing requirements or workaround. In these situations the reviewer must seek out this information from the author, whereupon the author must be willing to provide such information.

Communication is key

We’ve all been in the situation where something has to go out, it has to go out now, but there are processes getting in the way. Sometimes that process is code reviews which points out things which, in the heat of the moment, seems insignificant compared to not pleasing the customer. It is true that the customer comes first. As with most things, communication is the key.

A solution I have found effective is by simply offering a compromise. Remember that code is not set in stone and can be fixed later - though one has to be very strict about it and actually do it. If the comments are largely about style, naming and other parts which do not directly affect the applications functionality a good compromise could be to merge now, fix later. (and I mean fix later, I do not mean “fix later”).

Most people want to help and provide value for the customer, so talking to the reviewers face to face can achieve a lot. If you feel that your code is being blocked from master - talk to people.

Bikeshedding

C. Northcote Parkinson was an author who first coined the term bikeshedding in his argument named “Parkinson’s Law of Triviality”. He had observed that a committee, who was responsible for approving plans for a nuclear power plant, spent most of their time on unimportant tasks. He saw that vast amount of time and effort was spent on the staff bikesheds rather than the design for the power plant itself.

We can all agree that having bike sheds for the staff is probably a good thing, but I think most of us will agree that a power plant being built properly is somewhat more important. So while one might want to argue over finer details, and might be right in doing so, not all things are important in the grand scheme of things.

This term gained traction in programming, especially when code reviews became a thing. Programmers love to discuss the minor, maybe even insignificant, parts of the code like tabs vs spaces, coding styles and naming. None of which actually does much for the functionality of the application whatsoever.

How to handle bikeshedding

Bikeshedding is something that one should bring up in code reviews. When one feels that a comment is on such a minor thing which has no real impact on the application one is allowed to point something out as bikeshedding. I have heard about teams adding their own bikeshed emoji to whatever tool they use for code reviews.

(protip: If you are limited to basic emojis, this is the best bikeshed combination: 🚴‍♂️🏠. This site has a few good ones if you can add your own emojis: https://emojis.slackmojis.com/emojis/images/1555984737/5645/bikeshed.gif?1555984737.)

Before we go out and start accusing everyone of bikeshedding we should probably talk about why bikeshedding occurs. The original meaning was that a group of people tend to focus on insignificant things, but in my experience there’s one or two doing the code review, not a group. What we often see is a difference of values. Some people, including myself, really think that naming and code style is important, others might see the focus on such things to be nitpicky.

We can always accuse people of bikeshedding, which is okay when true. Though bear in mind that bikeshedding does mean it is wrong. Someone can have a perfectly valid argument, but it being so trivial that it is not worth fixing. The author should consider the seriousness of each comment and push back when the author feels that something is too trivial. My experience has been that a “I’ll remember that the next time” is more than suitable for such minor things.

Sometimes people are really stubborn, even on insignificant things. When dealing with such comments the “3-comment” rule quickly gets triggered (we will talk about this later). When this rule gets activated you should go and talk to the reviewer in person, which hopefully should resolve the issue. If not, get a third member to break the tie.

If a reviewer is stubborn, correct and comments upon mostly the same things consider automating these fixes. Either through linters or editor plugins. If this cannot be done talk to the team and either have it accepted into the team code guidelines, or have it explicitly pointed out that commenting on scenarios like that is pointless.

Use a linter!

Coding style is often the prime suspect for bikeshedding comments. The solution for this is to use a liner. A linter is a tool which analyzes the source code for various things, but the most important thing in this context is that it checks for stylistic errors. Most linters have a set of rules for what is considered acceptable looking code and what is not. By using a linter one should never see, nor write, a comment which complains about style again.

With a linter people blindly accept that whatever it says is okay is within reason. If people disagree with this they can make a pull request to wherever the linter ruleset where they can argue why the change to the rules would be an improvement.

The boundary of code reviews

Code reviews are great, but it is not suitable for any kind of feedback. Code reviews tend to be a bad place for having long discussions. Conversations about the overall architecture, long chains of disagreement and so forth does not work well in a code review. These kinds of conversations should be moved to be taken in person.

This is especially true for architectural discussions. No reviewer should comment on things which would rip the foundations from the solutions and force the author to rewrite everything. Those kinds of discussions should be had prior to a pull request. If one does spot a bigger architectural problem it should be brought to the team and fixed, but it should not stop the pull request.

The 3-comment rule

Arguing is fun, but pull requests tend to be a poor medium for long discussions. Rather than generating a vast number of comments we should instead invoke the 3-comment rule:“Whereupon 3 or more comments has been reached in the same thread, the people involved should move the discussion to somewhere more suited for long form discussions”

The “same thread” means basically a chain of linked comments, or comments on the same topic which argues with each other.

somewhere more suited” means anywhere else which is better suited for having longer and more detailed discussions. This often means meeting face to face and having a proper discussion. Though it could also mean calling, video chatting and so forth.

Here is this rule in practice from the authors point of view:

- The reviewer leaves a comment in the code review |- The author responding and disagreeing with the reviewer |- The reviewer is not convinced by the author’s arguments

At this point the author has received 3 comments where no agreement has been made. This is a suitable point to contact the reviewer to find an acceptable solution.

Here is the rule in practice from the reviewer’s point of view:

- The reviewer leaves a comment in the code review |- The author responding and disagreeing with the reviewer |- The reviewer is not convinced by the author’s arguments and leaves one final argument |- The author still believes that his/her original statement is correct and continues to argue

Here we have an author which does not follow the 3-comment rule, so we get a 4th comment. This is fine as the rule states “3 or more”. At this point we invite the author for a discussion elsewhere.

Conclusion

On the surface code reviews can seem like a processing to check that the code is good enough for release, but as we have hopefully seen throughout this post it simply is not that simple. What is considered acceptable differs from person to person, and what is the focus of code reviews is not agreed upon. It is important that we look at our processes and consider where they fit within our development processes. Teams should consider for themselves what it means to have a piece of code pass a code review. To me code reviews is a tool for checking the quality of the code and knowledge sharing. We all want the same things, provide high quality products to our customers, create maintainable and readable applications with processes which don't hold innovation back. I am not going to dictate right or wrong, rather I am encouraging reflection on what the process means to you and how code reviews can help the development process.

For those especially interested, here’s some links to other good resources on the subject:

https://smartbear.com/learn/code-review/best-practices-for-peer-code-review/: Tips and tricks for more effective and better code reviews

https://github.com/google/eng-practices: Google’s own guidelines to code reviews

https://devblogs.microsoft.com/appcenter/how-the-visual-studio-mobile-center-team-does-code-review/ and https://www.michaelagreiler.com/code-reviews-at-microsoft-how-to-code-review-at-a-large-software-company/: Explains how Microsoft conducts their code reviews

https://developers.redhat.com/blog/2019/07/08/10-tips-for-reviewing-code-you-dont-like/: Tips from Redhat how to criticize code you don’t like

https://access.redhat.com/blogs/766093/posts/2802001: Another Redhat blog post, this time talking about code reviews in the context of Python. Still has a fair few tips and tricks

https://martinfowler.com/articles/is-quality-worth-cost.html: Martin Fowler’s two cents on code quality in relation to software

code reviews development process