Code Review

commity members

  • Fred B.*, Pascal L., Olivier D., Arnaud B.


  • Test the 2 proposed workflows (review board and github)

Reasons for code review

  • We want at least 2 people to read all code. That means we need a reviewer.
  • This helps to find better solutions to problems.
  • This helps to train people on our tools and framework.
  • This gives better code and less bugs in the end (everybody makes mistakes).

Proposed Policy

There are 2 proposals we want to test:

  • GitHub fork/merge/pull request for new Pylearn
    • When someone asks for a pull request, someone else does the review.
    • Everyone should work within their own Pylearn fork, and submit pull requests to merge their code into the master Pylearn repository.
  • Review Board post-commit for Theano
    • For each commit message, if the author does not want this commit to be reviewed, say it in the message
      • Useful for Official repo when we commit something that is disabled by default and we want to split it in many commits or start using it even if not fully tested.
      • Reviewer should still check that it is not enabled by default and when enabled it prints a warning (once per job execution).
    • We check all commits to Theano and Jobman (official tools).

    • If the official reviewer is not the right person for the task (gpu code, ...) he has the responsability to find someone else (ask people in the lab, mailing-list, ...).

    • The official reviewer should:
      • Review all code (see the checklist) and ask an expert when needed.
      • Commit easy fixes and ask the original committer to review them (non-trivial fixes are the responsibility of the original committer).
    • The official reviewer is chosen among the Theano users in the lab with commit rights.

    • In this test, we ask the official reviewer to review all commits from one day.

    • We will set up a list of experts by domain (gpu, optimization, algo,...).

    • After this test, we may want to make it longer (1 week?)
      • If somebody breaks the build bot, he becomes the reviewer for the next week/days.
      • Maximum of one week by month.
      • If there is a big week or a rush that includes everybody in the lab, we can change more frequently.
      • If a commit has problems, the original reviewer should follow-up on it.

Other general comments: - We should never be the official reviewer of our own code. When this happens,

ask the reviewer for the next day to take care of it.
  • Experimental code (not in Theano) may be tagged as not being reviewed (in commit message).

Checklist for review

  • Are there tests and do they cover all cases?

  • Is there documentation in the code file?
    • Should public (HTML) documentation be updated as well?
  • Are additions well integrated into our framework?

  • Is the code placed in the right files, and the right place in those files?

  • Try not to duplicate code.

  • Is the code clear and easy to understand?

  • Are there comments describing what is being done?

  • Answer potential questions by the committer (in the code). This can also help to train people.

  • Check for typos.

  • No debug code (print, breakpoints, ...).

  • If commit message says not to review, check that the code is disabled by default and that when enabled prints a warning.

  • Check for conformity to our coding guidelines.

Some system that we checked

What we could want from our code review

  • integrate with our ticket system?
    • Should we keep our current ticket system?
  • work with mercurial, git?
    • how branch/merge are handled
  • check each commit of theano/pylearn

  • check experimental repository code when asked

  • how show diff? diff as vimdiff? patch?

  • syntax highlight

  • If we commit something that is disabled by default and not fully working, we can say it in the commit message to have a faster review(only check that by default it is disabled). Then we should say in the commit message when it is ready for a full review.

  • Review should be done by everybody.

  • Who choose the reviewer(random, commiter)? pool of reviewers? pool level 1,2,3 where 1 is everybody with commit right. pool for specific topic(gpu, ML algo, ...)?

Type of code review

  • Formal review - Many person review together each line of the program.
  • Over-the-shoulder – One developer looks over the author’s shoulder as the latter walks through the code.
  • Email pass-around – Source code management system emails code to reviewers automatically after checkin is made.
  • Pair Programming – Two authors develop code together at the same workstation, such is common in Extreme Programming.
  • Tool-assisted code review – Authors and reviewers use specialized tools designed for peer code review.
  • Alternative: Test-Driven development
  • Automatic review: use tool as pylint, pyflakes, pychecker. Don’t check everything.

We seam to do Over-the-shoulder, email and variant of pair programming from time to time. Some people read rapidly the commit of Theano and Pylearn.