rOpenSci | Experiences as a first time rOpenSci package reviewer

It all started on January 26th this year, when I signed up to volunteer as a reviewer for R packages submitted to rOpenSci. My main motivation was to learn something new and to contribute to the R open source community. If you are wondering why the people behind rOpenSci are doing this, you can read "How rOpenSci uses Code Review to Promote Reproducible Science".

Three months later I was contacted by Maëlle Salmon asking whether I was interested in reviewing the R package patentsview for rOpenSci. And yes, I was! To be honest I was a little bit thrilled.

Packages are submitted for review to rOpenSci via an issue in their GitHub repository, and the reviews happen there too, so you can check out all previous package submissions and reviews. With all the information you get from rOpenSci and the help from the editor, doing the package review is straightforward. Before I started, I read the reviewer guides (links below) and checked out a few of the existing reviews. I installed the package patentsview from GitHub and also downloaded the source code so I could check how it was implemented.

I started by testing the core functionality of the package by running the examples mentioned in its README. I think this is a good starting point because you get a feeling for what the author wants to achieve with the package. Later on I came up with my own queries (side note: this R package interacts with an API from which you can query patents). During the process I switched back and forth between writing queries like a normal user of the package would and reading the code. When I saw something in the code that wasn’t quite clear to me or looked wrong, I went back to writing new queries to check whether the functions behaved as expected.
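To give an idea of what such a query looks like, here is a minimal sketch. It assumes the patentsview package is installed and the PatentsView API is reachable (newer versions of the API may also require an API key). The wrapper name `run_example_query` and the date are only illustrative and do not come from the review itself.

```r
# Hypothetical wrapper around a simple patentsview query.
# Assumes the patentsview package is installed and the API is reachable.
run_example_query <- function(since = "2007-01-04") {
  if (!requireNamespace("patentsview", quietly = TRUE)) {
    stop("Please install the patentsview package first.", call. = FALSE)
  }
  # qry_funs$gt() builds a "greater than" filter in the API's query language
  qry <- patentsview::qry_funs$gt(patent_date = since)
  # search_pv() sends the query to the API and returns the parsed response
  patentsview::search_pv(query = qry)
}

# Usage (needs network access, so it is not run here):
# result <- run_example_query("2007-01-04")
# str(result$data, max.level = 2)
```

Starting from a small query like this, it is easy to vary the inputs and compare what comes back with what the code suggests should happen.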

With this approach I was able to give the package author feedback that led to an additional unit test, a helper function that makes the package easier to use, a clearer error message and improved documentation. You can find the review I did here.

Several R packages helped me get started with my review, e.g. devtools and goodpractice. These packages can also help you when you start writing your own packages. One very useful function is devtools::spell_check(), which performs a spell check on the package description and on manual pages. At the beginning I had an issue with goodpractice::gp(), but Maëlle Salmon (the editor) helped me resolve it.
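As a rough sketch of how these tools can be combined, the helper below runs both checks on a package's source directory. The function name `run_review_checks` and the path in the usage note are hypothetical; the sketch assumes devtools and goodpractice are installed and that you have the package source on disk.

```r
# Hypothetical helper: run a few automated checks on a package source directory.
run_review_checks <- function(pkg_path = ".") {
  for (dep in c("devtools", "goodpractice")) {
    if (!requireNamespace(dep, quietly = TRUE)) {
      stop("Please install the ", dep, " package first.", call. = FALSE)
    }
  }
  # Spell check the package description and manual pages
  spelling <- devtools::spell_check(pkg_path)
  # Run the collection of checks bundled in goodpractice
  # (this can take a while for larger packages)
  gp_result <- goodpractice::gp(pkg_path)
  list(spelling = spelling, goodpractice = gp_result)
}

# Usage (the path is a placeholder for wherever you downloaded the source):
# checks <- run_review_checks("path/to/patentsview")
# checks$spelling
# checks$goodpractice
```

The output of both tools is a starting point for the review, not a replacement for reading the code and trying the package yourself.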

In the rest of this article you can read what I gained personally from doing a review.

Contributing to the open source community

When people think about contributing to the open source community, their first thought is usually about creating a new R package or contributing to one of the major packages out there. But not everyone has the resources (e.g. time) to do so. You also don’t have awesome ideas every other day that can immediately be turned into new R packages for the community. Besides contributing code, there are lots of other things that can be useful for other R users, for example writing blog posts about problems you solved, speaking at meetups or reviewing code to help improve it. What I like very much about reviewing code is that people see things differently and bring other experiences. As a reviewer, you see a new package from the user’s perspective, which can be hard for the programmers themselves. Having someone else review your code helps find things that are missing because they seem obvious to the package author, and helps detect pieces of code that require more testing. I had a great feeling when I finished the review, since I had helped improve an already amazing R package a little bit more.

Reviewing helps improve your own coding style

When I write R code I usually try to do it in the best way possible. Google’s R Style Guide is a good start for getting used to coding best practices in R, and I also enjoyed reading Programming Best Practices Tidbits. Normally, when I think some piece of code can be improved (with respect to speed, readability or memory usage), I check online whether I can find a better solution. Often, though, you just don’t think something can be improved because you have always done it a certain way, or because there was no better solution the last time you checked. This is when it helps to follow other people’s code. I do this by reading their blogs, following many R users on Twitter and checking their GitHub accounts. Reviewing an R package also helped me a great deal with getting new ideas, because I checked each function a lot more carefully than I do when I read blog posts. In my opinion, good code not only uses the best package for each problem but also gets the small details right. One thing I did wrong for a long time was the way I filled data.frames, until I found a better (much faster) solution on Stack Overflow. In this respect you can learn a lot from someone else’s code. What I found really cool in the package I reviewed was the use of small helper functions (see utils.R). Functions like paste0_stop and paste0_message make the rest of the code a lot easier to read.
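Both ideas can be sketched in a few lines. The helper bodies below are a guess at what functions with these names might look like; they are not copied from the package's utils.R. Likewise, filling preallocated vectors is only one common way to speed up filling a data.frame and may not be the exact solution meant above. The names `fill_slow` and `fill_fast` are hypothetical.

```r
# Assumed helper bodies: thin wrappers that paste their arguments together
# before raising an error or emitting a message.
paste0_stop <- function(...) stop(paste0(...), call. = FALSE)
paste0_message <- function(...) message(paste0(...))

# Slow pattern: grow the data.frame one row at a time with rbind(),
# which copies the whole data.frame on every iteration.
fill_slow <- function(n) {
  df <- data.frame()
  for (i in seq_len(n)) {
    df <- rbind(df, data.frame(id = i, square = i^2))
  }
  df
}

# Faster pattern: fill preallocated vectors, then build the data.frame once.
fill_fast <- function(n) {
  if (!is.numeric(n) || length(n) != 1 || n < 1) {
    paste0_stop("`n` must be a single positive number, got: ", deparse(n))
  }
  id <- seq_len(n)
  square <- numeric(n)
  for (i in id) {
    square[i] <- i^2
  }
  paste0_message("Filled ", n, " rows")
  data.frame(id = id, square = square)
}

# Usage: compare the timings on your own machine
# system.time(fill_slow(5000))
# system.time(fill_fast(5000))
```

With the helpers in place, the input check reads as a single short line instead of a nested stop(paste0(...)) call, which is the readability gain described above.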

Good start for writing your own packages

When reviewing an R package, you check the code like a really observant user. I noticed many things that you usually don’t care about when using an R package, like comments, how helpful the documentation and the examples are, and also how well unit tests cover the code. I think that reviewing a few good packages can prepare you very well for writing your own packages.

Do you want to contribute to rOpenSci yourself?

If I motivated you to become an rOpenSci reviewer, please sign up! Here is a list of useful things if you want to become an rOpenSci reviewer like me.

If you are generally interested in either submitting or reviewing an R package, I would like to invite you to the Community Call on rOpenSci software review and onboarding.