These days I write most of my code in R. All things being equal, I’d really rather not. But I work in biology and most of my colleagues and collaborators use R, so it’s easier to do the same.

One of the things I really miss from python is that every script that I write can be essentially treated like a full fledged “package” is treated in R. That is, if I start writing some ad-hoc code in a script and then decide there’s some part of it I want to re-use later in another script I can just write

import adhoc_script

bar = adhoc_script.load_function()
bar = adhoc_script.useful_processing_function(bar)

and everything works as you’d hope it would. Importantly, it doesn’t matter if I have defined load_function in my current session as importing from adhoc_script.py doesn’t overwrite anything (i.e., adhoc_script gets its own namespace).

Handling of namespaces are one of the areas where R is really awful to work with, but you can get most of this behaviour by bundling the ad-hoc script to be reused into an R package. Unfortunately every time I do this I have to:

  1. Add the scaffolding necessary for an R package. I’ve published multiple packages and I still find this extremely non-trivial. There are whole books written about how best to do this.
  2. Install the package.
  3. Load the package into my current session.

There are tools that make turning scripts into package a bit more automated and less painful, but this is still a long way from what I really want. Conflicting names are still an issue as the expectation of the user when they load the package is that all the functions that were in adhoc_script.R should be in the main namespace. That is, our new script should look like

library(adhoc_srcipt)

bar = load_function()
bar = useful_processing_function(bar)

We can get around this somewhat by not calling library directly and writing adhoc_script::load_function() instead, but the syntax is clunky and it has other limitations. Far more restrictive for my purposes is that if I want to make some minor change to adhoc_script.R and then use that change in my current interactive session or script I need to re-install the package and then reload it. In the end, I’ve been doing what I suspect most people do and use the source command instead. That is, I just run source(adhoc_script.R) and the code is run line for line as if I’d copied and pasted it wherever I’m running source from.

This sort of works, I don’t have to build any special scaffolding to re-use my code and I can just re-run source if I change anything. But I’m still stuck with the name conflict issue and the whole thing is generally inelegant. After a couple of hours of googling and trying different options, I’ve come up with a compromise I’m sort of happy with.

The central idea is to use the sys.source function which does the same thing as source except it runs the code inside a separate “environment” which you are free to specify. Environments in R are a complicated topic in their own right, but for my purposes they’re just a way to run the code loaded via source in a way that it’s walled off from whatever else it is I’m doing while still allowing me to access the bits I need to. I wrote a function to hide all the business of calling sys.source from an environment , which ends up pretty close to what I want. After loading this function, I can do

import(adhoc_script)

bar = adhoc_script$load_function()
bar = adhoc_script$useful_processing_function(bar)

If I change adhoc_script.R I just need to run import(adhoc_function) again. I don’t need to worry about naming conflicts as they’re all nicely fenced off. I can even mimic most of the other nice features of python imports. For example,

import(adhoc_script,as='ahs')

bar = ahs$load_function()
proc_dat = ahs$useful_processing_function
bar = proc_dat(bar)

and if I really must load everything into the current environment

import(adhoc_script,all=TRUE)

bar = load_function()

It’s still far from perfect and I’m not 100% sure I haven’t made some horrible mistake in how I treat environments. But I’m much happier doing this than bumbling along using source which was in practice what I was doing previously. You can download the source code here. I stick it in my ~/.Rprofile file so it is run every time I start R.