I often find myself wondering about the best way to present R code and its output.
The issue is that in R itself, at the console, we see this kind of thing:

    > X <- rnorm(100)
    > Y <- X + rnorm(100)
    > lm(Y ~ X) %>% summary()

    Residuals:
        Min      1Q  Median      3Q     Max
    -3.0024 -0.6662  0.0044  0.7071  2.2079

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -0.0247     0.1073  -0.230    0.818
    X             0.9865     0.1091   9.041 1.46e-14 ***
    ---

    > lm(Y ~ -1 + X) %>% summary()

    Residuals:
         Min       1Q   Median       3Q      Max
    -3.02849 -0.68970 -0.02017  0.68257  2.18443

    Coefficients:
      Estimate Std. Error t value Pr(>|t|)
    X   0.9855     0.1085   9.082  1.1e-14 ***

So it seems perfectly natural to present this code and output exactly as above. It has the advantage of clearly distinguishing the code (prefaced with > as per the R console) from its output.
The problem is that a user who wants to try the code out for themselves has to manually remove the > from each line. I have had a number of my posts edited by other users to remove them. So one alternative is to present it like this:

    X <- rnorm(100)
    Y <- X + rnorm(100)
    lm(Y ~ X) %>% summary()

    Residuals:
        Min      1Q  Median      3Q     Max
    -3.0024 -0.6662  0.0044  0.7071  2.2079

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -0.0247     0.1073  -0.230    0.818
    X             0.9865     0.1091   9.041 1.46e-14 ***
    ---

    lm(Y ~ -1 + X) %>% summary()

    Residuals:
         Min       1Q   Median       3Q      Max
    -3.02849 -0.68970 -0.02017  0.68257  2.18443

    Coefficients:
      Estimate Std. Error t value Pr(>|t|)
    X   0.9855     0.1085   9.082  1.1e-14 ***

This might be fine for experienced R users, but it blurs the distinction between code and output. So we might break it up like this:

    X <- rnorm(100)
    Y <- X + rnorm(100)
    lm(Y ~ X) %>% summary()

which produces

    Residuals:
        Min      1Q  Median      3Q     Max
    -3.0024 -0.6662  0.0044  0.7071  2.2079

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -0.0247     0.1073  -0.230    0.818
    X             0.9865     0.1091   9.041 1.46e-14 ***
    ---

and then we fit the model:

    lm(Y ~ -1 + X) %>% summary()

which produces:

    Residuals:
         Min       1Q   Median       3Q      Max
    -3.02849 -0.68970 -0.02017  0.68257  2.18443

    Coefficients:
      Estimate Std. Error t value Pr(>|t|)
    X   0.9855     0.1085   9.082  1.1e-14 ***

which takes more time to write and makes the post longer and more verbose.
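
Coming back to the console-style layout: the chore of deleting each > by hand could in principle be automated by the reader. Below is a minimal sketch of that idea, not an established tool. The helper name `strip_prompts` is hypothetical (it is not part of base R), and it assumes that code lines start with the default `> ` or `+ ` prompts and that no output line begins with either character.

```r
# Hypothetical helper (not part of base R): recover runnable code from a
# pasted console transcript.
# Assumption: code lines begin with the default "> " or "+ " prompt, and
# output lines never begin with ">" or "+".
strip_prompts <- function(lines) {
  is_code <- grepl("^[>+] ?", lines)   # lines carrying a prompt
  sub("^[>+] ?", "", lines[is_code])   # drop the prompt, discard the output
}

transcript <- c(
  "> X <- rnorm(100)",
  "> Y <- X + rnorm(100)",
  "> lm(Y ~ X) %>% summary()",
  "Residuals:",
  "    Min      1Q  Median      3Q     Max"
)

# Print only the code, ready to paste back into R
cat(strip_prompts(transcript), sep = "\n")
```

This only keeps the prompted lines, so it would silently misclassify any output line that happens to start with > or +. Running the recovered code would also still need whatever package provides %>% to be loaded.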
Maybe I am being a little too pedantic, but I wondered whether others have had similar thoughts, whether there is an alternative, or indeed whether one of the above approaches is considered better in general?