Posts by Gavin L. Simpson
Author: Gavin L. Simpson
Extrapolating with B splines and GAMs
Feed: R-bloggers. Author: Gavin L. Simpson.
An issue that often crops up when modelling with generlaized additive models (GAMs), especially with time series or spatial data, is how to extrapolate beyond the range of the data used to train the model? The issue arises because GAMs use splines to learn from the data using basis functions. The splines themselves are built from basis functions that are typically setup in terms of the data used to fit the model. If there are no basis functions beyond the range of the input data, what exactly is being used if we want ... Read More
gratia 0.4.1 released
Feed: R-bloggers. Author: Gavin L. Simpson.
After a slight snafu related to the 1.0.0 release of dplyr, a new version of gratia is out and available on CRAN. This release brings a number of new features, including differences of smooths, partial residuals on partial plots of univariate smooths, and a number of utility functions, while under the hood gratia works for a wider range of models that can be fitted by mgcv. Partial residuals The draw() method for gam() and related models produces partial effects plots. plot.gam() has long had the ability to add partial residuals to partial plots ... Read More
Rendering your README with GitHub Actions
Feed: R-bloggers. Author: Gavin L. Simpson. There’s one thing that has bugged me for a while about developing R packages. We have all these nice, modern tools we have for tracking our code, producing web sites from the roxygen documentation, an so on. Yet for every code commit I make to the master branch of a package repo, there’s often two or more additional steps I need to take to keep the package README.md and pkgdown site in sync with the code. Don’t get me wrong; it’s amazing that we have these tools available to help users get to grips ... Read More
Pivoting tidily
Feed: R-bloggers. Author: Gavin L. Simpson. One of the fun bits of my job is that I have actual time dedicated to helping colleagues and grad students with statistical or computational problems. Recently I’ve been helping one of our Lab Instructors with some data that from their Plant Physiology Lab course. Whilst I was writing some R code to import the raw data for the lab from an Excel sheet, it occurred to me that this would be a good excuse to look at the new pivot_longer() and pivot_wider() functions from the tidyr package. In this post I show how ... Read More
radian: a modern console for R
Feed: R-bloggers. Author: Gavin L. Simpson. Whenever I’m developing R code or writing data wrangling or analysis scripts for research projects that I work on I use Emacs and its add-on package Emacs Speaks Statistics (ESS). I’ve done so for nigh on a couple of decades now, ever since I switched full time to running Linux as my daily OS. For years this has served me well, though I wouldn’t call myself an Emacs expert; not even close! With a bit of help from some R Core coding standards document I got indentation working how I like it, I learned ... Read More
Tibbles, checking examples, & character encodings
Feed: R-bloggers. Author: Gavin L. Simpson. Recently I’ve been preparing my gratia package for submission to CRAN. During my pre-flight testing I noticed an issue under Windows checking the examples in the package against the reference output I generated on linux. In the latest release of the tibble package, the way tibbles are printed has changed subtly and in a way that leads to cross-platform differences. As I write this, tibbles with more than a set number of rows are printed in a truncated form, showing only the first 10 rows of data. In such cases, a final line is ... Read More
Introducing gratia
Feed: R-bloggers. Author: Gavin L. Simpson. I use generalized additive models (GAMs) in my research work. I use them a lot! Simon Wood’s mgcv package is an excellent set of software for specifying, fitting, and visualizing GAMs for very large data sets. Despite recently dabbling with brms, mgcv is still my go-to GAM package. The only down-side to mgcv is that it is not very tidy-aware and the ggplot-verse may as well not exist as far as it is concerned. This in itself is no bad thing, though as someone who uses mgcv a lot but also prefers to do ... Read More
Fitting GAMs with brms: part 1
Feed: R-bloggers. Author: Gavin L. Simpson. Regular readers will know that I have a somewhat unhealthy relationship with GAMs and the mgcv package. I use these models all the time in my research but recently we’ve been hitting the limits of the range of models that mgcv can fit. So I’ve been looking into alternative ways to fit the GAMs I want to fit but which can handle the kinds of data or distributions that have been cropping up in our work. The brms package (Bürkner, 2017) is an excellent resource for modellers, providing a high-level R front end to ... Read More
Comparing smooths in factor-smooth interactions II
Feed: R-bloggers. Author: Gavin L. Simpson. In a previous post I looked at an approach for computing the differences between smooths estimated as part of a factor-smooth interaction using s()’s by argument. When a common-or-garden factor variable is passed to by, gam() estimates a separate smooth for each level of the by factor. Using the (Xp) matrix approach, we previously saw that we can post-process the model to generate estimates for pairwise differences of smooths. However, the by variable approach of estimating a separate smooth for each level of the factor my be quite inefficient in terms of degrees of ... Read More
First steps with MRF smooths
Feed: R-bloggers. Author: Gavin L. Simpson. One of the specialist smoother types in the mgcv package is the Markov Random Field (MFR) smooth. This smoother essentially allows you to model spatial data with an intrinsic Gaussian Markov random field (GMRF). GRMFs are often used for spatial data measured over discrete spatial regions. MRFs are quite flexible as you can think about them as representing an undirected graph whose nodes are your samples and the connections between the nodes are specified via a neighbourhood structure. I’ve become interested in using these MRF smooths to include information about relationships between species. However, ... Read More
Recent Comments