Fit tensor product splines to longitudinal data
TPSfit() fits multidimensional tensor product splines to longitudinal data with three or more variables of interest, as a step before applying a clustering algorithm.
Arguments
- data
A longitudinal dataset in long form with multiple variables measured over time.
- time
Name of the time variable (e.g. "Time").
- vars
A character vector of at least 3 variables of interest.
- ID
Name of the subject ID variable.
- knots_time
A numeric vector of knots for spline-fitting the time variable. Must supply knots_time or kt.
- kt
Number of evenly spaced knots for spline-fitting the time variable if knots_time is not given.
- fit_times
Optional vector of times at which fitted values are calculated. If neither fit_times nor n_fit_times is given, fitted values are calculated at the knots.
- n_fit_times
Number of evenly spaced times at which fitted values are calculated if fit_times is not given.
- st
Logical expression indicating whether each variable should be standardized.
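The relationship between kt, knots_time, n_fit_times and fit_times can be illustrated with a minimal base R sketch. It assumes that "evenly spaced" means evenly spaced across the observed time range; the page does not state how TPSfit() computes the spacing internally, and time_obs is a hypothetical vector of observation times.

```r
# Hypothetical observation times (days); not taken from TS.sim
time_obs <- c(0, 30, 91, 150, 182, 240, 273, 330, 365)

# kt evenly spaced knots over the observed range (assumed behaviour)
kt <- 5
knots_time <- seq(min(time_obs), max(time_obs), length.out = kt)

# n_fit_times evenly spaced evaluation times (assumed behaviour)
n_fit_times <- 10
fit_times <- seq(min(time_obs), max(time_obs), length.out = n_fit_times)

knots_time
fit_times
```

Supplying knots_time or fit_times directly, as in the Examples, gives explicit control over these locations instead of relying on even spacing.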
Value
An object of class 'TPSfit' containing the following components:
- GAMsfitted
A data frame containing the fitted spline values.
- GAMscoef
A data frame containing the tensor product spline coefficients.
- data_long
A data frame containing data in long format for both time and variable.
- knots
A list of two vectors containing the variable and time knots.
- indiv_means
A list containing a data frame of individual means for each of the variables of interest.
- GAMs
A list containing the generalized additive models for fitting splines on each individual.
- nsubject
The number of subjects in the dataset.
- IDmatch
A data frame matching the original subject ID and new consecutive ID numbers.
- error_subjects
A vector of individuals that encountered errors in the spline-fitting process.
Details
TPSfit() employs the mgcv package to fit tensor product splines to each individual using a generalized additive model. The fitted splines are two-dimensional, with one dimension being the variable identifier and the other being time. Each individual needs an adequate number of observed time points, and the number of knots should be less than the smallest number of time points per individual. If splines cannot be fit for an individual, an error message is shown, but splines are still fit for the remaining individuals. A vector of identifiers for individuals with errors is included in the output as error_subjects, and these subjects are not included in the output GAMsfitted or GAMscoef.
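The kind of per-individual model described above can be sketched directly with mgcv. This is an illustration under stated assumptions, not the internal code of TPSfit(): the simulated data, the column names VarIndex, Time and Value, and the basis dimensions are all hypothetical. It uses mgcv's gam() with a te() tensor product smooth over a numeric variable identifier and time.

```r
library(mgcv)

set.seed(1)

# Hypothetical long-format data for ONE subject:
# 3 variables, each observed at 12 time points
times <- seq(0, 365, length.out = 12)
one_subject <- expand.grid(Time = times, VarIndex = 1:3)
one_subject$Value <- sin(one_subject$Time / 60) + one_subject$VarIndex +
  rnorm(nrow(one_subject), sd = 0.1)

# Two-dimensional tensor product smooth: variable identifier x time.
# k = c(3, 5) is an illustrative choice (3 variables, 5 time knots).
fit <- gam(Value ~ te(VarIndex, Time, k = c(3, 5)),
           data = one_subject,
           knots = list(Time = c(0, 91, 182, 273, 365)))

# Fitted values for each variable at chosen times
grid <- expand.grid(Time = c(46, 91, 137, 182), VarIndex = 1:3)
grid$Fitted <- as.numeric(predict(fit, newdata = grid))
head(grid)
```

With 5 time knots the subject needs more than 5 observed time points for the fit to be well determined, which is the reason for the guidance above on the number of knots.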
See also
The mgcv R package: https://cran.r-project.org/web/packages/mgcv/index.html
Examples
library(tidyr); library(dplyr); library(mgcv)
data(TS.sim)

# Supply time knots explicitly; evaluate fitted values at 10 evenly spaced times
fitsplines <- TPSfit(TS.sim, vars=c("Var1", "Var2", "Var3"), time="Time",
                     ID="SubjectID", knots_time=c(0, 91, 182, 273, 365),
                     n_fit_times=10)

# Same knots, but evaluate fitted values at user-chosen times
fitsplines2 <- TPSfit(TS.sim, vars=c("Var1", "Var2", "Var3"),
                      time="Time", ID="SubjectID",
                      knots_time=c(0, 91, 182, 273, 365),
                      fit_times=c(46, 91, 137, 182, 228, 273, 319))