FKM.glm() fits a generalized linear model (GLM) using clusters output from cluster.fitted() or cluster.coefs() as predictors, along with additional covariates.

Usage

FKM.glm(FKM_object, data, y, covariates, refclus = 1, family = "gaussian", ...)

Arguments

FKM_object

An object of class FKM.TPS output from cluster.fitted() or cluster.coefs().

data

A data frame with the same subjects used for spline-fitting and clustering that includes an outcome variable of interest and optional covariates.

y

Name of the outcome variable in data, given as a character string (e.g. y="Death").

covariates

A character vector naming the covariates in data to be included in the model (e.g. covariates=c("x1", "x2")).

refclus

Numeric identifier of the cluster to be used as the reference cluster. Default is cluster 1 (refclus=1). Use refclus=0 to make the noise cluster the reference cluster.

family

A description of the error distribution and link function to be used in the model, as accepted by glm (e.g. family="binomial"). Default is "gaussian".

...

Additional arguments passed to the glm function.

Value

An object of class FKM.glm containing the following components:

  • FKM_object The input object of class FKM.TPS.

  • model_data A data frame containing the variables used in the model, including degree of cluster membership.

  • formula The formula used in the model.

  • family The family call used in the model.

  • model_full The GLM model using clusters as predictors and any additional covariates of interest.

  • model_noclusters The GLM model using the covariates of interest but no clusters.

  • anova ANOVA comparing the models with and without clusters as predictors.

  • anova_pval P-value for the ANOVA comparing the models with and without clusters as predictors.
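The anova and anova_pval components describe a comparison of two nested GLMs. As a rough illustration of that kind of comparison (not the package's internal code), the sketch below fits a reduced and a full model with base R and extracts a chi-square p-value. The built-in warpbreaks dataset, the Poisson family, and the object names reduced, full, cmp and pval are assumptions chosen only for this example.

```r
# Illustrative sketch only: built-in data, unrelated to FKM.glm() internals.
# Reduced model: baseline predictor only.
reduced <- glm(breaks ~ wool, family = poisson, data = warpbreaks)

# Full model: adds the extra predictor whose contribution is being tested.
full <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# Chi-square (likelihood ratio) comparison of the two nested models.
cmp <- anova(reduced, full, test = "Chisq")

# The p-value sits in the second row of the "Pr(>Chi)" column.
pval <- cmp[["Pr(>Chi)"]][2]
pval
```

A small p-value suggests that the added predictors improve the fit beyond what the reduced model explains. In FKM.glm() output, the analogous question is whether the cluster terms add information beyond the covariates.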

Details

FKM.glm() applies the glm function to fit a generalized linear model using clusters as predictors. Clusters are obtained using cluster.fitted() or cluster.coefs(), and the resulting object of class FKM.TPS is passed to FKM.glm() along with a dataset containing the outcome variable and any additional covariates of interest. Clusters are included using the "partial assignment" method, which uses each individual's degree of membership in each cluster as a predictor to account for uncertainty in the cluster assignment.
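To make the partial assignment idea concrete, here is a minimal sketch using base R and simulated data. It is not the package's implementation: the membership matrix U, the simulated outcome, and the way the reference column is dropped are all assumptions made for illustration. The sketch assumes membership degrees sum to 1 for each subject, so one cluster's column is left out as the reference.

```r
# Illustrative sketch only: simulated data, not FKM.glm() internals.
set.seed(1)
n <- 200

# Hypothetical membership degrees for 3 clusters; each row sums to 1.
raw <- matrix(rexp(n * 3), ncol = 3)
U <- raw / rowSums(raw)
colnames(U) <- c("Clus1", "Clus2", "Clus3")

# Simulated covariate and binary outcome that depends on Clus2 membership.
x1 <- rnorm(n)
outcome <- rbinom(n, 1, plogis(-1 + 2 * U[, "Clus2"] + 0.5 * x1))

# Drop the reference cluster's column: because memberships sum to 1,
# keeping all columns would be collinear with the intercept.
refclus <- 1
dat <- data.frame(outcome, x1, U[, -refclus, drop = FALSE])

# Build the formula from the remaining membership columns plus the covariate.
f <- reformulate(c(colnames(U)[-refclus], "x1"), response = "outcome")
fit <- glm(f, family = binomial, data = dat)
coef(fit)
```

Under these assumptions, the coefficient on Clus2 can be read as the change in log-odds when moving from full membership in the reference cluster to full membership in cluster 2, holding x1 fixed.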

See also

The glm function: glm

Examples

library(tidyr); library(dplyr); library(mgcv); library(fclust)
#> Loading required package: nlme
#> 
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#> 
#>     collapse
#> This is mgcv 1.8-36. For overview type 'help("mgcv-package")'.
data(TS.sim)

fitsplines <- TPSfit(TS.sim, vars=c("Var1", "Var2", "Var3"), time="Time",
     ID="SubjectID", knots_time=c(0, 91, 182, 273, 365), n_fit_times=10)

clusters1 <- cluster.fitted(fitsplines, k=3, m=1.3, seed=12345, RS=5, noise=TRUE)

model <- FKM.glm(clusters1, TS.sim, y="outcome", covariates=c("x1", "x2"),
     family="binomial")
summary(model)
#> Full model:
#> Formula (f1):  outcome ~ Clus2 + Clus3 + Noise + x1 + x2 
#> Family: binomial 
#> 
#> Call:
#> glm(formula = f1, family = family, data = data3)
#> 
#> Deviance Residuals: 
#>      Min        1Q    Median        3Q       Max  
#> -1.98641  -0.22509  -0.02046   0.14030   2.13265  
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) 12.91873    2.58402   4.999 5.75e-07 ***
#> Clus2       -2.51549    1.00954  -2.492   0.0127 *  
#> Clus3       -1.27507    0.82022  -1.555   0.1201    
#> Noise        0.99726    1.59330   0.626   0.5314    
#> x1           0.78247    0.68891   1.136   0.2560    
#> x2          -0.28124    0.05409  -5.200 1.99e-07 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 201.065  on 149  degrees of freedom
#> Residual deviance:  63.197  on 144  degrees of freedom
#> AIC: 75.197
#> 
#> Number of Fisher Scoring iterations: 7
#> 
#> 
#> ANOVA chi-square p-value for significance of clusters in model:
#> 0.02512377
model$anova_pval
#> [1] 0.02512377