The function PLNPCA() produces a collection of models, each of which is an object of class PLNPCAfit.
This class comes with a set of methods, some of which are useful for the user.
See the documentation for the methods inherited from PLNfit and for the plot() methods for PCA visualization.
See also
The function PLNPCA, the class PLNPCAfamily
Super class
PLNfit -> PLNPCAfit
Active bindings
var_par: variational parameters (M, S2) in the rank-q latent space
rank: the dimension of the current model
vcov_model: character: the model used for the residual covariance
nb_param: number of parameters in the current PLN model
entropy: entropy of the variational distribution
latent_pos: a matrix: values of the latent position vector (Z) without covariates effects or offset
model_par: a list with the matrices associated with the estimated parameters of the pPCA model: B (covariates), Sigma (covariance), Omega (precision) and C (loadings)
percent_var: the percent of variance explained by each axis
corr_circle: a matrix of correlations to plot the correlation circles
scores: a matrix of scores to plot the individual factor maps (a.k.a. principal components)
rotation: a matrix of rotation of the latent space
eig: description of the eigenvalues, similar to percent_var but for use with external methods
var: a list of data frames with PCA results for the variables: coord (coordinates of the variables), cor (correlation between variables and dimensions), cos2 (cosine of the variables) and contrib (contributions of the variable to the axes)
ind: a list of data frames with PCA results for the individuals: coord (coordinates of the individuals), cos2 (cosine of the individuals), contrib (contributions of individuals to an axis inertia) and dist (distance of individuals to the origin).
call: hacky binding for compatibility with factoextra functions
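The PCA-related bindings above (scores, rotation, percent_var) follow the usual PCA vocabulary. As a rough, generic illustration in base R, the sketch below derives such quantities from a centered matrix with svd(). The matrix P and all variable names are assumptions made for the example; this is not the package's internal computation, which may scale or normalize these quantities differently.

```r
# Generic PCA sketch in base R (illustrative only, not PLNmodels internals).
set.seed(42)
n <- 10; p <- 4; q <- 2
P <- matrix(rnorm(n * p), n, p)            # hypothetical stand-in for latent positions

P_centered <- scale(P, center = TRUE, scale = FALSE)
sv <- svd(P_centered, nu = q, nv = q)      # keep only the first q singular vectors

rotation    <- sv$v                        # p x q matrix of principal axes
scores      <- P_centered %*% rotation     # n x q coordinates of the individuals
eig_values  <- sv$d[seq_len(q)]^2          # squared singular values of the kept axes
percent_var <- eig_values / sum(sv$d^2)    # share of total variance carried by each axis
```

In this sketch the scores are the centered data projected onto the first q axes, and percent_var is expressed as a fraction of the total variance rather than of the variance of the retained axes only.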
Methods
Inherited methods
PLNPCAfit$new()
Initialize a PLNPCAfit object.
Uses the shared SVD from control$svdM (computed once in PLNPCAfamily) to set
the starting loadings C and scores M. The regression coefficients B are
initialised by the parent PLNfit constructor (LM or user-provided inception).
Usage
PLNPCAfit$new(rank, responses, covariates, offsets, weights, formula, control)
Arguments
rank: rank of the PCA (or equivalently, dimension of the latent space)
responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily
weights: an optional vector of observation weights to be used in the fitting process.
formula: model formula used for fitting, extracted from the formula in the upper-level call
control: a list for controlling the optimization. See details.
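To make the initialization concrete, here is a minimal base-R sketch of how rank-q starting values for the loadings and scores can be read off a single SVD. The matrix M0 (standing in for the matrix whose SVD is stored in control$svdM), the names C_start and M_start, and the sqrt(n) scaling are all assumptions chosen for illustration; the scaling used inside PLNPCAfit may differ.

```r
# Illustrative sketch only: rank-q starting values taken from one shared SVD.
set.seed(1)
n <- 8; p <- 5; q <- 2
M0   <- matrix(rnorm(n * p), n, p)   # hypothetical n x p matrix
svdM <- svd(M0)                      # computed once, reusable for every rank

idx <- seq_len(q)
# Loadings: first q right singular vectors, scaled by the singular values.
# diag(..., nrow = q) keeps this correct when q = 1.
C_start <- svdM$v[, idx, drop = FALSE] %*% diag(svdM$d[idx], nrow = q) / sqrt(n)
# Scores: first q left singular vectors, rescaled so that M_start %*% t(C_start)
# is the best rank-q approximation of M0.
M_start <- svdM$u[, idx, drop = FALSE] * sqrt(n)

approx_q <- M_start %*% t(C_start)
```

Because the SVD is computed once, moving from rank q to rank q + 1 only means taking one more column of u and v.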
PLNPCAfit$warm_start_from()
Reinitialize parameters for sequential warm-starting from a lower-rank fit.
Fitted loadings C, scores M, variances S, and regression coefficients B from prev_fit
are carried over; new columns are padded using the inception SVD (C) or zeros/0.1 (M/S).
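The padding described above can be sketched in base R as follows. The helper pad_warm_start() and its argument names are hypothetical and exist only for this illustration; the sketch covers the padding of C, M and S, while B would simply be copied from the previous fit.

```r
# Hypothetical helper (not part of PLNmodels): extend rank-q_prev parameters to rank q_new.
pad_warm_start <- function(C_prev, M_prev, S_prev, C_init, q_new) {
  q_prev <- ncol(C_prev)
  stopifnot(q_new > q_prev,
            ncol(C_init) >= q_new,
            nrow(C_init) == nrow(C_prev))
  n     <- nrow(M_prev)
  extra <- q_new - q_prev
  list(
    # new loading columns come from the initial (inception) SVD-based loadings
    C = cbind(C_prev, C_init[, (q_prev + 1):q_new, drop = FALSE]),
    # new variational means start at zero
    M = cbind(M_prev, matrix(0, n, extra)),
    # new variational variances start at 0.1
    S = cbind(S_prev, matrix(0.1, n, extra))
  )
}

set.seed(1)
p <- 4; n <- 6
C_init <- matrix(rnorm(p * 3), p, 3)   # assumed rank-3 initial loadings
C_prev <- matrix(rnorm(p * 2), p, 2)   # assumed fitted rank-2 loadings
M_prev <- matrix(rnorm(n * 2), n, 2)
S_prev <- matrix(0.5, n, 2)
start  <- pad_warm_start(C_prev, M_prev, S_prev, C_init, q_new = 3)
```

The fitted columns are left untouched, so the higher-rank optimization starts from the lower-rank solution rather than from scratch.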
PLNPCAfit$update()
Update a PLNPCAfit object
Usage
PLNPCAfit$update(
B = NA,
Sigma = NA,
Omega = NA,
C = NA,
M = NA,
S2 = NA,
Z = NA,
A = NA,
Ji = NA,
R2 = NA,
monitoring = NA
)
Arguments
B: matrix of regression coefficients
Sigma: variance-covariance matrix of the latent variables
Omega: precision matrix of the latent variables. Inverse of Sigma.
C: matrix of PCA loadings (in the latent space)
M: matrix of mean vectors for the variational approximation
S2: matrix of variational variances (n × q)
Z: matrix of latent vectors (includes covariates and offset effects)
A: matrix of fitted values
Ji: vector of variational lower bounds of the log-likelihoods (one value per sample)
R2: approximate R^2 goodness-of-fit criterion
monitoring: a list with optimization monitoring quantities
PLNPCAfit$optimize()
Call to the C++ optimizer and update of the relevant fields
Arguments
responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily
weights: an optional vector of observation weights to be used in the fitting process.
config: part of the control argument which configures the optimizer
PLNPCAfit$optimize_vestep()
Result of one call to the VE step of the optimization procedure: optimal variational parameters (M, S) and corresponding log likelihood values for fixed model parameters (C, B). Intended to position new data in the latent space for further use with PCA.
Usage
PLNPCAfit$optimize_vestep(
covariates,
offsets,
responses,
weights = rep(1, self$n),
control = PLNPCA_param(backend = "nlopt")
)
Arguments
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily
responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily
weights: an optional vector of observation weights to be used in the fitting process.
control: a list for controlling the optimization. See details.
PLNPCAfit$project()
Project new samples into the PCA space using one VE step
Usage
PLNPCAfit$project(newdata, control = PLNPCA_param(), envir = parent.frame())
Arguments
newdata: a data frame in which to look for variables, offsets and counts with which to predict.
control: a list for controlling the optimization. See PLN() for details.
envir: environment in which the projection is evaluated
PLNPCAfit$setVisualization()
Compute PCA scores in the latent space and update corresponding fields.
PLNPCAfit$postTreatment()
Update R2, fisher, std_err fields and set up visualization
Arguments
responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily
weights: an optional vector of observation weights to be used in the fitting process.
config_post: a list for controlling the post-treatments (optional bootstrap, jackknife, R2, etc.). See details.
config_optim: a list for controlling the optimizer (either "nlopt" or "torch" backend). See details.
nullModel: null model used for approximate R2 computations. Defaults to a GLM model with the same design matrix but no latent variable.
Details
The list of parameters config_post controls the post-treatment processing, with the following entries:
jackknife: boolean indicating whether jackknife should be performed to evaluate bias and variance of the model parameters. Default is FALSE.
bootstrap: integer indicating the number of bootstrap resamples generated to evaluate the variance of the model parameters. Default is 0 (inactivated).
variational_var: boolean indicating whether the variational Fisher information matrix should be computed to estimate the variance of the model parameters (highly underestimated). Default is FALSE.
rsquared: boolean indicating whether an approximation of R2 based on deviance should be computed. Default is TRUE.
trace: integer for verbosity. Should be > 1 to see output in post-treatments.
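As an illustration, a config_post list spelling out the entries documented above could look as follows. The values are the documented defaults, except trace, which is set to 2 here (an arbitrary choice for the example) so that post-treatment output would be printed.

```r
# Sketch of a config_post list built from the entries documented above.
config_post <- list(
  jackknife       = FALSE,  # no jackknife estimation of bias and variance
  bootstrap       = 0L,     # 0 bootstrap resamples, i.e. inactivated
  variational_var = FALSE,  # skip the variational Fisher information
  rsquared        = TRUE,   # compute the deviance-based approximate R2
  trace           = 2       # > 1 so that post-treatment output is shown
)
```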
PLNPCAfit$plot_individual_map()
Plot the factorial map of the PCA
Usage
PLNPCAfit$plot_individual_map(
axes = 1:min(2, self$rank),
main = "Individual Factor Map",
plot = TRUE,
cols = "default"
)
Arguments
axes: numeric, the axes to use for the plot. Default is 1:min(2, rank).
main: character. A title for the plot. If NULL, a hopefully appropriate title will be used.
plot: logical. Should the plot be displayed or sent back as a ggplot object?
cols: a character, factor or numeric to define the color associated with the individuals. By default, all individuals receive the default color of the current palette.
Returns
a ggplot2::ggplot graphic
PLNPCAfit$plot_correlation_circle()
Plot the correlation circle for the specified axes of a PLNPCAfit object
Usage
PLNPCAfit$plot_correlation_circle(
axes = 1:min(2, self$rank),
main = "Variable Factor Map",
cols = "default",
plot = TRUE
)
Arguments
axes: numeric, the axes to use for the plot. Default is 1:min(2, rank).
main: character. A title for the plot. If NULL, a hopefully appropriate title will be used.
cols: a character, factor or numeric to define the color associated with the variables. By default, all variables receive the default color of the current palette.
plot: logical. Should the plot be displayed or sent back as a ggplot object?
Returns
a ggplot2::ggplot graphic
PLNPCAfit$plot_PCA()
Plot a summary of the PLNPCAfit object
Usage
PLNPCAfit$plot_PCA(
nb_axes = min(3, self$rank),
ind_cols = "ind_cols",
var_cols = "var_cols",
plot = TRUE
)
Arguments
nb_axes: scalar: the number of axes to be considered. The default is min(3, rank).
ind_cols: a character, factor or numeric to define the color associated with the individuals. By default, all individuals receive the default color of the current palette.
var_cols: a character, factor or numeric to define the color associated with the variables. By default, all variables receive the default color of the current palette.
plot: logical. Should the plot be displayed or sent back as a ggplot object?
Examples
data(trichoptera)
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
myPCAs <- PLNPCA(Abundance ~ 1 + offset(log(Offset)), data = trichoptera, ranks = 1:5)
#>
#> Initialization...
#>
#> Adjusting 5 PLN models for PCA analysis.
#> Rank approximation = 1
#> Rank approximation = 2
#> Rank approximation = 3
#> Rank approximation = 4
#> Rank approximation = 5
#> Post-treatments
#> DONE!
myPCA <- getBestModel(myPCAs)
class(myPCA)
#> [1] "PLNPCAfit" "PLNfit" "PCA" "R6"
print(myPCA)
#> Poisson Lognormal with rank constrained for PCA (rank = 3)
#> ==================================================================
#> nb_param loglik BIC AIC ICL
#> 65 -640.365 -766.849 -705.365 -825.034
#> ==================================================================
#> * Useful fields
#> $model_par, $latent, $latent_pos, $var_par, $optim_par
#> $loglik, $BIC, $ICL, $loglik_vec, $nb_param, $criteria
#> * Useful S3 methods
#> print(), coef(), sigma(), vcov(), fitted()
#> predict(), predict_cond(), standard_error()
#> * Additional fields for PCA
#> $percent_var, $corr_circle, $scores, $rotation, $eig, $var, $ind
#> * Additional S3 methods for PCA
#> plot.PLNPCAfit()
