Package 'Rnest'

Title: Next Eigenvalue Sufficiency Test
Description: Determines the number of dimensions to retain in exploratory factor analysis. The main function, nest(), returns the solution; plot(nest()) returns the corresponding plot.
Authors: P.-O. Caron [aut, cre, cph]
Maintainer: P.-O. Caron <[email protected]>
License: MIT + file LICENSE
Version: 1.0
Built: 2025-01-24 20:33:12 UTC
Source: https://github.com/quantmeth/rnest


A list of seven correlation matrices.

Description

A list of seven correlation matrices.

Usage

achim

Format

A list of correlation matrices.

Source

https://github.com/quantmeth

References

Achim, A. (personal communication).


A correlation matrix composed of six factors.

Description

A correlation matrix composed of 18 items based on six factors. Four factors have more than three variables, three variables have cross-loadings (items 6, 7 and 13), two factors are doublets (items 13-14, 15-16), and there are two unique variables (17 and 18). Loadings range between .40 and .80.

Usage

achim24

Format

An 18 by 18 correlation matrix.

Source

https://github.com/quantmeth

References

Achim, A. (2024, April 4). Signal cancellation factor analysis. PsyArXiv, 1–13. doi:10.31234/osf.io/h7qwg


Bartlett Sphericity Test

Description

BartlettSphericity tests if variables are orthogonal.

Usage

BartlettSphericity(R, n)

Arguments

R

the correlation matrix.

n

the sample size.

Value

The χ² test of the correlation matrix R with sample size n.
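As a rough illustration of what such a test computes, the sketch below implements the textbook form of Bartlett's sphericity statistic, χ² = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. The function name bartlett_sketch and the example matrix are hypothetical; the exact output format of BartlettSphericity() may differ.

```r
# Textbook Bartlett sphericity statistic (a sketch, not Rnest's own code).
bartlett_sketch <- function(R, n) {
  p <- ncol(R)
  # log-determinant of the correlation matrix
  log_det <- as.numeric(determinant(R, logarithm = TRUE)$modulus)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log_det
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Hypothetical 10-item, two-factor correlation matrix (loadings .8 and .7)
L <- cbind(c(rep(.8, 5), rep(0, 5)), c(rep(0, 5), rep(.7, 5)))
R <- tcrossprod(L)
diag(R) <- 1

res_id <- bartlett_sketch(diag(10), n = 42)  # orthogonal variables
res_2f <- bartlett_sketch(R, n = 42)         # correlated variables
```

With an identity matrix the log-determinant is zero, so the statistic is zero and the hypothesis of orthogonality is not rejected.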

Author(s)

André Achim (Matlab)

P.-O. Caron (R)

References

Bartlett, M. S. (1937). Properties of sufficiency and statistical tests. Proceedings of the Royal Society of London, Series A, 160, 268–282.

Examples

BartlettSphericity(ex_4factors_corr, 42)

A list of three correlation matrices.

Description

A list of three correlation matrices.

Usage

briggs_maccallum2003

Format

A list of three correlation matrices.

Source

https://github.com/quantmeth

References

Briggs, N. E., & MacCallum, R. C. (2003). Recovery of weak common factors by maximum likelihood and ordinary least squares estimation. Multivariate Behavioral Research, 38(1), 25–56. doi:10.1207/S15327906MBR3801_2


A list of six correlation matrices composed of nine variables with three factors and different levels of correlations between factors.

Description

A list of six correlation matrices composed of nine variables with three factors and different levels of correlations between factors.

Usage

caron2016

Format

A list of six 9 x 9 correlation matrices.

Source

https://github.com/quantmeth

References

Caron, P.-O. (2016). A Monte Carlo examination of the broken-stick distribution to identify components to retain in principal component analysis. Journal of Statistical Computation and Simulation, 86(12), 2405-2410. doi:10.1080/00949655.2015.1112390


A list of 15 correlation matrices composed of nine variables with three factors and different levels of correlations between factors.

Description

A list of 15 correlation matrices composed of nine variables with three factors and different levels of correlations between factors.

Usage

caron2019

Format

A list of 15 9 x 9 correlation matrices.

Source

https://github.com/quantmeth

References

Caron, P.-O. (2019). Minimum average partial correlation and parallel analysis : The influence of oblique structures. Communications in Statistics - Simulation and Computation, 48(7), 2110-2117. doi:10.1080/03610918.2018.1433843


A list containing 120 correlation matrices.

Description

A list containing 120 24 × 24 correlation matrices (R) built to represent different factor structures. Details are found in the 'cormat.l' data.

Usage

cormat

Format

A list of 120 correlation matrices.

Source

https://github.com/quantmeth

References

Caron, P.-O. (2025). A comparison of the Next Eigenvalue Sufficiency Test to other stopping rules for the number of factors in factor analysis. Educational and Psychological Measurement. doi:10.1177/00131644241308528


A list containing 120 lists of correlation matrices and their underlying characteristics.

Description

A list containing 120 lists of 24 × 24 correlation matrices (R) built to represent different factor structures. Different levels of loadings (delta: .4, .5, .6, .7, .8), correlations between factors (corrfact: .0, .1, .2, .3), and numbers of factors (nfactors: 1:8) are used. Each list contains a matrix (R) and its underlying characteristics (delta, corrfact, and nfactors).

Usage

cormat.l

Format

A list containing 120 lists, each holding a correlation matrix and its characteristics.

Source

https://github.com/quantmeth

References

See Caron, P.-O. (2025). A comparison of the Next Eigenvalue Sufficiency Test to other stopping rules for the number of factors in factor analysis. Educational and Psychological Measurement. doi:10.1177/00131644241308528


Compute covariance or correlation matrix with treatments for clusters and missing values

Description

Compute covariance or correlation matrix with treatments for clusters and missing values

Usage

cor_nest(.data, ..., cluster = NULL, missing = "fiml", pvalue = FALSE)

cov_nest(.data, ..., cluster = NULL, missing = "fiml", pvalue = FALSE)

Arguments

.data

a data frame, a numeric matrix.

...

further arguments.

cluster

a variable name defining the clusters in a two-level dataset in the data frame.

missing

treatment to deal with missing values. Options are "fiml", "listwise", or "pairwise". Default is "fiml".

pvalue

an argument to indicate if p-values are required.

Value

A list of class "covnest"
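To make the difference between the "listwise" and "pairwise" treatments concrete, the sketch below uses base R's stats::cor() on the airquality data rather than cor_nest() itself. It assumes that the two options correspond to the usual complete-case and pairwise-complete definitions; the object names are illustrative only.

```r
# Listwise deletion: only rows with no missing value are used.
r_listwise <- cor(airquality, use = "complete.obs")

# Pairwise deletion: each correlation uses every row where both variables
# are observed, so different cells can rest on different sample sizes.
r_pairwise <- cor(airquality, use = "pairwise.complete.obs")

# Sample sizes behind each treatment
n_listwise <- sum(complete.cases(airquality))
n_pairwise <- crossprod(ifelse(is.na(as.matrix(airquality)), 0, 1))
```

The diagonal of n_pairwise gives the number of observed values per variable, and its off-diagonal cells give the number of rows available for each pair.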

Examples

cov_nest(airquality)

Full Information Maximum Likelihood (FIML) correlation or covariance matrix

Description

Full Information Maximum Likelihood (FIML) correlation or covariance matrix

Usage

covFIML(data, tol = 1e-6, maxiter = 1000, pvalue = FALSE)

corFIML(data, tol = 1e-6, maxiter = 1000, pvalue = FALSE)

Arguments

data

a data frame or data matrix.

tol

tolerance.

maxiter

maximum number of iterations.

pvalue

an argument to indicate if p-values are required.

Value

A list containing the means, the correlation or covariance matrix, and optionally the degrees of freedom and the p-values.

Note

This function is not very efficient. See ?cor_nest instead.
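For readers curious about how maximum likelihood estimates can be obtained with incomplete rows, the sketch below uses the EM algorithm for a multivariate normal model. This is one common route to such estimates and is an assumption here: the manual does not say whether covFIML() uses EM or another optimiser. The name em_cov_sketch is hypothetical, and the sketch assumes every row has at least one observed value.

```r
# EM estimation of the mean vector and covariance matrix with missing data.
em_cov_sketch <- function(X, tol = 1e-6, maxiter = 1000) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  mu <- colMeans(X, na.rm = TRUE)
  S <- diag(apply(X, 2, var, na.rm = TRUE), p)   # starting values
  for (iter in seq_len(maxiter)) {
    Xhat <- X
    Cadd <- matrix(0, p, p)   # accumulated conditional covariances
    for (i in seq_len(n)) {
      m <- is.na(X[i, ])
      if (!any(m)) next
      o <- !m
      # E-step: regress the missing variables on the observed ones
      B <- S[m, o, drop = FALSE] %*% solve(S[o, o, drop = FALSE])
      Xhat[i, m] <- mu[m] + B %*% (X[i, o] - mu[o])
      Cadd[m, m] <- Cadd[m, m] + S[m, m, drop = FALSE] -
        B %*% S[o, m, drop = FALSE]
    }
    # M-step: update the mean and the (ML, divisor n) covariance
    mu_new <- colMeans(Xhat)
    S_new <- (crossprod(sweep(Xhat, 2, mu_new)) + Cadd) / n
    done <- max(abs(S_new - S), abs(mu_new - mu)) < tol
    mu <- mu_new
    S <- S_new
    if (done) break
  }
  list(mean = mu, cov = S, iterations = iter)
}

fit <- em_cov_sketch(airquality)
round(cov2cor(fit$cov), 2)
```

With complete data the loop reduces to the usual maximum likelihood covariance (divisor n rather than n - 1).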

Examples

covFIML(airquality)

Empirical Kaiser Criterion (EKC)

Description

Empirical Kaiser Criterion (EKC)

Usage

EKC(.data = NULL, n = NULL, nv = NULL, lowest.eig = 1, ...)

Arguments

.data

a data frame, a numeric matrix, covariance matrix or correlation matrix from which to determine the number of factors.

n

the number of cases (subjects, participants, or units) if a covariance matrix is supplied in .data.

nv

the number of variables if the critical values are required.

lowest.eig

minimal eigenvalues to retain. Default is Kaiser's suggestion of 1.

...

further arguments for cor_nest().

Value

The number of factors to retain or the critical eigenvalues.
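The sketch below illustrates how the critical (reference) eigenvalues can be computed following Braeken and van Assen (2017): the j-th reference value is max[(1 + sqrt(p/n))² (p - sum of the previous eigenvalues)/(p - j + 1), lowest.eig]. The function name ekc_sketch and the example matrix are hypothetical, and details of EKC() itself may differ.

```r
# Empirical Kaiser Criterion, written from the published formula (a sketch).
ekc_sketch <- function(R, n, lowest.eig = 1) {
  p <- ncol(R)
  ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
  prev <- c(0, cumsum(ev)[-p])              # sum of eigenvalues before j
  ref <- pmax((1 + sqrt(p / n))^2 * (p - prev) / (p - seq_len(p) + 1),
              lowest.eig)
  below <- which(ev <= ref)                 # first eigenvalue not exceeding
  nfactors <- if (length(below)) below[1] - 1 else p
  list(nfactors = nfactors, reference = ref)
}

# Hypothetical 10-item, two-factor correlation matrix (loadings .8 and .7)
L <- cbind(c(rep(.8, 5), rep(0, 5)), c(rep(0, 5), rep(.7, 5)))
R <- tcrossprod(L)
diag(R) <- 1
res_ekc <- ekc_sketch(R, n = 100)
res_ekc$nfactors
```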

References

Braeken, J., & van Assen, M. A. L. M. (2017). An empirical Kaiser criterion. Psychological Methods, 22(3), 450–466. doi:10.1037/met0000074

Examples

EKC(ex_4factors_corr, n = 42)

A correlation matrix composed of 2 factors.

Description

A correlation matrix composed of 10 items based on 2 factors with 5 variables each and loadings equal to .80.

Usage

ex_2factors

Format

A 10 by 10 correlation matrix.

Source

https://github.com/quantmeth

References

Caron, P.-O. (2025). Rnest: An R package for the Next Eigenvalue Sufficiency Test. https://github.com/quantmeth/Rnest


A correlation matrix composed of two factors, a doublet factor and a unique variable.

Description

A correlation matrix composed of 10 items based on two main factors, among which there are two cross-loadings. There is also a doublet factor and a unique variable.

Usage

ex_3factors_doub_unique

Format

A 10 by 10 correlation matrix.

Source

https://github.com/quantmeth

References

Achim, A. (personal communication).


A correlation matrix composed of 4 correlated factors.

Description

A correlation matrix composed of 12 items based on 4 factors with 3 variables each. Loadings equal .9, .9, and .3. Factors 1 and 2, and factors 3 and 4, are correlated at .7.

Usage

ex_4factors_corr

Format

A 12 by 12 correlation matrix.

Source

https://github.com/quantmeth

References

Achim, A. (personal communication).


A correlation matrix from chapter 19 Explorer of Méthodes quantitatives avec R (MQR).

Description

A population correlation matrix composed of 6 items from a two-factor structure. Factor 1 is based on items 1 to 3 and 6, and Factor 2 is based on items 4 to 6.

Usage

ex_mqr

Format

A 6 by 6 correlation matrix.

Source

https://github.com/quantmeth

References

Caron, P.-O. (2024). Méthodes quantitatives avec R. https://mqr.teluq.ca


Simplify the generation of data from a multivariate normal distribution.

Description

Speed up the use of MASS::mvrnorm.

Usage

genr8(n = 1, R = diag(10), mean = rep(0, ncol(R)), ...)

Arguments

n

the number of samples required.

R

a positive-definite symmetric matrix specifying the covariance matrix of the variables.

mean

an optional vector giving the means of the variables. Default is 0.

...

arguments for MASS::mvrnorm(), such as tol, empirical, and EISPACK.

Value

A data frame of size n by ncol(R).
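To show the idea behind the generator without depending on MASS, the sketch below draws multivariate normal data with a Cholesky factor in base R. This is a swapped-in technique for illustration only: genr8() itself is documented as a wrapper around MASS::mvrnorm, and the name genr8_sketch and the 3 by 3 matrix are hypothetical.

```r
# Multivariate normal data via the Cholesky factor of R (base R sketch).
genr8_sketch <- function(n = 1, R = diag(10), mean = rep(0, ncol(R))) {
  p <- ncol(R)
  Z <- matrix(rnorm(n * p), nrow = n, ncol = p)   # independent N(0, 1)
  X <- sweep(Z %*% chol(R), 2, mean, "+")         # impose R, then shift
  as.data.frame(X)
}

set.seed(19)
R3 <- matrix(c(1, .5, .3,
               .5, 1, .2,
               .3, .2, 1), 3, 3)
d <- genr8_sketch(n = 1e5, R = R3)
round(cor(d), 2)
```

Unlike MASS::mvrnorm(..., empirical = TRUE), this sketch reproduces R only approximately, with sampling error shrinking as n grows.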

Examples

set.seed(19)
R <- caron2016$mat1
mydata <- genr8(n = nrow(R)+1, R = R, empirical = TRUE)
round(mydata, 2)
round(cov(mydata), 2)

Ledermann bound.

Description

Returns the maximum number of latent factors in a factor analysis model.

Usage

Ledermann(p)

Arguments

p

The number of variables.

Value

The Ledermann bound.
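For reference, the bound is usually written as k ≤ (2p + 1 - sqrt(8p + 1))/2. The sketch below evaluates that expression; the name ledermann_sketch is hypothetical, and whether Ledermann() rounds the result down is not stated here, so the raw value is returned.

```r
# Ledermann bound for p variables (real-valued; a sketch of the formula).
ledermann_sketch <- function(p) {
  (2 * p + 1 - sqrt(8 * p + 1)) / 2
}

ledermann_sketch(6)           # p = 6 gives an integer bound
floor(ledermann_sketch(12))   # largest whole number of factors for p = 12
```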

Author(s)

André Achim (Matlab)

P.-O. Caron (R)

References

Ledermann, W. (1937). On the rank of reduced correlation matrices in multiple factor analysis. Psychometrika, 2, 85–93.

Examples

Ledermann(ncol(ex_4factors_corr))

Print Loadings in NEST

Description

Print Loadings in NEST

Usage

loadings(x, nfactors = x$nfactors, method = x$method, ...)

Arguments

x

an object of class "nest".

nfactors

the number of factors to retain.

method

a method used to compute loadings and uniquenesses.

...

further arguments to methods in "nest" or the stats::loadings function.

Value

A p × k matrix containing loadings, where p is the number of variables and k is the number of factors (nfactors).

Note

See stats::loadings for the original documentation.

Examples

results <- nest(ex_2factors, n = 100)
loadings(results)

Minimum average partial correlation (MAP)

Description

Minimum average partial correlation (MAP)

Usage

MAP(.data, ...)

Arguments

.data

a data frame, a numeric matrix, covariance matrix or correlation matrix from which to determine the number of factors.

...

further arguments for cor_nest().

Value

The number of factors to retain.
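The logic of the criterion can be sketched as follows, based on Velicer (1976): after partialling out the first m principal components, average the squared partial correlations and keep the m that minimises this average. The function map_sketch is hypothetical, takes a correlation matrix directly, searches m = 0, ..., p - 2 (a choice made for this sketch), and skips any m for which a variable is fully explained.

```r
# Minimum average partial correlation, squared version (a sketch).
map_sketch <- function(R) {
  p <- ncol(R)
  e <- eigen(R, symmetric = TRUE)
  A <- e$vectors %*% diag(sqrt(pmax(e$values, 0)))   # component loadings
  off <- row(R) != col(R)
  fm <- rep(NA_real_, p - 1)                          # m = 0, ..., p - 2
  for (m in 0:(p - 2)) {
    # partial covariance matrix after removing the first m components
    C <- if (m == 0) R else R - tcrossprod(A[, seq_len(m), drop = FALSE])
    d <- diag(C)
    if (any(d < 1e-8)) next      # partial correlations undefined: skip
    P <- C / sqrt(outer(d, d))   # partial correlations
    fm[m + 1] <- mean(P[off]^2)
  }
  list(nfactors = which.min(fm) - 1, avg.sq.partial = fm)
}

# Hypothetical 10-item, two-factor correlation matrix (loadings .8 and .7)
L <- cbind(c(rep(.8, 5), rep(0, 5)), c(rep(0, 5), rep(.7, 5)))
R <- tcrossprod(L)
diag(R) <- 1
res_map <- map_sketch(R)
res_map$nfactors
```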

References

Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41(3), 321-327. doi:10.1007/BF02293557

Examples

D <- genr8(n = 42, R = ex_4factors_corr)
MAP(D)

A correlation matrix from Meek-Bouchard.

Description

A sample correlation matrix composed of 44 items.

Usage

meek_bouchard

Format

A 44 by 44 correlation matrix.

Source

https://github.com/quantmeth

References

Meek-Bouchard, C. (personal communication).


Next Eigenvalue Sufficiency Test (NEST)

Description

nest is used to identify the number of factors to retain in exploratory factor analysis.

Usage

nest(
  .data,
  ...,
  n = NULL,
  nreps = 1000,
  alpha = 0.05,
  max.fact = TRUE,
  method = "ml",
  na.action = "fiml"
)

Arguments

.data

a data frame, a numeric matrix, covariance matrix or correlation matrix from which to determine the number of factors.

...

arguments for method that can be supplied. See details.

n

the number of cases (subjects, participants, or units) if a covariance matrix is supplied in .data.

nreps

the number of replications to simulate. Default is 1000.

alpha

a vector of type I error rates or (1-alpha)*100% confidence intervals. Default is .05.

max.fact

an optional maximum number of factors to extract. Default is TRUE, which uses the maximum number possible.

method

a method used to compute loadings and uniquenesses. Four methods are implemented in Rnest: maximum likelihood, method = "ml" (default); regularized common factor analysis, method = "rcfa"; minimum rank factor analysis, method = "mrfa"; and principal axis factoring, method = "paf". See details for custom methods.

na.action

how missing data should be handled. "na.omit" removes every row containing at least one missing value. "fiml" uses full information maximum likelihood to compute the correlation matrix. Other options are "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". Default is "fiml".

Details

The Next Eigenvalue Sufficiency Test (NEST) extends parallel analysis by adding a sequential hypothesis-testing procedure for every k = 1, ..., p factor until the hypothesis is not rejected.

At k = 1, NEST and parallel analysis are identical: both use an identity matrix as the correlation matrix. Once the first hypothesis is rejected, NEST uses a correlation matrix based on the loadings and uniquenesses of the k-th factorial structure. NEST then resamples the eigenvalues of this new correlation matrix. NEST stops when the k-th eigenvalue is within the confidence interval.

Four methods are already implemented in nest to extract loadings and uniquenesses: maximum likelihood ("ml"; default), principal axis factoring ("paf"), regularized common factor analysis ("rcfa"), and minimum rank factor analysis ("mrfa"). These functions take as arguments covmat, n, factors, and ... (supplementary arguments passed by nest). They return loadings and uniquenesses. Any other user-defined function can be used as long as it is programmed likewise.
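The sequential procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: paf_sketch is a hypothetical, simplified principal axis extractor that follows the covmat, n, factors, ... signature; nest_sketch is a hypothetical driver that counts from zero factors (the identity-matrix step), simulates from the model-implied correlation matrix, and stops at the first eigenvalue not exceeding the (1 - alpha) quantile of its simulated counterparts.

```r
# Hypothetical extraction method: iterated principal axis factoring.
paf_sketch <- function(covmat, n, factors, ..., iters = 50) {
  R <- cov2cor(covmat)
  h2 <- 1 - 1 / diag(solve(R))          # initial communalities (SMC)
  for (i in seq_len(iters)) {
    Rr <- R
    diag(Rr) <- h2                      # reduced correlation matrix
    e <- eigen(Rr, symmetric = TRUE)
    L <- e$vectors[, seq_len(factors), drop = FALSE] %*%
      diag(sqrt(pmax(e$values[seq_len(factors)], 0)), factors)
    h2 <- pmin(rowSums(L^2), 0.995)     # guard against Heywood cases
  }
  list(loadings = L, uniquenesses = 1 - h2)
}

# Hypothetical driver for the sequential test.
nest_sketch <- function(R, n, nreps = 200, alpha = 0.05,
                        method = paf_sketch, max.fact = ncol(R) - 2) {
  p <- ncol(R)
  obs <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
  for (k in 0:max.fact) {
    if (k == 0) {
      Rk <- diag(p)                     # same null model as parallel analysis
    } else {
      fit <- method(covmat = R, n = n, factors = k)
      Rk <- cov2cor(tcrossprod(fit$loadings) + diag(fit$uniquenesses))
    }
    U <- chol(Rk)
    # resample the (k + 1)-th eigenvalue under the k-factor model
    sim <- replicate(nreps, {
      X <- matrix(rnorm(n * p), n, p) %*% U
      eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values[k + 1]
    })
    if (obs[k + 1] <= quantile(sim, 1 - alpha)) return(list(nfactors = k))
  }
  list(nfactors = max.fact)
}

# Hypothetical 10-item, two-factor correlation matrix (loadings .8 and .7)
L <- cbind(c(rep(.8, 5), rep(0, 5)), c(rep(0, 5), rep(.7, 5)))
R2 <- tcrossprod(L)
diag(R2) <- 1
set.seed(42)
res_nest <- nest_sketch(R2, n = 100)
res_nest$nfactors
```

Because the extraction method is passed as a function, any other routine returning loadings and uniquenesses could be substituted for paf_sketch in this sketch.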

Value

nest() returns an object of class nest. The functions summary and plot are used to obtain and show a summary of the results.

An object of class nest is a list containing the following components:

  • nfactors - The number of factors to retain (one per alpha).

  • cor - The supplied correlation matrix.

  • n - The number of cases (subjects, participants, or units).

  • values - The eigenvalues of the supplied correlation matrix.

  • alpha - The type I error rate.

  • method - The method used to compute loadings and uniquenesses.

  • nreps - The number of replications used.

  • prob - Probabilities of each factor.

  • Eig - A list of simulated eigenvalues.

Generic function

plot.nest Scree plot of the eigenvalues and the simulated confidence intervals for alpha.

loadings Extract loadings. It does not overwrite stats::loadings.

summary.nest Summary statistics for the number of factors.

Author(s)

P.-O. Caron

References

Achim, A. (2017). Testing the number of required dimensions in exploratory factor analysis. The Quantitative Methods for Psychology, 13(1), 64-74. doi:10.20982/tqmp.13.1.p064

Examples

nest(ex_2factors, n = 100)
nest(mtcars)

Parallel analysis

Description

Parallel analysis

Usage

pa(
  data = NULL,
  n = NULL,
  nv = NULL,
  nreps = 1000,
  alpha = 0.05,
  crit = NULL,
  ...
)

Arguments

data

a data.frame.

n

the number of subjects.

nv

the number of variables.

nreps

the number of replications.

alpha

type I error rate.

crit

critical values to compare the eigenvalues.

...

other arguments

Value

nfactors (if data is supplied) and sampled eigenvalues
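A minimal base R sketch of the procedure, following Horn (1965), is given below. It assumes the input is a correlation matrix, simulates eigenvalues from uncorrelated normal data of the same size, and retains factors until the first observed eigenvalue that does not exceed its (1 - alpha) quantile. The name pa_sketch and the example matrix are hypothetical, and the exact structure returned by pa() may differ.

```r
# Parallel analysis on a correlation matrix (a sketch).
pa_sketch <- function(R, n, nreps = 200, alpha = 0.05) {
  p <- ncol(R)
  obs <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
  # p x nreps matrix of eigenvalues from uncorrelated normal data
  sim <- replicate(nreps, {
    X <- matrix(rnorm(n * p), n, p)
    eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values
  })
  crit <- apply(sim, 1, quantile, probs = 1 - alpha)   # critical values
  below <- which(obs <= crit)
  nfactors <- if (length(below)) below[1] - 1 else p
  list(nfactors = nfactors, crit = crit)
}

# Hypothetical 10-item, two-factor correlation matrix (loadings .8 and .7)
L <- cbind(c(rep(.8, 5), rep(0, 5)), c(rep(0, 5), rep(.7, 5)))
R <- tcrossprod(L)
diag(R) <- 1
set.seed(42)
res_pa <- pa_sketch(R, n = 100)
res_pa$nfactors
```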

References

Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179–185. doi:10.1007/BF02289447

Examples

pa(ex_2factors, n = 42)
pa(n = 10, nv = 2, nreps = 100)

Plot results of Next Eigenvalue Sufficiency Test (NEST)

Description

Scree plot of the eigenvalues and the (1-alpha)*100% confidence intervals derived from the resampled eigenvalues supplied to nest.

Usage

## S3 method for class 'nest'
plot(x, pa = FALSE, ...)

Arguments

x

an object of class "nest".

pa

show results of Parallel Analysis.

...

further arguments for other methods, ignored for "nest".

Value

A ggplot output.

Note

This function is more interesting with many alpha values.

Examples

results <- nest(ex_2factors, n = 100, alpha = c(.01, .05, .10))
plot(results)
# Return the data used to produce the plot
df <- plot(results)$data

Print results of NEST

Description

Print the number of factors to retain according to confidence levels.

Usage

## S3 method for class 'nest'
print(x, ...)

Arguments

x

an object of class "nest".

...

further arguments for other methods, ignored for "nest".

Value

No return value, called for side effects.

Examples

results <- nest(ex_2factors, n = 100)
print(results)

Remove unique variables

Description

Remove unique variables

Usage

remove_unique(.data, ..., alpha = 0.05)

Arguments

.data

a data frame, a numeric matrix, covariance matrix or correlation matrix from which to determine the number of factors.

...

further arguments for unique_variable() and cor_nest().

alpha

type I error rate.

Value

A list containing the unique variables, a data frame of their probabilities, and the .data with the unique variables removed.

Examples

remove_unique(ex_3factors_doub_unique, n = 420)

Split-Half Eigenvector Matching (SHEM)

Description

shem estimates the number of principal components via Split-Half Eigenvector Matching (SHEM).

Usage

shem(data, nIts = 30)

Arguments

data

a data frame, a numeric matrix, covariance matrix or correlation matrix from which to determine the number of factors.

nIts

number of iterations.

Value

shem returns a list containing the number of components (nfactors), whether the additional step in case of zero true latent components was carried out (zeroComponents), the eigenvalues, and the eigenvectors of the solution.

References

Gladwin, T. E. (2023). Estimating the number of principal components via Split-Half Eigenvector Matching (SHEM). MethodsX, 11, 102286. doi:10.1016/j.mex.2023.102286

Examples

jd <- genr8(n = 404, R = ex_4factors_corr)
shem(jd)

Summary results of NEST

Description

summary method for class "nest".

Usage

## S3 method for class 'nest'
summary(object, ...)

Arguments

object

an object of class "nest".

...

further arguments for other methods, ignored for "nest".

Value

No return value, called for side effects.

Examples

results <- nest(ex_2factors, n = 100)
summary(results)

A covariance matrix composed of 11 variables.

Description

A sample covariance matrix composed of 11 items based on two factors according to Tabachnick and Fidell (2019, see pp. 576-578). The first five variables are related to "Verbal IQ", the next five are related to "Performance IQ". The last variable, CODING, is unique. Loadings range between .39 and .76.

Usage

tabachnick_fidell2019

Format

An 11 by 11 covariance matrix.

Source

https://github.com/quantmeth

References

Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics. Allyn and Bacon. pp. 576-577.


Probability of unique variables

Description

Probability of unique variables

Usage

unique_variable(.data, n = NULL, ...)

Arguments

.data

a data frame, a numeric matrix, covariance matrix or correlation matrix from which to determine the number of factors.

n

the number of cases (subjects, participants, or units) if a covariance matrix is supplied in .data.

...

further arguments for cov_nest().

Value

A data frame containing the F-values and the probabilities of each variable being a unique variable.

Author(s)

André Achim (Matlab)

P.-O. Caron (R)

Examples

exData <- genr8(n = 420, R = ex_3factors_doub_unique)
unique_variable(exData)