PlnNetworkCollection

For an in-depth tutorial on the PlnNetworkCollection model, see the network analysis tutorial.

PlnNetworkCollection Documentation

class pyPLNmodels.PlnNetworkCollection(endog, *, exog=None, offsets=None, compute_offsets_method='zero', add_const=True, penalties=(1, 10, 100, 1000), penalty_coef=0, penalty_coef_type='lasso')[source]

A collection of PlnNetwork models, each fitted with a different penalty. For more details, see J. Chiquet, S. Robin, M. Mariadassou, “Variational Inference for sparse network reconstruction from count data”.

Unlike in PlnNetwork, the coefficient penalty (penalty_coef) cannot be changed at fitting time.

Examples

>>> from pyPLNmodels import PlnNetworkCollection, load_scrna
>>> data = load_scrna()
>>> nets = PlnNetworkCollection.from_formula("endog ~ 1", data=data, penalties=[1, 10, 100])
>>> nets.fit()
>>> print(nets)
>>> nets.show()
>>> print(nets.best_model())
>>> print(nets[10])

Examples

>>> from pyPLNmodels import PlnNetworkCollection, load_scrna
>>> data = load_scrna()
>>> nets = PlnNetworkCollection(endog=data["endog"], penalties=[1, 10, 100])
>>> nets.fit()
>>> print(nets.best_model())

classmethod from_formula(formula, data, *, compute_offsets_method='zero', penalties=(1, 10, 100, 1000), penalty_coef=0, penalty_coef_type='lasso')[source]

Create an instance from a formula and data.

Parameters:

formula (str) – The formula.
data (dict) – The data dictionary. Each value can be either a torch.Tensor, np.ndarray, pd.DataFrame or pd.Series. The categorical exogenous variables should be 1-dimensional.
compute_offsets_method (str, optional(keyword-only)) – Method to compute offsets if not provided. Options are:
- "zero": sets the offsets to zero.
- "logsum": takes the logarithm of the sum (per row) of the counts.
Ignored if data["offsets"] is not None.
penalties (Iterable[float], optional(keyword-only)) – The penalties to be tested. By default (1, 10, 100, 1000).
penalty_coef (float) – The penalty parameter for the coef matrix. The larger the penalty, the sparser the coef matrix. Default is 0 (no penalty).
penalty_coef_type (optional ("lasso", "group_lasso", "sparse_group_lasso")) – The penalty type for the coef. Ignored if penalty_coef is 0. Can be either:
- "lasso": Enforces sparsity on each coefficient independently, encouraging many coefficients to be exactly zero.
- "group_lasso": Enforces group sparsity, encouraging entire groups of coefficients (e.g., corresponding to a covariate) to be zero.
- "sparse_group_lasso": Combines the effects of "lasso" and "group_lasso", enforcing both individual and group sparsity.
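
To make the three penalty_coef_type options concrete, the sketch below computes the value each penalty would take on a small coefficient matrix. It assumes that rows index covariates (one group per covariate) and that the sparse group lasso is the unweighted sum of the other two terms. The function names and the weighting are illustrative assumptions, not the library's implementation.

```python
import math


def lasso_penalty(coef):
    """Sum of the absolute values of every coefficient."""
    return sum(abs(x) for row in coef for x in row)


def group_lasso_penalty(coef):
    """Sum of the Euclidean norms of the rows (assumed: one group per covariate)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in coef)


def sparse_group_lasso_penalty(coef):
    """Assumed unweighted sum of the lasso and group lasso terms."""
    return lasso_penalty(coef) + group_lasso_penalty(coef)


# Two covariates (rows), two variables (columns); the second covariate is inactive.
coef = [[3.0, 4.0], [0.0, 0.0]]
print(lasso_penalty(coef))               # 7.0
print(group_lasso_penalty(coef))         # 5.0
print(sparse_group_lasso_penalty(coef))  # 12.0
```

The all-zero row contributes nothing to either term, which is why the group penalty tends to switch off whole covariates while the lasso term acts entry by entry.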

property precision: Dict[float, Tensor]

Property representing the precision of each model in the collection.

Returns:: The precision for each model.
Return type:: Dict[int, torch.Tensor]

property penalties

Property representing the penalties of each model in the collection.

Returns:: The penalties.
Return type:: List[float]

fit(maxiter=400, lr=0.01, tol=1e-06, verbose=False)[source]

Fit each model in the collection.

Parameters:

maxiter (int, optional) – The maximum number of iterations to be done, by default 400.
lr (float, optional(keyword-only)) – The learning rate, by default 0.01.
tol (float, optional(keyword-only)) – The tolerance, by default 1e-6.
verbose (bool, optional(keyword-only)) – Whether to print verbose output, by default False.

Return type:

PlnNetworkCollection

Examples

>>> from pyPLNmodels import PlnNetworkCollection, load_scrna
>>> data = load_scrna()
>>> nets = PlnNetworkCollection(endog=data["endog"], penalties=[1, 10, 100])
>>> nets.fit()

property latent_mean: Dict[int, Tensor]

Property representing the latent mean, for each model in the collection.

Returns:: The latent means.
Return type:: Dict[int, torch.Tensor]

property latent_variance: Dict[int, Tensor]

Property representing the latent variance, for each model in the collection.

Returns:: The latent variances.
Return type:: Dict[int, torch.Tensor]

best_model(criterion='BIC')[source]

Get the best model according to the specified criterion.

Parameters:: criterion (str, optional) – The criterion to use (‘AIC’ or ‘BIC’), by default ‘BIC’.
Returns:: The best model.
Return type:: Any
Return type:: PlnNetworkCollection

Examples

>>> from pyPLNmodels import PlnNetworkCollection, load_scrna
>>> data = load_scrna()
>>> nets = PlnNetworkCollection(endog=data["endog"], penalties=[1, 10, 100])
>>> nets.fit()
>>> print(nets.best_model())

property components_prec: Dict[float, Tensor]

Property representing the components of the precision matrix for each model in the collection.

Returns:: The components of the precision.
Return type:: Dict[int, torch.Tensor]

property nb_links: Number of links of each model.

show(figsize=(15, 10))[source]

Show a plot with BIC scores, AIC scores, and negative log-likelihoods of the models. Also show the number of links in the model. The AIC and BIC criteria might not always provide meaningful guidance for selecting the penalty. Instead, we recommend focusing on the desired number of links.

Parameters:: figsize (tuple of two positive floats) – Size of the figure that will be created. By default (15, 10).
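
The recommendation to focus on the desired number of links can be turned into a simple selection rule. The sketch below picks the penalty whose link count is closest to a target. It assumes the link counts are available as a mapping from penalty to count; the helper `closest_penalty` and the numbers are made up for illustration.

```python
def closest_penalty(nb_links, target):
    """Return the penalty whose link count is closest to target.

    Ties are broken in favour of the smaller penalty.
    """
    return min(nb_links, key=lambda pen: (abs(nb_links[pen] - target), pen))


# Made-up link counts: larger penalties give sparser networks.
nb_links = {1: 120, 10: 45, 100: 8, 1000: 0}
print(closest_penalty(nb_links, target=40))  # 10
```

The selected penalty can then be used to index the collection, as in the `nets[10]` example at the top of this page.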

property AIC: Dict[int, int]

Property representing the AIC scores of the models in the collection.

Returns:: The AIC scores of the models.
Return type:: Dict[int, float]

property BIC: Dict[int, int]

Property representing the BIC scores of the models in the collection.

Returns:: The BIC scores of the models.
Return type:: Dict[int, float]

property ICL: Dict[int, int]

Property representing the ICL scores of the models in the collection.

Returns:: The ICL scores of the models.
Return type:: Dict[int, float]

property coef: Dict[int, Tensor]

Property representing the coefficients of the collection.

Returns:: The coefficients.
Return type:: Dict[float, torch.Tensor]

property dim: int: Number of dimensions (i.e. variables) of the dataset.

property endog: Tensor

Property representing the endogenous variables (counts).

Returns:: The endogenous variables.
Return type:: torch.Tensor

property exog: Tensor

Property representing the exogenous variables (covariates).

Returns:: The exogenous variables or None if no covariates are given in the model.
Return type:: torch.Tensor or None

get(key, default)

Get the model with the specified key, or return a default value if the key does not exist.

Parameters:

key (Any) – The key to search for.
default (Any) – The default value to return if the key does not exist.

Returns:

The model with the specified key, or the default value if the key does not exist.

Return type:

Any
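
The behaviour described for get, together with keys(), items(), values() and indexing, resembles that of a Python dictionary keyed by grid value. The sketch below uses a hypothetical stand-in class, `ToyCollection`, to illustrate that access pattern without fitting any model. It is not the library's implementation, and the stored strings merely stand in for fitted models.

```python
class ToyCollection:
    """Hypothetical stand-in mimicking the mapping-style access documented here."""

    def __init__(self, penalties):
        # Each "model" is only a label; real entries would be fitted models.
        self._models = {pen: f"model(penalty={pen})" for pen in penalties}

    def __getitem__(self, key):
        return self._models[key]

    def __contains__(self, key):
        return key in self._models

    def get(self, key, default):
        # Return the stored model if the key exists, otherwise the default.
        return self._models[key] if key in self else default

    def keys(self):
        return self._models.keys()

    def items(self):
        return self._models.items()

    def values(self):
        return self._models.values()


nets = ToyCollection([1, 10, 100])
print(nets.get(10, None))      # model(penalty=10)
print(nets.get(5, "missing"))  # missing
print(list(nets.keys()))       # [1, 10, 100]
```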

property grid: List[float]

Property representing the grid given in initialization.

Returns:: The grid.
Return type:: List[float]

items()

Get the key-value pairs of the models in the collection.

Returns:: The key-value pairs of the models.
Return type:: ItemsView

keys()

Get the grid of the collection.

Returns:: The grid of the collection.
Return type:: KeysView

property loglike: Dict[int, float]

Property representing the log-likelihoods of the models in the collection.

Returns:: The log-likelihoods of the models.
Return type:: Dict[int, float]

property n_samples: Number of samples in the dataset.

property nb_cov: int: The number of exogenous variables.

property offsets: Tensor

Property representing the offsets.

Returns:: The offsets.
Return type:: torch.Tensor
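
When no offsets are supplied, they are derived according to compute_offsets_method. The sketch below illustrates the two documented options on a small count table stored as nested lists. Repeating the per-row value across all columns is an assumption about the resulting shape, and the helper names are illustrative rather than part of the library.

```python
import math


def zero_offsets(endog):
    """Offsets of zero with the same shape as the count table."""
    return [[0.0] * len(row) for row in endog]


def logsum_offsets(endog):
    """Log of each row's total count, repeated across that row's columns."""
    offsets = []
    for row in endog:
        total = sum(row)  # a row with total 0 would make math.log raise ValueError
        offsets.append([math.log(total)] * len(row))
    return offsets


endog = [[1, 2, 3], [4, 0, 0]]
print(zero_offsets(endog))       # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(logsum_offsets(endog)[0])  # three copies of log(6)
```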

values()

Models in the collection as a list.

Returns:: The models in the collection.
Return type:: ValuesView

List of methods and attributes

Public Data Attributes:

`precision`	Property representing the precision of each model in the collection.
`penalties`	Property representing the penalties of each model in the collection.
`latent_mean`	Property representing the latent mean, for each model in the collection.
`latent_variance`	Property representing the latent variance, for each model in the collection.
`components_prec`	Property representing the components of the precision matrix for each model in the collection.
`nb_links`	Number of links of each model.

Inherited from Collection

`exog`	Property representing the exogenous variables (covariates).
`offsets`	Property representing the offsets.
`endog`	Property representing the endogenous variables (counts).
`n_samples`	Number of samples in the dataset.
`grid`	Property representing the grid given in initialization.
`coef`	Property representing the coefficients of the collection.
`dim`	Number of dimensions (i.e. variables) of the dataset.
`nb_cov`	The number of exogenous variables.
`BIC`	Property representing the BIC scores of the models in the collection.
`ICL`	Property representing the ICL scores of the models in the collection.
`AIC`	Property representing the AIC scores of the models in the collection.
`loglike`	Property representing the log-likelihoods of the models in the collection.
`PlnModel`

Public Methods:

`__init__`(endog, *[, exog, offsets, ...])	Initializes the collection.
`from_formula`(formula, data, *[, ...])	Create an instance from a formula and data.
`fit`([maxiter, lr, tol, verbose])	Fit each model in the collection.
`best_model`([criterion])	Get the best model according to the specified criterion.
`show`([figsize])	Show a plot with BIC scores, AIC scores, and negative log-likelihoods of the models.

Inherited from Collection

`__init__`(endog, grid, *[, exog, offsets, ...])	Initializes the collection.
`from_formula`(formula, data, grid, *[, ...])	Create an instance from a formula and data.
`values`()	Models in the collection as a list.
`items`()	Get the key-value pairs of the models in the collection.
`__getitem__`(grid_value)	Model with the specified grid_value.
`__len__`()	Number of models in the collection.
`__iter__`()	Iterate over the models in the collection.
`__contains__`(grid_value)	Check if a model with the specified grid_value exists in the collection.
`keys`()	Get the grid of the collection.
`get`(key, default)	Get the model with the specified key, or return a default value if the key does not exist.
`fit`([maxiter, lr, tol, verbose])	Fit each model in the collection.
`best_model`([criterion])	Get the best model according to the specified criterion.
`show`([figsize])	Show a plot with BIC scores, AIC scores, and negative log-likelihoods of the models.
`__repr__`()	Return a string representation of the Collection object.

Private Data Attributes:

`_grid_value_name`
`_abc_impl`

Inherited from Collection

`_useful_methods_strings`
`_useful_attributes_string`
`_name`
`_abc_impl`

Inherited from ABC

`_abc_impl`

Private Methods:

`_instantiate_model`(grid_value)
`_is_right_instance`(grid_value)
`_init_next_model_with_current_model`(...)	Initialize the next PlnModel model with the parameters of the current PlnModel model.

Inherited from Collection

`_init_models`(grid)	Method for initializing the models.
`_is_right_instance`(grid_value)
`_set_column_names`(model)
`_instantiate_model`(grid_value)
`_print_beginning_message`()
`_init_next_model_with_current_model`(...)	Initialize the next PlnModel model with the parameters of the current PlnModel model.
`_print_ending_message`()
`_best_grid_value`(criterion)