Regularizers

This page describes KlFunctionInfo and *Regularizer classes.

See the detailed description of regularizers to understand what each one does.

class artm.KlFunctionInfo(function_type='log', power_value=2.0)
__init__(function_type='log', power_value=2.0)
Parameters:
  • function_type (str) – the type of function, ‘log’ (logarithm) or ‘pol’ (polynomial)
  • power_value (float) – the power of the polynomial, ignored if function_type = ‘log’
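The interaction between the two arguments can be sketched as follows. The helper check_kl_function_args is hypothetical and not part of artm; it only restates the rules from the parameter list above (two accepted function types, with power_value disregarded for the logarithm).

```python
def check_kl_function_args(function_type='log', power_value=2.0):
    """Hypothetical helper (not part of artm): validate KlFunctionInfo arguments.

    Returns the keyword arguments plus a flag telling whether power_value is used.
    """
    if function_type not in ('log', 'pol'):
        raise ValueError("function_type must be 'log' or 'pol'")
    kwargs = {'function_type': function_type, 'power_value': float(power_value)}
    power_is_used = function_type == 'pol'  # the power is ignored for the logarithm
    return kwargs, power_is_used


log_kwargs, log_uses_power = check_kl_function_args('log', power_value=3.0)
pol_kwargs, pol_uses_power = check_kl_function_args('pol', power_value=3.0)
```

With BigARTM installed, either dict could be unpacked into the documented constructor, for example artm.KlFunctionInfo(**pol_kwargs).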
class artm.SmoothSparsePhiRegularizer(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, dictionary=None, kl_function_info=None, config=None)
__init__(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, dictionary=None, kl_function_info=None, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • gamma (float) – the coefficient of relative regularization for this regularizer
  • class_ids (list of str) – list of class_ids to regularize, will regularize all classes if not specified
  • topic_names (list of str) – list of names of topics to regularize, will regularize all topics if not specified
  • dictionary (str or reference to Dictionary object) – BigARTM collection dictionary, won’t use dictionary if not specified
  • kl_function_info (KlFunctionInfo object) – object describing the function applied under the KL-divergence in the regularizer
  • config (protobuf object) – the low-level config of this regularizer
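A minimal usage sketch for the constructor above, assuming the usual BigARTM convention that a positive tau smooths and a negative tau sparses. The topic names, regularizer names, and tau values are illustrative assumptions, and the import is guarded so the snippet also runs where BigARTM is not installed.

```python
# Illustrative topic names; in practice these would come from the model.
topic_names = ['topic_{}'.format(i) for i in range(10)]
background_topics = topic_names[:2]
domain_topics = topic_names[2:]

# Keyword arguments follow the documented constructor signature.
smooth_kwargs = dict(name='smooth_phi_background', tau=0.1,
                     topic_names=background_topics)
sparse_kwargs = dict(name='sparse_phi_domain', tau=-0.1,
                     topic_names=domain_topics)

try:
    import artm
except ImportError:
    artm = None  # BigARTM is not installed; only the argument dicts are built

regularizers = []
if artm is not None:
    regularizers = [
        artm.SmoothSparsePhiRegularizer(**smooth_kwargs),
        artm.SmoothSparsePhiRegularizer(**sparse_kwargs),
    ]
```

Splitting the topics this way lets one regularizer smooth a few background topics while the other sparses the remaining ones; each object would then typically be attached to a model with model.regularizers.add(...).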
class artm.SmoothSparseThetaRegularizer(name=None, tau=1.0, topic_names=None, alpha_iter=None, kl_function_info=None, config=None)
__init__(name=None, tau=1.0, topic_names=None, alpha_iter=None, kl_function_info=None, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • alpha_iter (list of str) – list of additional regularization coefficients, one for each iteration over a document. Its length should equal model.num_document_passes
  • topic_names (list of str) – list of names of topics to regularize, will regularize all topics if not specified
  • kl_function_info (KlFunctionInfo object) – object describing the function applied under the KL-divergence in the regularizer
  • config (protobuf object) – the low-level config of this regularizer
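The length requirement on alpha_iter can be sketched as below. The helper make_alpha_iter and the geometric decay schedule are hypothetical, not part of artm. The coefficients are written as floats, which is an assumption, since the parameter list gives the element type as str.

```python
def make_alpha_iter(num_document_passes, start=1.0, decay=0.5):
    """Hypothetical helper (not part of artm): one coefficient per document pass."""
    if num_document_passes < 1:
        raise ValueError('num_document_passes must be positive')
    return [start * decay ** i for i in range(num_document_passes)]


num_document_passes = 5  # assumed to match model.num_document_passes
alpha_iter = make_alpha_iter(num_document_passes)

# Keyword arguments follow the documented constructor signature.
theta_kwargs = dict(name='sparse_theta', tau=-0.5, alpha_iter=alpha_iter)
```

With BigARTM installed, theta_kwargs could be unpacked into artm.SmoothSparseThetaRegularizer(**theta_kwargs).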
class artm.DecorrelatorPhiRegularizer(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, config=None)
__init__(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • gamma (float) – the coefficient of relative regularization for this regularizer
  • class_ids (list of str) – list of class_ids to regularize, will regularize all classes if not specified
  • topic_names (list of str) – list of names of topics to regularize, will regularize all topics if not specified
  • config (protobuf object) – the low-level config of this regularizer
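To illustrate what a decorrelating regularizer works against, here is a sketch assuming the standard ARTM decorrelation criterion, which penalizes the sum of pairwise inner products between topic columns of Phi. The helper topic_correlation and the toy matrices are hypothetical and not part of artm; the library's exact computation, including how gamma enters, is not reproduced here.

```python
def topic_correlation(phi):
    """Sum of inner products over all pairs of distinct topic columns.

    phi is a list of rows (one per token); each row holds per-topic probabilities.
    """
    num_topics = len(phi[0])
    total = 0.0
    for t in range(num_topics):
        for s in range(t + 1, num_topics):
            total += sum(row[t] * row[s] for row in phi)
    return total


# Two topics sharing the same tokens versus two topics with disjoint tokens.
overlapping = [[0.5, 0.5], [0.5, 0.5]]
disjoint = [[1.0, 0.0], [0.0, 1.0]]

overlap_score = topic_correlation(overlapping)
disjoint_score = topic_correlation(disjoint)
```

Under this criterion the overlapping matrix scores higher than the disjoint one, so it would be penalized more.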
class artm.LabelRegularizationPhiRegularizer(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, dictionary=None, config=None)
__init__(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, dictionary=None, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • gamma (float) – the coefficient of relative regularization for this regularizer
  • class_ids (list of str) – list of class_ids to regularize, will regularize all classes if not specified
  • topic_names (list of str) – list of names of topics to regularize, will regularize all topics if not specified
  • dictionary (str or reference to Dictionary object) – BigARTM collection dictionary, won’t use dictionary if not specified
  • config (protobuf object) – the low-level config of this regularizer
class artm.SpecifiedSparsePhiRegularizer(name=None, tau=1.0, gamma=None, topic_names=None, class_id=None, num_max_elements=None, probability_threshold=None, sparse_by_columns=True, config=None)
__init__(name=None, tau=1.0, gamma=None, topic_names=None, class_id=None, num_max_elements=None, probability_threshold=None, sparse_by_columns=True, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • gamma (float) – the coefficient of relative regularization for this regularizer
  • class_id (str) – class_id to regularize
  • topic_names (list of str) – list of names of topics to regularize, will regularize all topics if not specified
  • num_max_elements (int) – number of elements to keep in each row/column
  • probability_threshold (float) – if m elements in a row/column sum to a value >= probability_threshold, with m < n, only these m elements are kept. The value should be in (0, 1), default=None
  • sparse_by_columns (bool) – whether to find the max elements in each column (True) or in each row (False)
  • config (protobuf object) – the low-level config of this regularizer
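One reading of how num_max_elements and probability_threshold combine is sketched below for a single row or column. The helper select_kept_indices is hypothetical and not part of artm. It assumes that elements are taken in decreasing order and that selection stops early once the running sum reaches the threshold; the library's actual tie-breaking and any renormalization are not covered.

```python
def select_kept_indices(values, num_max_elements, probability_threshold=None):
    """Hypothetical helper (not part of artm): indices kept in one row or column."""
    # Visit elements from largest to smallest value.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order[:num_max_elements]:
        kept.append(i)
        cumulative += values[i]
        # Stop early once the kept elements already cover the threshold.
        if probability_threshold is not None and cumulative >= probability_threshold:
            break
    return sorted(kept)


column = [0.5, 0.25, 0.125, 0.0625, 0.0625]
kept_by_count = select_kept_indices(column, num_max_elements=4)
kept_by_threshold = select_kept_indices(column, num_max_elements=4,
                                        probability_threshold=0.7)
```

In this toy column the two largest elements already sum to 0.75, so with a threshold of 0.7 only those two are kept even though num_max_elements allows four.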
class artm.ImproveCoherencePhiRegularizer(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, dictionary=None, config=None)
__init__(name=None, tau=1.0, gamma=None, class_ids=None, topic_names=None, dictionary=None, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • gamma (float) – the coefficient of relative regularization for this regularizer
  • class_ids (list of str) – list of class_ids to regularize, will regularize all classes if not specified
  • topic_names (list of str) – list of names of topics to regularize, will regularize all topics if not specified
  • dictionary (str or reference to Dictionary object) – BigARTM collection dictionary, which should contain pairwise token co-occurrence info; won’t use dictionary if not specified, in which case the regularizer has no effect
  • config (protobuf object) – the low-level config of this regularizer
class artm.SmoothPtdwRegularizer(name=None, tau=1.0, config=None)
__init__(name=None, tau=1.0, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • config (protobuf object) – the low-level config of this regularizer
class artm.TopicSelectionThetaRegularizer(name=None, tau=1.0, topic_names=None, alpha_iter=None, config=None)
__init__(name=None, tau=1.0, topic_names=None, alpha_iter=None, config=None)
Parameters:
  • name (str) – the identifier of regularizer, will be auto-generated if not specified
  • tau (float) – the coefficient of regularization for this regularizer
  • alpha_iter (list of str) – list of additional regularization coefficients, one for each iteration over a document. Its length should equal model.num_document_passes
  • topic_names (list of str) – list of names of topics to regularize, will regularize all topics if not specified
  • config (protobuf object) – the low-level config of this regularizer