Scores

This page describes the *Score classes.

See Scores Description for a detailed explanation of what each score measures.

class artm.SparsityPhiScore(name=None, class_id=None, topic_names=None, model_name=None, eps=None, config=None)
__init__(name=None, class_id=None, topic_names=None, model_name=None, eps=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • class_id (str) – class_id to score
  • topic_names (list of str or str or None) – list of names or a single name of the topics to score; all topics are scored if empty or None
  • model_name – phi-like matrix to be scored (typically ‘pwt’ or ‘nwt’); ‘pwt’ if not specified
  • eps (float) – the tolerance constant; every value < eps is considered to be zero
  • config (protobuf object) – the low-level config of this score
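
To make the role of eps concrete, here is a minimal pure-Python sketch of a sparsity measure: the fraction of matrix entries that fall below the tolerance. The function name `sparsity` and the toy matrix are illustrative assumptions, not part of the artm API, and the sketch does not claim to reproduce the library's internal computation.

```python
def sparsity(matrix, eps):
    """Return the fraction of entries treated as zero (value < eps).

    `matrix` is a list of rows; this helper is hypothetical and only
    illustrates how a tolerance constant turns tiny values into zeros.
    """
    values = [value for row in matrix for value in row]
    if not values:
        return 0.0
    zeros = sum(1 for value in values if value < eps)
    return zeros / len(values)


# Toy phi-like matrix: rows are tokens, columns are topics.
toy_phi = [
    [0.5, 0.0],
    [1e-40, 0.3],
    [0.5, 0.7],
]

# Both the exact zero and the 1e-40 entry fall below eps and count as zeros.
toy_sparsity = sparsity(toy_phi, eps=1e-37)
```

With a larger eps more entries are counted as zero, so the reported sparsity can only grow.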
class artm.ItemsProcessedScore(name=None, config=None)
__init__(name=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • config (protobuf object) – the low-level config of this score
class artm.PerplexityScore(name=None, transaction_typenames=None, class_ids=None, dictionary=None, config=None)
__init__(name=None, transaction_typenames=None, class_ids=None, dictionary=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • transaction_typenames (list of str) – transaction_typenames to score; None means that tokens of all transaction_typenames will be used
  • class_ids (list of str) – class_ids to score; None means that tokens of all class_ids will be used
  • dictionary (str or reference to Dictionary object) – BigARTM collection dictionary; strongly recommended, so that zero counters are replaced correctly
  • config (protobuf object) – the low-level config of this score
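
For orientation, the following sketch computes perplexity using its standard textbook definition, exp(-L / n), where L is the log-likelihood of the collection and n is the total token count. The function name and the nested-dict data layout are assumptions made for illustration. The sketch requires every p(w|d) to be positive and does not reproduce how the library replaces zero counters or handles class_ids and transaction_typenames.

```python
import math


def perplexity(counts, phi, theta):
    """Standard perplexity of a topic model on a small collection.

    counts[d][w] = number of occurrences of token w in document d
    phi[w][t]    = p(w|t)
    theta[d][t]  = p(t|d)
    All three layouts are hypothetical and chosen for readability.
    """
    log_likelihood = 0.0
    total_tokens = 0
    for doc, doc_counts in counts.items():
        for token, n_dw in doc_counts.items():
            # p(w|d) = sum over topics of p(w|t) * p(t|d)
            p_wd = sum(phi[token][t] * theta[doc][t] for t in theta[doc])
            log_likelihood += n_dw * math.log(p_wd)
            total_tokens += n_dw
    return math.exp(-log_likelihood / total_tokens)


# One topic that is uniform over a four-token vocabulary.
toy_phi = {w: {"topic_0": 0.25} for w in ("a", "b", "c", "d")}
toy_theta = {"doc_0": {"topic_0": 1.0}}
toy_counts = {"doc_0": {"a": 2, "b": 1}}

toy_perplexity = perplexity(toy_counts, toy_phi, toy_theta)
```

A model that spreads probability uniformly over V tokens has perplexity V, which makes the toy case easy to check by hand.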
class artm.SparsityThetaScore(name=None, topic_names=None, eps=None, config=None)
__init__(name=None, topic_names=None, eps=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • topic_names (list of str or str or None) – list of names or a single name of the topics to score; all topics are scored if empty or None
  • eps (float) – the tolerance constant; every value < eps is considered to be zero
  • config (protobuf object) – the low-level config of this score
class artm.ThetaSnippetScore(name=None, item_ids=None, num_items=None, config=None)
__init__(name=None, item_ids=None, num_items=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • item_ids (list of int) – list of ids of the items to show, default=None
  • num_items (int) – number of theta vectors to show from the beginning (has no effect if item_ids is given)
  • config (protobuf object) – the low-level config of this score
class artm.TopicKernelScore(name=None, class_id=None, topic_names=None, eps=None, dictionary=None, probability_mass_threshold=None, config=None)
__init__(name=None, class_id=None, topic_names=None, eps=None, dictionary=None, probability_mass_threshold=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • class_id (str) – class_id to score
  • topic_names (list of str or str or None) – list of names or a single name of the topics to score; all topics are scored if empty or None
  • probability_mass_threshold (float) – the threshold on p(t|w) for a token to enter the topic kernel; should be in (0, 1)
  • dictionary (str or reference to Dictionary object) – BigARTM collection dictionary; no dictionary is used if not specified
  • eps (float) – the tolerance constant; every value < eps is considered to be zero
  • config (protobuf object) – the low-level config of this score
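
The role of probability_mass_threshold can be sketched in a few lines: a token joins the kernel of a topic when its p(t|w) reaches the threshold. The helper below is hypothetical, takes p(t|w) as a ready-made input, and assumes a non-strict comparison (>=); the source does not say whether the library's comparison is strict.

```python
def topic_kernels(p_tw, probability_mass_threshold):
    """Group tokens into per-topic kernels.

    p_tw[token][topic] = p(t|w). A token enters the kernel of every topic
    whose p(t|w) is at least the threshold (assumed comparison: >=).
    """
    kernels = {}
    for token, topic_probs in p_tw.items():
        for topic, probability in topic_probs.items():
            if probability >= probability_mass_threshold:
                kernels.setdefault(topic, []).append(token)
    return kernels


toy_p_tw = {
    "neuron": {"biology": 0.9, "sports": 0.1},
    "goal": {"biology": 0.2, "sports": 0.8},
    "time": {"biology": 0.5, "sports": 0.5},
}

# "time" is split evenly between topics, so it enters neither kernel at 0.6.
toy_kernels = topic_kernels(toy_p_tw, probability_mass_threshold=0.6)
```

Raising the threshold shrinks the kernels towards tokens that belong almost exclusively to one topic.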
class artm.TopTokensScore(name=None, class_id=None, topic_names=None, num_tokens=None, dictionary=None, config=None)
__init__(name=None, class_id=None, topic_names=None, num_tokens=None, dictionary=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • class_id (str) – class_id to score
  • topic_names (list of str or str or None) – list of names or a single name of the topics to score; all topics are scored if empty or None
  • num_tokens (int) – number of tokens with the highest probability to take from each topic
  • dictionary (str or reference to Dictionary object) – BigARTM collection dictionary; no dictionary is used if not specified
  • config (protobuf object) – the low-level config of this score
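
As an illustration of what num_tokens selects, the sketch below picks the highest-probability tokens of each topic from a small p(w|t) table. The function name and the nested-dict layout are assumptions for this example only; the library's own output format is not reproduced here.

```python
def top_tokens(phi, num_tokens):
    """Return the `num_tokens` most probable tokens for every topic.

    phi[topic][token] = p(w|t); a hypothetical layout for illustration.
    """
    return {
        topic: sorted(token_probs, key=token_probs.get, reverse=True)[:num_tokens]
        for topic, token_probs in phi.items()
    }


toy_phi = {
    "topic_0": {"cat": 0.5, "dog": 0.3, "fish": 0.2},
    "topic_1": {"cat": 0.1, "dog": 0.2, "fish": 0.7},
}

toy_top = top_tokens(toy_phi, num_tokens=2)
```

If a topic has fewer than num_tokens tokens, the slice simply returns all of them.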
class artm.TopicMassPhiScore(name=None, class_ids=None, topic_names=None, model_name=None, eps=None, config=None)
__init__(name=None, class_ids=None, topic_names=None, model_name=None, eps=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • class_ids (list of str) – class_ids to score; None means that tokens of all class_ids will be used
  • topic_names (list of str or str or None) – list of names or a single name of the topics to score; all topics are scored if empty or None
  • model_name – phi-like matrix to be scored (typically ‘pwt’ or ‘nwt’); ‘pwt’ if not specified
  • eps (float) – the tolerance constant; every value < eps is considered to be zero
  • config (protobuf object) – the low-level config of this score
class artm.ClassPrecisionScore(name=None, config=None)
__init__(name=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • config (protobuf object) – the low-level config of this score
class artm.BackgroundTokensRatioScore(name=None, class_id=None, delta_threshold=None, save_tokens=None, direct_kl=None, config=None)
__init__(name=None, class_id=None, delta_threshold=None, save_tokens=None, direct_kl=None, config=None)
Parameters:
  • name (str) – the identifier of the score; auto-generated if not specified
  • class_id (str) – class_id to score
  • delta_threshold (float) – the threshold on the KL divergence between p(t|w) and p(t) for a token to count as background; should be non-negative
  • save_tokens (bool) – whether to save background tokens; they are saved if the field is not specified
  • direct_kl (bool) – whether to use KL(p(t) || p(t|w)) or vice versa; true if the field is not specified
  • config (protobuf object) – the low-level config of this score
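
The two divergence directions selected by direct_kl can be sketched as follows. The helper names are hypothetical, and the mapping of direct_kl=True to KL(p(t) || p(t|w)) is an assumption based on the parameter description above. The sketch stops at the divergence itself; the comparison with delta_threshold is left out because this page does not spell it out.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a zero-probability term contributes nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


def token_divergence(p_t, p_tw, direct_kl=True):
    """Divergence between p(t) and p(t|w) for one token.

    Assumption: direct_kl=True means KL(p(t) || p(t|w)),
    direct_kl=False means KL(p(t|w) || p(t)).
    """
    if direct_kl:
        return kl_divergence(p_t, p_tw)
    return kl_divergence(p_tw, p_t)


toy_p_t = [0.5, 0.5]    # topic distribution over the whole collection
toy_p_tw = [0.9, 0.1]   # topic distribution for one specific token

direct = token_divergence(toy_p_t, toy_p_tw, direct_kl=True)
reverse = token_divergence(toy_p_t, toy_p_tw, direct_kl=False)
```

KL divergence is not symmetric, so the two settings generally give different values for the same token, and a delta_threshold tuned for one direction need not suit the other.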