Scores

This page describes the score classes.

See Scores Description for a detailed description of the scores and what they measure.

class artm.SparsityPhiScore(name=None, class_id=None, topic_names=None, model_name=None, eps=None, config=None)

__init__(name=None, class_id=None, topic_names=None, model_name=None, eps=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
class_id (str) – class_id to score
topic_names (list of str or str or None) – list of names or single name of topic to score, will score all topics if empty or None
model_name – phi-like matrix to be scored (typically ‘pwt’ or ‘nwt’), ‘pwt’ if not specified
eps (float) – the tolerance constant; every value < eps is considered to be zero
config (protobuf object) – the low-level config of this score
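
As a rough illustration of the role of eps, the sketch below computes the fraction of near-zero entries in a small phi-like matrix. It is plain Python, not BigARTM code: the helper name `sparsity` and the nested-list matrix layout are assumptions made for this example, and the library's own computation may differ in details.

```python
def sparsity(matrix, eps):
    """Return the fraction of entries in `matrix` that are below `eps`."""
    total = 0
    zeros = 0
    for row in matrix:
        for value in row:
            total += 1
            if value < eps:  # treated as zero, following the eps description
                zeros += 1
    return zeros / total if total else 0.0


# Toy matrix: rows are tokens, columns are topics (hypothetical data).
phi = [
    [0.5, 0.0],
    [0.5, 1e-9],
    [0.0, 1.0],
]
print(sparsity(phi, eps=1e-5))
```

With a larger eps more entries count as zero, so the reported sparsity can only grow.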

class artm.ItemsProcessedScore(name=None, config=None)

__init__(name=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
config (protobuf object) – the low-level config of this score

class artm.PerplexityScore(name=None, transaction_typenames=None, class_ids=None, dictionary=None, config=None)

__init__(name=None, transaction_typenames=None, class_ids=None, dictionary=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
transaction_typenames (list of str) – transaction_typenames to score, None means that tokens of all transaction_typenames will be used
class_ids (list of str) – class_ids to score, None means that tokens of all class_ids will be used
dictionary (str or reference to Dictionary object) – BigARTM collection dictionary; strongly recommended, so that zero counters are replaced correctly
config (protobuf object) – the low-level config of this score
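
For orientation, perplexity is commonly defined as exp(-(1/n) Σ_d Σ_w n_dw ln p(w|d)), where p(w|d) = Σ_t p(w|t) p(t|d) and n is the total token count. The sketch below evaluates that formula in plain Python on toy data. It is not BigARTM's implementation: the function name and the dict-based layout are assumptions, and tokens with p(w|d) = 0 are simply skipped here, whereas the library relies on the dictionary to replace such zero counters.

```python
import math


def perplexity(counts, phi, theta):
    """counts[d][w] = n_dw, phi[w][t] = p(w|t), theta[d][t] = p(t|d)."""
    log_likelihood = 0.0
    n = 0
    for d, doc_counts in counts.items():
        for w, n_dw in doc_counts.items():
            # p(w|d) as a mixture over topics
            p_wd = sum(phi[w][t] * theta[d][t] for t in theta[d])
            if p_wd <= 0.0:
                continue  # simplification: zero-probability tokens are skipped
            log_likelihood += n_dw * math.log(p_wd)
            n += n_dw
    return math.exp(-log_likelihood / n) if n else float("inf")


# Toy data: four equally likely tokens, so perplexity should be about 4.
tokens = ["a", "b", "c", "d"]
phi = {w: {"t0": 0.25, "t1": 0.25} for w in tokens}
theta = {"d0": {"t0": 0.5, "t1": 0.5}}
counts = {"d0": {w: 3 for w in tokens}}
print(perplexity(counts, phi, theta))
```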

class artm.SparsityThetaScore(name=None, topic_names=None, eps=None, config=None)

__init__(name=None, topic_names=None, eps=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
topic_names (list of str or str or None) – list of names or single name of topic to score, will score all topics if empty or None
eps (float) – the tolerance constant; every value < eps is considered to be zero
config (protobuf object) – the low-level config of this score

class artm.ThetaSnippetScore(name=None, item_ids=None, num_items=None, config=None)

__init__(name=None, item_ids=None, num_items=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
item_ids (list of int) – list of ids of the items to show, default=None
num_items (int) – number of theta vectors to show from the beginning (ignored if item_ids is given)
config (protobuf object) – the low-level config of this score

class artm.TopicKernelScore(name=None, class_id=None, topic_names=None, eps=None, dictionary=None, probability_mass_threshold=None, config=None)

__init__(name=None, class_id=None, topic_names=None, eps=None, dictionary=None, probability_mass_threshold=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
class_id (str) – class_id to score
topic_names (list of str or str or None) – list of names or single name of topic to score, will score all topics if empty or None
probability_mass_threshold (float) – the threshold on p(t|w) for a token to be included in the topic kernel; should be in (0, 1)
dictionary (str or reference to Dictionary object) – BigARTM collection dictionary; no dictionary is used if not specified
eps (float) – the tolerance constant; every value < eps is considered to be zero
config (protobuf object) – the low-level config of this score
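
The sketch below illustrates how probability_mass_threshold could select kernel tokens: p(t|w) is obtained from p(w|t) and p(t) by Bayes' rule, and a token joins the kernel of topic t when p(t|w) reaches the threshold. This is plain Python for illustration only; the function name, the data layout, and the use of `>=` rather than `>` are assumptions, not BigARTM's implementation.

```python
def topic_kernels(phi, topic_probs, threshold):
    """phi[w][t] = p(w|t), topic_probs[t] = p(t). Returns {topic: [tokens]}."""
    kernels = {t: [] for t in topic_probs}
    for w, row in phi.items():
        p_w = sum(row[t] * topic_probs[t] for t in topic_probs)  # p(w)
        if p_w == 0.0:
            continue  # token never occurs under this model
        for t in topic_probs:
            p_tw = row[t] * topic_probs[t] / p_w  # Bayes' rule: p(t|w)
            if p_tw >= threshold:
                kernels[t].append(w)
    return kernels


# Toy data (hypothetical): three tokens, two equally likely topics.
phi = {
    "cat":  {"t0": 0.6, "t1": 0.0},
    "dog":  {"t0": 0.3, "t1": 0.1},
    "bond": {"t0": 0.1, "t1": 0.9},
}
topic_probs = {"t0": 0.5, "t1": 0.5}
print(topic_kernels(phi, topic_probs, threshold=0.7))
```

Raising the threshold shrinks every kernel, since fewer tokens are concentrated enough in a single topic.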

class artm.TopTokensScore(name=None, class_id=None, topic_names=None, num_tokens=None, dictionary=None, config=None)

__init__(name=None, class_id=None, topic_names=None, num_tokens=None, dictionary=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
class_id (str) – class_id to score
topic_names (list of str or str or None) – list of names or single name of topic to score, will score all topics if empty or None
num_tokens (int) – number of tokens with max probability in each topic
dictionary (str or reference to Dictionary object) – BigARTM collection dictionary; no dictionary is used if not specified
config (protobuf object) – the low-level config of this score
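
To make the meaning of num_tokens concrete, the sketch below picks the highest-probability tokens of each topic from a toy p(w|t) table. It is a plain-Python illustration under assumed names and data layout (`top_tokens`, `phi[t][w]`), not the library's code.

```python
def top_tokens(phi, num_tokens):
    """phi[t][w] = p(w|t). Returns {topic: [(token, probability), ...]}."""
    return {
        # sort tokens of each topic by descending probability, keep the first few
        t: sorted(row.items(), key=lambda kv: kv[1], reverse=True)[:num_tokens]
        for t, row in phi.items()
    }


# Toy data (hypothetical): two topics over three tokens.
phi = {
    "t0": {"cat": 0.6, "dog": 0.3, "bond": 0.1},
    "t1": {"cat": 0.0, "dog": 0.1, "bond": 0.9},
}
for topic, tokens in top_tokens(phi, num_tokens=2).items():
    print(topic, tokens)
```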

class artm.TopicMassPhiScore(name=None, class_ids=None, topic_names=None, model_name=None, eps=None, config=None)

__init__(name=None, class_ids=None, topic_names=None, model_name=None, eps=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
class_ids (list of str) – class_ids to score, None means that tokens of all class_ids will be used
topic_names (list of str or str or None) – list of names or single name of topic to score, will score all topics if empty or None
model_name – phi-like matrix to be scored (typically ‘pwt’ or ‘nwt’), ‘pwt’ if not specified
eps (float) – the tolerance constant; every value < eps is considered to be zero
config (protobuf object) – the low-level config of this score

class artm.ClassPrecisionScore(name=None, config=None)

__init__(name=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
config (protobuf object) – the low-level config of this score

class artm.BackgroundTokensRatioScore(name=None, class_id=None, delta_threshold=None, save_tokens=None, direct_kl=None, config=None)

__init__(name=None, class_id=None, delta_threshold=None, save_tokens=None, direct_kl=None, config=None)

Parameters:

name (str) – the identifier of score, will be auto-generated if not specified
class_id (str) – class_id to score
delta_threshold (float) – the threshold on the KL divergence between p(t|w) and p(t) for a token to be counted as background; should be non-negative
save_tokens (bool) – whether to save background tokens; they are saved if the field is not specified
direct_kl (bool) – whether to use KL(p(t) || p(t|w)) or vice versa; true if the field is not specified
config (protobuf object) – the low-level config of this score
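
A minimal usage sketch follows, assuming a model object that exposes a `scores.add(...)` method (as BigARTM's `artm.ARTM` model is understood to do; that method is not documented on this page) and an already built dictionary. The helper name `attach_scores`, the score names, and the parameter values are illustrative choices; only the constructor arguments come from the signatures above.

```python
def attach_scores(model, dictionary):
    """Add a few scores to an existing model and return the same model."""
    import artm  # imported lazily; requires BigARTM to be installed when called

    # Assumption: `model.scores.add` registers a score on the model.
    model.scores.add(artm.SparsityPhiScore(name="sparsity_phi", eps=1e-5))
    model.scores.add(artm.PerplexityScore(name="perplexity", dictionary=dictionary))
    model.scores.add(artm.TopTokensScore(name="top_tokens", num_tokens=10))
    return model
```

Giving each score an explicit name, rather than relying on the auto-generated one, makes it easier to look the score up later.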