Feature metrics summary

This group of metrics measures properties of the information diffusion in terms of the features of the users and the information pieces.

External feature Gini complement
External feature rate
External feature recall
Feature entropy
Feature Gini complement
Feature KL divergence
Feature recall

External feature rate

This metric considers the proportion of the received features which are unknown to the users. We differentiate two cases:

Individual: we consider the proportion of the received features which are unknown to the each user, and then, we average over the set of people in the network.
Global: we globally count the number of received feature unknown to the users.

In order to determine which features are unknown to the users, we do the following: * Information features: we consider that a feature is unknown to the user if it does not appear in any of the information pieces created by the user. * User features: we consider that a feature is unknown to the user if the user does not have that feature.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.
unique: true if we just count each time a user receives a piece once, false otherwise.

Configuration file

For the individual feature rate, we have the following configuration file:

Individual external feature rate:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

and, for the global version:

Global external feature rate:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

External feature recall

For each user in the network, this metric computes the proportion of unknown features in the network which have been received. Then, this values are averaged to obtain the metric.

In order to determine which features are unknown to the users, we do the following: * Information features: we consider that a feature is unknown to the user if it does not appear in any of the information pieces created by the user. * User features: we consider that a feature is unknown to the user if the user does not have that feature.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.

Configuration file

External feature recall:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false

External feature Gini complement

The external feature Gini complement measures how balanced is the number of times that each feature has been propagated. We also count the cases where the feature was unknown to the receiving user. We differentiate two variants:

Individual: we measure the number of times each user has received each feature, and find the Gini complement of that distribution. The final result is the average over all the users in the network.
Global: for each feature, we count the number of times that it has reached a user in the network. Then, the Gini complement is computed over that distribution.

In order to determine which features are unknown to the users, we do the following: * Information features: we consider that a feature is unknown to the user if it does not appear in any of the information pieces created by the user. * User features: we consider that a feature is unknown to the user if the user does not have that feature.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.
unique: true if we just count each time a user receives a piece once, false otherwise.

Configuration file

For the individual version, the configuration file is the following:

Individual external feature Gini complement:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

whereas, for the global version, it is:

Global external feature Gini complement:

feature:
type: string value: feature_name

userFeature:
type: boolean value: true/false

unique:
type: boolean value: true/false

Feature entropy

The feature entropy measures the entropy of the distribution that counts the times that each feature has been propagated. We differentiate two variants:

Individual: we measure the number of times each user has received each feature, and find the entropy of that distribution. The final result is the average over all the users in the network.
Global: for each creator, we count the number of times that each feature has reached a user in the network. Then, the entropy is computed over that distribution.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.
unique: true if we just count each time a user receives a piece once, false otherwise.

Configuration file

For the individual version, the configuration file is the following:

Individual feature entropy:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

whereas, for the global version, it is:

Global feature entropy:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

Feature Gini complement

The feature Gini complement measures how balanced is the number of times that each feature has been propagated. We differentiate two variants:

Individual: we measure the number of times each user has received each feature, and find the Gini complement of that distribution (independently for each user). The final result is the average over all the users in the network.
Global: for each feature, we count the number of times that it has reached a user in the network. Then, the Gini complement is computed over that distribution.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.
unique: true if we just count each time a user receives a piece once, false otherwise.

Configuration file

For the individual version, the configuration file is the following:

Individual feature Gini complement:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

whereas, for the global version, it is:

Global feature Gini complement:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

Feature KL divergence

This metric uses the Kullback-Leibler divergence to compare a) the distribution of times each feature has been propagated during the simulation (the P distribution) and b) an approximation of the actual distribution of the features, estimated by counting the number of pieces created by each user and containing the corresponding features (the Q distribution):

\[KLD = - \sum_{f \in F} P(f) \log \left( \frac{P(f)}{Q(f)}\right)\]

We provide two different options:

Individual: we measure the KL-divergence for each user separately, and then we average.
Global: we measure the distributions for the whole network.

The metric measures the information gain achieved by the estimated distribution.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.
unique: true if we just count each time a user receives a piece once, false otherwise.

Configuration file

For the individual version, the configuration file is the following:

Individual feature KLD:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

whereas, for the global version, it is:

Global feature KLD:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false

Feature recall

For each user in the network, this metric computes the proportion of features in the network which have been received. Then, this values are averaged to obtain the metric.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.

Configuration file

Feature recall:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false

Feature user entropy

For each possible value of a feature, this metric counts the number of users who have received it, and computes the entropy of the distribution.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.

Configuration file

Feature user entropy:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false

Feature user Gini complement

For each possible value of a feature, this metric counts the number of users who have received it, and computes the Gini complement of the distribution, to determine how balanced the distribution is.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.

Configuration file

Feature user Gini complement:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false

User-feature count

This metric measures the number of different (user,feature) pairs which are observed during the diffusion. The (user, feature) pair is only observed if the user receives a content with the corresponding feature.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.

Configuration file

User-feature count:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false

User-feature Gini complement

This metric measures the number of times that different (user,feature) pairs which are observed during the diffusion. The (user, feature) pair is only observed if the user receives a content with the corresponding feature. Then, this metric determines how balanced the distribution over those pairs is.

Parameters

Both metrics receive the same parameters:

feature: the name of the features we are going to use.
userFeature: true if the studied features are user features, false if they are information pieces features.
unique: true if we just count each time a user receives a piece once, false otherwise.

Configuration file

User-feature Gini complement:
  feature:
    type: string
    value: feature_name
  userFeature:
    type: boolean
    value: true/false
  unique:
    type: boolean
    value: true/false