Class NormalizedModifiedPurity<V>

java.lang.Object
org.nlpub.watset.eval.NormalizedModifiedPurity<V>
Type Parameters:
V - the type of cluster elements
Direct Known Subclasses:
CachedNormalizedModifiedPurity

public class NormalizedModifiedPurity<V> extends Object
Normalized modified purity evaluation measure for overlapping clustering.

Clusters are represented as maps keyed by their elements, so please be especially careful that the hashCode and equals methods of the elements are implemented consistently.
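The warning about hashCode and equals matters because clusters are passed around as Map<V,Double>, so elements act as map keys. The sketch below is only an illustration using java.util.HashMap; the element types RawWord and Word are hypothetical and are not part of this library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class ElementEqualityDemo {
    /** Hypothetical element type that keeps Object's identity-based equals/hashCode. */
    static final class RawWord {
        final String text;

        RawWord(String text) {
            this.text = text;
        }
    }

    /** Hypothetical element type with value-based equals/hashCode. */
    static final class Word {
        final String text;

        Word(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Word && text.equals(((Word) other).text);
        }

        @Override
        public int hashCode() {
            return Objects.hash(text);
        }
    }

    /** Looks up a stored element using a second, equal-looking RawWord instance. */
    static boolean rawLookupSucceeds() {
        Map<RawWord, Double> cluster = new HashMap<>();
        cluster.put(new RawWord("bank"), 1.0);
        return cluster.containsKey(new RawWord("bank"));
    }

    /** Looks up a stored element using a second, equal-looking Word instance. */
    static boolean valueLookupSucceeds() {
        Map<Word, Double> cluster = new HashMap<>();
        cluster.put(new Word("bank"), 1.0);
        return cluster.containsKey(new Word("bank"));
    }

    public static void main(String[] args) {
        System.out.println(rawLookupSucceeds());   // false: distinct instances never match
        System.out.println(valueLookupSucceeds()); // true: equal values share one key
    }
}
```

With identity-based elements, the same item appearing in a cluster and in a gold standard class would be treated as two unrelated elements, so any overlap between them would be missed.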

  • Constructor Details

    • NormalizedModifiedPurity

      public NormalizedModifiedPurity()
      Construct a normalized modified purity calculator.
    • NormalizedModifiedPurity

      public NormalizedModifiedPurity(boolean normalized, boolean modified)
      Construct a normalized modified purity calculator in which the normalized and modified options can each be turned off.
      Parameters:
      normalized - whether normalized purity is on
      modified - whether modified purity is on
  • Method Details

    • transform

      public static <V> List<Map<V,Double>> transform(List<? extends Collection<V>> clusters)
      Transform a collection of clusters into a collection of weighted cluster elements.
      Type Parameters:
      V - the type of cluster elements
      Parameters:
      clusters - the collection of clusters
      Returns:
      a collection of weighted cluster elements
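As a rough illustration of what turning clusters into weighted cluster elements could look like, here is a minimal self-contained sketch. It assumes that every element simply receives the weight 1.0, which this page does not state; the class TransformSketch and the method toWeighted are hypothetical names and not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TransformSketch {
    /** Hypothetical stand-in for transform: assumes each element gets the weight 1.0. */
    static <V> List<Map<V, Double>> toWeighted(List<? extends Collection<V>> clusters) {
        List<Map<V, Double>> weighted = new ArrayList<>();
        for (Collection<V> cluster : clusters) {
            Map<V, Double> weights = new HashMap<>();
            for (V element : cluster) {
                weights.put(element, 1.0); // assumed unit weight; duplicates collapse into one key
            }
            weighted.add(weights);
        }
        return weighted;
    }

    public static void main(String[] args) {
        List<List<String>> clusters = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("b", "c", "d"));
        List<Map<String, Double>> weighted = toWeighted(clusters);
        System.out.println(weighted.get(0).get("a")); // 1.0
        System.out.println(weighted.get(1).size());   // 3
    }
}
```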
    • normalize

      public static <V> List<Map<V,Double>> normalize(Collection<Map<V,Double>> clusters)
      Normalize weights of the cluster elements to allow using normalized (modified) purity.
      Type Parameters:
      V - the type of cluster elements
      Parameters:
      clusters - the collection of clusters
      Returns:
      a collection of weight-normalized clusters
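One plausible reading of weight normalization for overlapping clusters is that each element's weight is divided by the total weight of that element across all clusters, so that an element shared by several clusters counts once overall. The sketch below implements that reading only; it is an assumption, and NormalizeSketch and normalizeWeights are hypothetical names rather than the library's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NormalizeSketch {
    /** Hypothetical stand-in for normalize: divides each weight by the element's total weight. */
    static <V> List<Map<V, Double>> normalizeWeights(Collection<Map<V, Double>> clusters) {
        // First pass: total weight of every element over all clusters.
        Map<V, Double> totals = new HashMap<>();
        for (Map<V, Double> cluster : clusters) {
            for (Map.Entry<V, Double> entry : cluster.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Double::sum);
            }
        }
        // Second pass: rescale each weight by the element's total.
        List<Map<V, Double>> result = new ArrayList<>();
        for (Map<V, Double> cluster : clusters) {
            Map<V, Double> scaled = new HashMap<>();
            for (Map.Entry<V, Double> entry : cluster.entrySet()) {
                scaled.put(entry.getKey(), entry.getValue() / totals.get(entry.getKey()));
            }
            result.add(scaled);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> first = new HashMap<>();
        first.put("a", 1.0);
        first.put("b", 1.0);
        Map<String, Double> second = new HashMap<>();
        second.put("b", 1.0);
        second.put("c", 1.0);

        List<Map<String, Double>> normalized = normalizeWeights(Arrays.asList(first, second));
        System.out.println(normalized.get(0).get("a")); // 1.0, "a" occurs in one cluster
        System.out.println(normalized.get(0).get("b")); // 0.5, "b" is shared by two clusters
    }
}
```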
    • evaluate

      public static <V> PrecisionRecall evaluate(NormalizedModifiedPurity<V> precision, NormalizedModifiedPurity<V> recall, Collection<Map<V,Double>> clusters, Collection<Map<V,Double>> classes)
      Compute precision and recall using purity and inverse purity, respectively.
      Type Parameters:
      V - the type of cluster elements
      Parameters:
      precision - the purity
      recall - the inverse purity
      clusters - the collection of the clusters to evaluate
      classes - the collection of the gold standard clusters
      Returns:
      precision and recall wrapped in an instance of PrecisionRecall
    • purity

      public double purity(Collection<Map<V,Double>> clusters, Collection<Map<V,Double>> classes)
      Compute the (modified) purity of the given clusters according to the gold standard clustering, classes.
      Parameters:
      clusters - the collection of the clusters to evaluate
      classes - the collection of the gold standard clusters
      Returns:
      (modified) purity
    • score

      public double score(Map<V,Double> cluster, Collection<Map<V,Double>> classes)
      Compute the (modified) cluster score on a defined collection of classes.
      Parameters:
      cluster - the cluster to evaluate
      classes - the collection of the gold standard clusters
      Returns:
      cluster score
    • delta

      public double delta(Map<V,Double> cluster, Map<V,Double> klass)
      Compute the fuzzy overlap between two clusters, cluster and klass.

      In the case of modified purity, singleton clusters are ignored.

      Parameters:
      cluster - the first cluster
      klass - the second cluster
      Returns:
      cluster overlap measure
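Read together, the descriptions of delta, score, and purity suggest a three-level computation: an overlap between one cluster and one class, the best such overlap per cluster, and a ratio over all clusters. The sketch below is one possible interpretation, not the library's implementation. It assumes that the overlap counts shared elements (or sums their cluster weights when normalized), that the score is the maximum overlap, and that purity divides the summed scores by the total cluster size (or total weight). PuritySketch and its unit helper are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PuritySketch<V> {
    private final boolean normalized;
    private final boolean modified;

    PuritySketch(boolean normalized, boolean modified) {
        this.normalized = normalized;
        this.modified = modified;
    }

    /** Assumed overlap: shared elements, counted (plain) or summed by cluster weight (normalized). */
    double delta(Map<V, Double> cluster, Map<V, Double> klass) {
        if (modified && cluster.size() <= 1) {
            return 0.0; // modified purity ignores singleton clusters
        }
        double overlap = 0.0;
        for (Map.Entry<V, Double> entry : cluster.entrySet()) {
            if (klass.containsKey(entry.getKey())) {
                overlap += normalized ? entry.getValue() : 1.0;
            }
        }
        return overlap;
    }

    /** Assumed score: the best overlap of the cluster with any gold standard class. */
    double score(Map<V, Double> cluster, Collection<Map<V, Double>> classes) {
        double best = 0.0;
        for (Map<V, Double> klass : classes) {
            best = Math.max(best, delta(cluster, klass));
        }
        return best;
    }

    /** Assumed purity: summed scores divided by the total cluster size (or total weight). */
    double purity(Collection<Map<V, Double>> clusters, Collection<Map<V, Double>> classes) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (Map<V, Double> cluster : clusters) {
            numerator += score(cluster, classes);
            if (normalized) {
                for (double weight : cluster.values()) {
                    denominator += weight;
                }
            } else {
                denominator += cluster.size();
            }
        }
        return denominator == 0.0 ? 0.0 : numerator / denominator;
    }

    /** Helper for the example: a cluster whose elements all have the weight 1.0. */
    static Map<String, Double> unit(String... elements) {
        Map<String, Double> cluster = new HashMap<>();
        for (String element : elements) {
            cluster.put(element, 1.0);
        }
        return cluster;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> clusters = Arrays.asList(unit("a", "b", "c"), unit("d"));
        List<Map<String, Double>> classes = Arrays.asList(unit("a", "b"), unit("c", "d"));

        PuritySketch<String> plain = new PuritySketch<>(false, false);
        PuritySketch<String> modifiedOnly = new PuritySketch<>(false, true);

        System.out.println(plain.purity(clusters, classes));        // 0.75
        System.out.println(modifiedOnly.purity(clusters, classes)); // 0.5, singleton {d} scores 0
        System.out.println(plain.purity(classes, clusters));        // 0.75, arguments swapped for inverse purity
    }
}
```

In this sketch, swapping the two arguments of purity gives the inverse purity, which is the quantity that evaluate is described as using for recall.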