Class IntersectionSimilarity<T>
- java.lang.Object
-
- org.apache.commons.text.similarity.IntersectionSimilarity<T>
-
- Type Parameters:
T
- the type of the elements extracted from the character sequence
- All Implemented Interfaces:
SimilarityScore<IntersectionResult>
public class IntersectionSimilarity<T> extends java.lang.Object implements SimilarityScore<IntersectionResult>
Measures the intersection of two sets created from a pair of character sequences.
It is assumed that the type T correctly conforms to the requirements for storage within a Set or HashMap. Ideally the type is immutable and implements Object.equals(Object) and Object.hashCode().
- Since:
- 1.7
- See Also:
Set, HashMap
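As a minimal sketch of an element type that meets the storage requirements described above, the hypothetical Bigram class below is immutable and overrides both equals and hashCode. The class name and its fields are assumptions made for illustration; they are not part of the library.

```java
/** Hypothetical element type: an immutable pair of characters. */
final class Bigram {
    private final char first;
    private final char second;

    Bigram(char first, char second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Bigram)) {
            return false;
        }
        Bigram other = (Bigram) obj;
        // Two bigrams are equal when both characters match
        return first == other.first && second == other.second;
    }

    @Override
    public int hashCode() {
        // Consistent with equals: equal bigrams produce the same hash
        return 31 * first + second;
    }
}
```

Because equal instances hash identically, two Bigram objects built from the same characters collapse to a single entry when stored in a HashSet or used as a HashMap key.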
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
private static class    IntersectionSimilarity.BagCount    Mutable counter class for storing the count of elements.
private class    IntersectionSimilarity.TinyBag    A minimal implementation of a Bag that can store elements and a count.
-
Constructor Summary
Constructors
Constructor    Description
IntersectionSimilarity(java.util.function.Function<java.lang.CharSequence,java.util.Collection<T>> converter)    Create a new intersection similarity using the provided converter.
-
Method Summary
Modifier and Type    Method    Description
IntersectionResult    apply(java.lang.CharSequence left, java.lang.CharSequence right)    Calculates the intersection of two character sequences passed as input.
private static <T> int    getIntersection(java.util.Set<T> setA, java.util.Set<T> setB)    Compute the intersection between two sets.
private int    getIntersection(IntersectionSimilarity.TinyBag bagA, IntersectionSimilarity.TinyBag bagB)    Compute the intersection between two bags.
private IntersectionSimilarity.TinyBag    toBag(java.util.Collection<T> objects)    Convert the collection to a bag.
-
-
-
Field Detail
-
converter
private final java.util.function.Function<java.lang.CharSequence,java.util.Collection<T>> converter
The converter used to create the elements from the characters.
-
-
Constructor Detail
-
IntersectionSimilarity
public IntersectionSimilarity(java.util.function.Function<java.lang.CharSequence,java.util.Collection<T>> converter)
Create a new intersection similarity using the provided converter.
If the converter returns a Set then the intersection result will not include duplicates. Any other Collection is used to produce a result that will include duplicates in the intersect and union.
- Parameters:
converter
- the converter used to create the elements from the characters
- Throws:
java.lang.IllegalArgumentException
- if the converter is null
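The difference between a Set-returning and a non-Set converter can be sketched with two hypothetical converter functions. The class name ConverterSketch and the constants TO_SET and TO_LIST are assumptions for illustration only; both simply split a character sequence into its individual characters.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class ConverterSketch {
    /** Hypothetical converter returning a Set: repeated characters are dropped. */
    static final Function<CharSequence, Collection<Character>> TO_SET = cs -> {
        Set<Character> set = new LinkedHashSet<>();
        for (int i = 0; i < cs.length(); i++) {
            set.add(cs.charAt(i));
        }
        return set;
    };

    /** Hypothetical converter returning a List: repeated characters are kept. */
    static final Function<CharSequence, Collection<Character>> TO_LIST = cs -> {
        List<Character> list = new ArrayList<>();
        for (int i = 0; i < cs.length(); i++) {
            list.add(cs.charAt(i));
        }
        return list;
    };
}
```

For the input "aab", TO_SET yields two elements while TO_LIST yields three. Either function has the shape the constructor expects, so it could be passed as new IntersectionSimilarity<>(ConverterSketch.TO_SET) or new IntersectionSimilarity<>(ConverterSketch.TO_LIST), depending on whether duplicates should count.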
-
-
Method Detail
-
apply
public IntersectionResult apply(java.lang.CharSequence left, java.lang.CharSequence right)
Calculates the intersection of two character sequences passed as input.
- Specified by:
apply
in interface SimilarityScore<IntersectionResult>
- Parameters:
left
- first character sequence
right
- second character sequence
- Returns:
- The intersection result
- Throws:
java.lang.IllegalArgumentException
- if either input sequence is null
-
toBag
private IntersectionSimilarity.TinyBag toBag(java.util.Collection<T> objects)
Convert the collection to a bag. The bag will contain the count of each element in the collection.
- Parameters:
objects
- the objects
- Returns:
- The bag
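The counting step can be sketched as follows. This is not the library's private implementation: a plain HashMap stands in for TinyBag, and the class name BagSketch is an assumption for illustration.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class BagSketch {
    /**
     * Counts how many times each element occurs in the collection.
     * A HashMap is used here in place of the private TinyBag class.
     */
    static <T> Map<T, Integer> toBag(Collection<T> objects) {
        Map<T, Integer> counts = new HashMap<>();
        for (T object : objects) {
            // Start at 1 for a new element, otherwise add 1 to the existing count
            counts.merge(object, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the collection ["a", "b", "a"] this sketch produces a bag with the count 2 for "a" and 1 for "b".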
-
getIntersection
private static <T> int getIntersection(java.util.Set<T> setA, java.util.Set<T> setB)
Compute the intersection between two sets. This is the count of all the elements that are within both sets.
- Type Parameters:
T
- the type of the elements in the set
- Parameters:
setA
- the set A
setB
- the set B
- Returns:
- The intersection
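A minimal sketch of this count is shown below. The class name SetIntersectionSketch is hypothetical, and iterating over the smaller set is a choice made for this sketch rather than a documented property of the private method.

```java
import java.util.Set;

class SetIntersectionSketch {
    /** Counts the elements that are present in both sets. */
    static <T> int intersection(Set<T> setA, Set<T> setB) {
        // Iterate the smaller set so fewer lookups are needed (a choice made for this sketch)
        Set<T> smaller = setA;
        Set<T> larger = setB;
        if (setA.size() > setB.size()) {
            smaller = setB;
            larger = setA;
        }
        int count = 0;
        for (T element : smaller) {
            if (larger.contains(element)) {
                count++;
            }
        }
        return count;
    }
}
```

For the sets {a, b, c} and {b, c, d} the shared elements are b and c, so the sketch returns 2.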
-
getIntersection
private int getIntersection(IntersectionSimilarity.TinyBag bagA, IntersectionSimilarity.TinyBag bagB)
Compute the intersection between two bags. This is the sum of the minimum count of each element that is within both bags.
- Parameters:
bagA
- the bag A
bagB
- the bag B
- Returns:
- The intersection
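The sum of minimum counts can be sketched as below, with each bag represented as a Map from element to count. Both the Map representation and the class name BagIntersectionSketch are assumptions for illustration; the library's TinyBag is private.

```java
import java.util.Map;

class BagIntersectionSketch {
    /** Sums, over the elements present in both bags, the smaller of the two counts. */
    static <T> int intersection(Map<T, Integer> bagA, Map<T, Integer> bagB) {
        int total = 0;
        for (Map.Entry<T, Integer> entry : bagA.entrySet()) {
            Integer countB = bagB.get(entry.getKey());
            if (countB != null) {
                // The element is shared, so it contributes its minimum count
                total += Math.min(entry.getValue(), countB);
            }
        }
        return total;
    }
}
```

As a worked example, take bag A = {a: 2, b: 1} and bag B = {a: 1, b: 2, c: 1}. Element a contributes min(2, 1) = 1, element b contributes min(1, 2) = 1, and c is absent from A, so the intersection is 2.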
-
-