esc.configuration

SimilarityConfig

case class SimilarityConfig(normOrgLegalformWeight: Double = 0.25, normOrgCountryWeight: Double = 0.5, nameElementSimilarityForHit: Double = 0.9, matchSelectionMode: Int = 0, checkDateForSearchHit: Boolean = true, dateComparisonMethod: Int = 0, maxDateYearDifferenceForHit: Int = 2, checkCountryForSearchHit: Boolean = true, similarityValueForSearchHit: Double = 0.9, numberOfHitsForSearchHit: Int = 2, maxNumberOfCandidatesFromSearch: Int = 10000, searchEntityGroupMode: Int = 0, allowOneLetterAbbreviation: Boolean = false, oneLetterAbbreviationWeight: Double = 0.5, checkCountyForAdressSearch: Boolean = true, numberOfHitsForAddressSearchHit: Int = 2, fuzzyScoreForAddressSearch: Double = 0.8) extends Product with Serializable

Configuration for normalization and similarity comparison. Important: use the same configuration for indexing and for searching/comparing. Otherwise there may be unwanted side effects.
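As a minimal sketch of the advice above, the following creates one configuration and passes the same instance to both phases. The `SimilarityConfig` defined here is a local stand-in with only a subset of the documented fields, so the snippet compiles without the library; `buildIndex` and `search` are hypothetical placeholders and not part of the documented API.

```scala
object SharedConfigExample {
  // Local stand-in mirroring a subset of the documented constructor and its defaults.
  final case class SimilarityConfig(
    normOrgLegalformWeight: Double = 0.25,
    normOrgCountryWeight: Double = 0.5,
    nameElementSimilarityForHit: Double = 0.9,
    allowOneLetterAbbreviation: Boolean = false
  )

  // Hypothetical placeholders for the indexing and searching steps.
  def buildIndex(cfg: SimilarityConfig): String = s"index built with $cfg"
  def search(cfg: SimilarityConfig): String = s"search run with $cfg"

  // One instance, created once and reused for both phases.
  val config: SimilarityConfig = SimilarityConfig(allowOneLetterAbbreviation = true)

  def main(args: Array[String]): Unit = {
    println(buildIndex(config))
    println(search(config))
  }
}
```

Holding the configuration in a single value makes it harder for the indexing and searching code paths to drift apart.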

normOrgLegalformWeight

Weight (reduction) of a legal form match (recommended: < 1, default is 0.25).

normOrgCountryWeight

Weight (reduction) of a country match (recommended: < 1, default is 0.5).

nameElementSimilarityForHit

Minimum similarity to mark as hit. Default is 0.9.

matchSelectionMode

Determines how a match is selected: 0 = based on similarity, 1 = based on nofHits (number of hits). Default is 0.

checkDateForSearchHit

Defines whether the date should be taken into account or not. Default is true.

maxDateYearDifferenceForHit

Defines the tolerance of the year comparison, in number of years (+/-). Default is 2.

checkCountryForSearchHit

Defines whether the country should be considered or not. Default is true.

similarityValueForSearchHit

Value of the similarity from which the comparison is classified as a hit. Default is 0.9.

numberOfHitsForSearchHit

Value of the nofHits (number of hits) from which the comparison is classified as a hit. Default is 2.

maxNumberOfCandidatesFromSearch

Defines the maximum number of candidates to be considered by the IR search, from which hits are then determined. Default is 10000.

searchEntityGroupMode

Defines the field by which the hits are grouped. Choose whichever value is unique: 0 = externalId, 1 = Id. Default is 0.

allowOneLetterAbbreviation

Defines whether one-letter abbreviations are taken into account. If true, for example, Benjamin is a hit with B. Default is false.

oneLetterAbbreviationWeight

If abbreviations are taken into account, this value defines the weight (reduction) of such a hit. Default is 0.5.

checkCountyForAdressSearch

Defines whether the country should be considered in address search. Countries overrule stop or hit words. Default is true.

numberOfHitsForAddressSearchHit

Minimum number of elements that must be found for the address to be considered a hit. Default is 2.

fuzzyScoreForAddressSearch

Value of the fuzziness to identify individual elements of an address as hits. Value between 0.1 and 1. Default is 0.8.
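To illustrate how matchSelectionMode might interact with the two search-hit thresholds, here is a small sketch. It is not the library's implementation: `Thresholds` and `isHit` are hypothetical names, and reading "from which" as an inclusive lower bound (>=) is an assumption.

```scala
object MatchSelectionSketch {
  // Stand-in carrying only the three documented fields this sketch reads.
  final case class Thresholds(
    matchSelectionMode: Int = 0,
    similarityValueForSearchHit: Double = 0.9,
    numberOfHitsForSearchHit: Int = 2
  )

  // Assumption: a comparison is a hit when the selected measure reaches its threshold.
  def isHit(similarity: Double, nofHits: Int, t: Thresholds): Boolean =
    t.matchSelectionMode match {
      case 0 => similarity >= t.similarityValueForSearchHit // mode 0: based on similarity
      case 1 => nofHits >= t.numberOfHitsForSearchHit       // mode 1: based on number of hits
      case other =>
        throw new IllegalArgumentException(s"Unknown matchSelectionMode: $other")
    }
}
```

Under these assumptions, a comparison with similarity 0.95 and a single hit counts as a match in mode 0 but not in mode 1.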

Linear Supertypes
Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new SimilarityConfig(normOrgLegalformWeight: Double = 0.25, normOrgCountryWeight: Double = 0.5, nameElementSimilarityForHit: Double = 0.9, matchSelectionMode: Int = 0, checkDateForSearchHit: Boolean = true, dateComparisonMethod: Int = 0, maxDateYearDifferenceForHit: Int = 2, checkCountryForSearchHit: Boolean = true, similarityValueForSearchHit: Double = 0.9, numberOfHitsForSearchHit: Int = 2, maxNumberOfCandidatesFromSearch: Int = 10000, searchEntityGroupMode: Int = 0, allowOneLetterAbbreviation: Boolean = false, oneLetterAbbreviationWeight: Double = 0.5, checkCountyForAdressSearch: Boolean = true, numberOfHitsForAddressSearchHit: Int = 2, fuzzyScoreForAddressSearch: Double = 0.8)

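Because every constructor parameter has a default, only the values that differ need to be passed. The sketch below shows named arguments and `copy`, which a Scala case class provides. The `SimilarityConfig` here is a local stand-in with a subset of the documented parameters so that the snippet is self-contained, and `fuzzyScoreInRange` is a hypothetical helper, not part of the documented API.

```scala
object ConstructorUsageSketch {
  // Stand-in with a subset of the documented parameters and their defaults.
  final case class SimilarityConfig(
    matchSelectionMode: Int = 0,
    maxDateYearDifferenceForHit: Int = 2,
    allowOneLetterAbbreviation: Boolean = false,
    oneLetterAbbreviationWeight: Double = 0.5,
    fuzzyScoreForAddressSearch: Double = 0.8
  )

  // Named arguments override only what differs from the defaults.
  val strictDates: SimilarityConfig = SimilarityConfig(maxDateYearDifferenceForHit = 0)

  // copy derives a variant without mutating the original instance.
  val withAbbreviations: SimilarityConfig =
    strictDates.copy(allowOneLetterAbbreviation = true)

  // Hypothetical helper: checks the documented range (0.1 to 1) for fuzzyScoreForAddressSearch.
  def fuzzyScoreInRange(cfg: SimilarityConfig): Boolean =
    cfg.fuzzyScoreForAddressSearch >= 0.1 && cfg.fuzzyScoreForAddressSearch <= 1.0
}
```

Whether the real class validates value ranges at construction time is not stated here, so a check like the one above may be worth running before a configuration is used.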

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val allowOneLetterAbbreviation: Boolean
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. val checkCountryForSearchHit: Boolean
  7. val checkCountyForAdressSearch: Boolean
  8. val checkDateForSearchHit: Boolean
  9. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @HotSpotIntrinsicCandidate() @native()
  10. val dateComparisonMethod: Int
  11. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  12. val fuzzyScoreForAddressSearch: Double
  13. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @HotSpotIntrinsicCandidate() @native()
  14. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  15. val matchSelectionMode: Int
  16. val maxDateYearDifferenceForHit: Int
  17. val maxNumberOfCandidatesFromSearch: Int
  18. val nameElementSimilarityForHit: Double
  19. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  20. val normOrgCountryWeight: Double
  21. val normOrgLegalformWeight: Double
  22. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @HotSpotIntrinsicCandidate() @native()
  23. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @HotSpotIntrinsicCandidate() @native()
  24. val numberOfHitsForAddressSearchHit: Int
  25. val numberOfHitsForSearchHit: Int
  26. val oneLetterAbbreviationWeight: Double
  27. def productElementNames: Iterator[String]
    Definition Classes
    Product
  28. val searchEntityGroupMode: Int
  29. val similarityValueForSearchHit: Double
  30. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  31. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  32. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  33. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)
