esc.configuration

Members list

Type members

Classlikes

case class SimilarityConfig(normOrgLegalformWeight: Double, normOrgCountryWeight: Double, nameElementSimilarityForHit: Double, matchSelectionMode: Int, checkDateForSearchHit: Boolean, dateComparisonMethod: Int, maxDateYearDifferenceForHit: Int, checkCountryForSearchHit: Boolean, similarityValueForSearchHit: Double, numberOfHitsForSearchHit: Int, maxNumberOfCandidatesFromSearch: Int, searchEntityGroupMode: Int, allowOneLetterAbbreviation: Boolean, oneLetterAbbreviationWeight: Double, checkCountyForAdressSearch: Boolean, numberOfHitsForAddressSearchHit: Int, fuzzyScoreForAddressSearch: Double)

Configuration class for normalization and similarity computation. Important: use the same configuration for indexing and for searching/comparing; otherwise there may be unwanted side effects.
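As a minimal sketch of the "same configuration for indexing and searching" advice, the snippet below builds one configuration value and reuses it for both phases. The case class here is a local stand-in that mirrors the documented constructor and defaults so the example is self-contained; it is not the library class. The names `sharedConfig`, `indexingConfig` and `searchingConfig` are hypothetical placeholders for wherever your indexing and search code receive their configuration.

```scala
// Local stand-in mirroring the documented parameters and defaults.
// NOT the library class: with the library on the classpath you would use
// esc.configuration.SimilarityConfig instead of redefining it.
final case class SimilarityConfig(
  normOrgLegalformWeight: Double = 0.25,
  normOrgCountryWeight: Double = 0.5,
  nameElementSimilarityForHit: Double = 0.9,
  matchSelectionMode: Int = 0,
  checkDateForSearchHit: Boolean = true,
  dateComparisonMethod: Int = 0,
  maxDateYearDifferenceForHit: Int = 2,
  checkCountryForSearchHit: Boolean = true,
  similarityValueForSearchHit: Double = 0.9,
  numberOfHitsForSearchHit: Int = 2,
  maxNumberOfCandidatesFromSearch: Int = 10000,
  searchEntityGroupMode: Int = 0,
  allowOneLetterAbbreviation: Boolean = false,
  oneLetterAbbreviationWeight: Double = 0.5,
  checkCountyForAdressSearch: Boolean = true,
  numberOfHitsForAddressSearchHit: Int = 2,
  fuzzyScoreForAddressSearch: Double = 0.8
)

// Build the configuration once, overriding only what differs from the defaults.
val sharedConfig: SimilarityConfig =
  SimilarityConfig().copy(
    allowOneLetterAbbreviation = true,
    maxDateYearDifferenceForHit = 1
  )

// Hand the very same value to both phases (hypothetical placeholders).
val indexingConfig: SimilarityConfig = sharedConfig
val searchingConfig: SimilarityConfig = sharedConfig
```

Because case classes compare by value, a check such as `indexingConfig == searchingConfig` can serve as a cheap guard before searching an index that was built earlier.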


Value parameters

dateComparisonMethod

Selects which parts of a date are compared. Currently only 0 = year is supported. Default is 0.

allowOneLetterAbbreviation

Defines whether one-letter abbreviations are taken into account. If true, for example, B counts as a hit for Benjamin. Default is false.

checkCountryForSearchHit

Defines whether the country should be considered or not. Default is true.

checkCountyForAdressSearch

Defines whether the country should be considered in the address search. Countries overrule stop words and hit words. Default is true.

checkDateForSearchHit

Defines whether the date should be taken into account or not. Default is true.

fuzzyScoreForAddressSearch

Fuzziness score used to identify individual elements of an address as hits. Value between 0.1 and 1. Default is 0.8.

matchSelectionMode

Method by which a match is determined: 0 = based on similarity, 1 = based on nofHits (number of hits). Default is 0.

maxDateYearDifferenceForHit

Defines the tolerance of the year comparison, in years (+/-). Default is 2.

maxNumberOfCandidatesFromSearch

Defines the maximum number of candidates returned by the IR search, from which hits are then determined. Default is 10000.

nameElementSimilarityForHit

Minimum similarity to mark as hit. Default is 0.9.

normOrgCountryWeight

Weight (reduction) of a country match (recommended: < 1, default is 0.5).

normOrgLegalformWeight

Weight (reduction) of a legal form match (recommended: < 1, default is 0.25).

numberOfHitsForAddressSearchHit

Minimum number of elements that must be found for the address to be considered a hit. Default is 2.

numberOfHitsForSearchHit

Value of the nofHits (number of hits) from which the comparison is classified as a hit. Default is 2.

oneLetterAbbreviationWeight

If abbreviations are taken into account, this value defines the weight (reduction) of such a hit. Default is 0.5.

searchEntityGroupMode

Defines the field by which hits are grouped. Which one to use depends on which value is unique: 0 = externalId, 1 = Id. Default is 0.

similarityValueForSearchHit

Value of the similarity from which the comparison is classified as a hit. Default is 0.9.
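The interplay of matchSelectionMode, the two hit thresholds and the date tolerance can be illustrated with a small sketch. This is an assumed reading of the parameter descriptions above, not the library's actual implementation: `HitSettings`, `yearsWithinTolerance` and `isHit` are hypothetical names, thresholds are treated as inclusive, and the date check is assumed to be an additional condition on top of the selected threshold.

```scala
// Hypothetical, trimmed-down settings holder with only the fields used below.
// Defaults follow the documented defaults of the corresponding parameters.
final case class HitSettings(
  matchSelectionMode: Int = 0,
  similarityValueForSearchHit: Double = 0.9,
  numberOfHitsForSearchHit: Int = 2,
  checkDateForSearchHit: Boolean = true,
  maxDateYearDifferenceForHit: Int = 2
)

// Assumption: two years match when they differ by at most the tolerance (+/-).
def yearsWithinTolerance(queryYear: Int, candidateYear: Int, tolerance: Int): Boolean =
  math.abs(queryYear - candidateYear) <= tolerance

// Assumption: a candidate is a hit when the threshold selected by
// matchSelectionMode is reached and, if enabled, the years are close enough.
def isHit(
  settings: HitSettings,
  similarity: Double,
  nofHits: Int,
  queryYear: Int,
  candidateYear: Int
): Boolean = {
  val thresholdReached = settings.matchSelectionMode match {
    case 0 => similarity >= settings.similarityValueForSearchHit // based on similarity
    case 1 => nofHits >= settings.numberOfHitsForSearchHit       // based on nofHits
    case other =>
      throw new IllegalArgumentException(s"Unsupported matchSelectionMode: $other")
  }
  val dateOk =
    !settings.checkDateForSearchHit ||
      yearsWithinTolerance(queryYear, candidateYear, settings.maxDateYearDifferenceForHit)
  thresholdReached && dateOk
}
```

Under these assumptions, `isHit(HitSettings(), 0.95, 0, 1980, 1982)` is a hit with the default settings, while the same call with a candidate year of 1983 is rejected because the years differ by more than 2.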

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Sugar object for creating a SimilarityConfig from Java. In Scala, "new SimilarityConfig()" is exactly equivalent.


Attributes

Supertypes
class Object
trait Matchable
class Any
Self type