SimilarityConfig
Configuration class for normalization and similarity comparison. Important: Use the same configuration for indexing and for searching/comparing. Otherwise there may be unwanted side effects.
Value parameters
- DateComparisonMethod
-
Specifies which date parts are compared. Currently only 0 = year is supported. Default is 0.
- allowOneLetterAbbreviation
-
Defines whether one-letter abbreviations are taken into account. If true, for example, Benjamin is a hit for B. Default is false.
- checkCountryForSearchHit
-
Defines whether the country is considered when determining a search hit. Default is true.
- checkCountyForAdressSearch
-
Defines whether the country is considered in the address search. Countries overrule stop or hit words. Default is true.
- checkDateForSearchHit
-
Defines whether the date is taken into account when determining a search hit. Default is true.
- fuzzyScoreForAddressSearch
-
Fuzziness value used to identify individual elements of an address as hits. Must be between 0.1 and 1. Default is 0.8.
- matchSelectionMode
-
Method by which a match is determined: 0 = based on similarity, 1 = based on nofHits (number of hits). Default is 0.
- maxDateYearDifferenceForHit
-
Defines the tolerance of the year comparison, in number of years (+/-). Default is 2.
- maxNumberOfCandidatesFromSearch
-
Defines the maximum number of candidates to be considered by the IR search, from which hits are then determined. Default is 10000.
- nameElementSimilarityForHit
-
Minimum similarity for a name element to be marked as a hit. Default is 0.9.
- normOrgCountryWeight
-
Weight (reduction) of a country match (recommended: < 1, default is 0.5).
- normOrgLegalformWeight
-
Weight (reduction) of a legal form match (recommended: < 1, default is 0.25).
- numberOfHitsForAddressSearchHit
-
Minimum number of elements that must be found for the address to be considered a hit. Default is 2.
- numberOfHitsForSearchHit
-
Minimum nofHits (number of hits) at which the comparison is classified as a hit. Default is 2.
- oneLetterAbbreviationWeight
-
If one-letter abbreviations are taken into account, this value defines the weight (reduction) of such a hit. Default is 0.5.
- searchEntityGroupMode
-
Defines the field by which the hits are grouped. Choose whichever value is unique: 0 = externalId, 1 = Id. Default is 0.
- similarityValueForSearchHit
-
Minimum similarity at which the comparison is classified as a hit. Default is 0.9.
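The interplay of a few of these parameters can be illustrated with a minimal Scala sketch. It does not use the library's class: SimilarityConfigSketch is a hypothetical stand-in that mirrors only a subset of the parameters and defaults listed above, the parameter types are assumptions, and yearsMatch and isHit are hypothetical helpers showing one plausible reading of the documented thresholds rather than the library's actual logic.

```scala
// Hypothetical stand-in, NOT the library's SimilarityConfig. It mirrors a subset
// of the documented parameters and defaults; the parameter types are assumptions.
final case class SimilarityConfigSketch(
    checkDateForSearchHit: Boolean = true,
    matchSelectionMode: Int = 0,            // 0 = similarity, 1 = nofHits
    maxDateYearDifferenceForHit: Int = 2,   // tolerance in years (+/-)
    numberOfHitsForSearchHit: Int = 2,
    similarityValueForSearchHit: Double = 0.9
)

object SimilarityConfigSketch {
  // Assumed reading: two years match if they differ by at most the tolerance.
  // If the date check is disabled, the date never rules out a hit.
  def yearsMatch(cfg: SimilarityConfigSketch, a: Int, b: Int): Boolean =
    !cfg.checkDateForSearchHit || math.abs(a - b) <= cfg.maxDateYearDifferenceForHit

  // Assumed reading: mode 1 compares nofHits against its threshold,
  // any other mode (the default 0) compares the similarity against its threshold.
  def isHit(cfg: SimilarityConfigSketch, similarity: Double, nofHits: Int): Boolean =
    cfg.matchSelectionMode match {
      case 1 => nofHits >= cfg.numberOfHitsForSearchHit
      case _ => similarity >= cfg.similarityValueForSearchHit
    }
}

object Demo {
  def main(args: Array[String]): Unit = {
    import SimilarityConfigSketch._
    val shared = SimilarityConfigSketch()            // all documented defaults
    val byHits = shared.copy(matchSelectionMode = 1) // override a single value

    println(yearsMatch(shared, 1980, 1982)) // true: difference 2 is within +/- 2
    println(yearsMatch(shared, 1980, 1983)) // false: difference 3 exceeds the tolerance
    println(isHit(shared, 0.85, 3))         // false: 0.85 is below 0.9
    println(isHit(byHits, 0.85, 3))         // true: 3 hits reach the threshold of 2
  }
}
```

Holding one configuration value (here shared) and passing it to both the indexing and the searching side is one way to follow the note above about using the same configuration for both.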
Attributes
- Supertypes
-
trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any