esc.configuration

Members list

Type members

Classlikes

case class AiConfig(modelContextSize: Int = ..., modelBatchSize: Int = ..., modelGpuLayers: Int = ..., modelThreads: Int = ..., inferenceTemperature: Float = ..., inferenceTopK: Int = ..., inferenceTopP: Float = ..., inferenceMinP: Float = ..., inferenceRepeatPenalty: Float = ..., inferenceMaxTokens: Int = ..., inferencePresencePenalty: Float = ..., inferenceFrequencyPenalty: Float = ..., inferenceStopList: Array[String] = ..., agentSimilarityThresholdForHitToExplain: Double = ...)

Configuration class for the local LLM (llama.cpp via llama-cpp-java).

All parameters are preconfigured for deterministic behaviour, low randomness, and minimal hallucinations. This is essential for entity and name-matching use cases, where predictable and reproducible outputs are required.
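As a rough illustration of working with these defaults, the sketch below uses a local stand-in, AiConfigSketch, that mirrors only a few of the documented parameters with the default values listed under Value parameters. The stand-in and the demo object are hypothetical names, not part of the library. Because AiConfig is a case class, the same named-argument and copy pattern should carry over to it.

```scala
// Hypothetical stand-in mirroring a subset of AiConfig's documented parameters.
// Defaults are taken from the parameter descriptions; the real class has more fields.
final case class AiConfigSketch(
    modelContextSize: Int = 1024,
    modelGpuLayers: Int = 0,
    // Documented default: available processors minus two, but never below 1.
    modelThreads: Int = math.max(1, Runtime.getRuntime.availableProcessors() - 2),
    inferenceTemperature: Float = 0.6f,
    inferenceTopK: Int = 5,
    inferenceMaxTokens: Int = 256
)

object AiConfigSketchDemo {
  // All documented defaults.
  val defaults: AiConfigSketch = AiConfigSketch()

  // Override only what differs, using named arguments.
  val gpuConfig: AiConfigSketch =
    AiConfigSketch(modelGpuLayers = 20, modelContextSize = 2048)

  // Derive a stricter variant from an existing configuration.
  val stricter: AiConfigSketch =
    defaults.copy(inferenceTemperature = 0.2f, inferenceTopK = 1)

  def main(args: Array[String]): Unit = {
    println(defaults)
    println(gpuConfig)
    println(stricter)
  }
}
```

Named arguments keep call sites readable when a class has this many defaulted parameters, and copy leaves the original configuration untouched.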

Value parameters

agentSimilarityThresholdForHitToExplain

Similarity threshold used by match-explanation logic. Determines whether two entities are considered sufficiently similar for detailed explanation or justification. Default: 0.8

inferenceFrequencyPenalty

Penalizes tokens proportionally to how often they appear in the output, reducing repetitive patterns. Similar to OpenAI's frequency penalty. Default: 0.2

inferenceMaxTokens

Maximum number of tokens the model may generate for a single inference request. Protects against runaway generation. Default: 256

inferenceMinP

Minimum probability threshold for token sampling. Prevents sampling from extremely low-probability tokens, improving determinism. Default: 0.1

inferencePresencePenalty

Penalizes tokens that already appear in the output to increase topic diversity. Similar to OpenAI's presence penalty. Default: 0.2

inferenceRepeatPenalty

Penalty factor applied to recently used tokens to reduce repetitive outputs. Values slightly above 1.0 discourage repetition. Default: 1.5

inferenceStopList

Array of strings marking stop conditions. If the model generates any of these strings, inference stops immediately. Default: empty array

inferenceTemperature

Sampling temperature controlling randomness. Lower values produce more deterministic outputs; higher values allow more creativity. Recommended: 0.2 to 0.8. Default: 0.6

inferenceTopK

Limits sampling to the top K highest-probability tokens. Lower values reduce randomness and help avoid hallucinations. Default: 5

inferenceTopP

Nucleus sampling threshold. The model samples only from the smallest set of most probable tokens whose cumulative probability reaches top_p. Combines well with top_k. Default: 1.0 (disabled)

modelBatchSize

Number of tokens processed per batch during inference. Higher values may improve throughput but also increase memory consumption. Default: 32

modelContextSize

Size of the model’s context window in tokens. Determines how many tokens the model can keep in memory during inference. Default: 1024

modelGpuLayers

Number of model layers to offload onto the GPU. Requires llama.cpp compiled with GPU support. Improves inference speed if GPU is available. Use 0 to run entirely on CPU. Default: 0

modelThreads

Number of CPU threads used for inference. Defaults to the number of available CPU cores minus two, ensuring system responsiveness. Default: Runtime.availableProcessors - 2 (minimum 1)
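To make the effect of inferenceTopK and inferenceMinP more concrete, here is a minimal sketch of the two filters applied to a made-up candidate distribution. It is not the library's or llama.cpp's code. It assumes that min-p is relative to the most probable token, which is how llama.cpp's min-p sampler is commonly described; the object name, the helper names, and the example tokens are all hypothetical, and the filter order is illustrative only.

```scala
object SamplingFilterSketch {
  // Candidate tokens mapped to their probabilities.
  type Candidates = Map[String, Double]

  // top-k: keep only the k most probable candidates.
  def topK(candidates: Candidates, k: Int): Candidates =
    candidates.toSeq.sortBy { case (_, p) => -p }.take(k).toMap

  // min-p (assumed relative): drop candidates whose probability is below
  // `threshold` times the probability of the most likely candidate.
  def minP(candidates: Candidates, threshold: Double): Candidates =
    if (candidates.isEmpty) candidates
    else {
      val cutoff = candidates.values.max * threshold
      candidates.filter { case (_, p) => p >= cutoff }
    }

  // Rescale the surviving probabilities so that they sum to 1 again.
  def renormalize(candidates: Candidates): Candidates = {
    val total = candidates.values.sum
    if (total == 0.0) candidates
    else candidates.map { case (token, p) => token -> p / total }
  }

  // Made-up distribution for illustration.
  val example: Candidates = Map(
    "Smith" -> 0.60,
    "Smyth" -> 0.30,
    "Schmidt" -> 0.05,
    "Smit" -> 0.03,
    "Jones" -> 0.02
  )

  def main(args: Array[String]): Unit = {
    val afterTopK = topK(example, k = 3)             // drops the two least likely tokens
    val afterMinP = minP(afterTopK, threshold = 0.1) // cutoff is 0.1 * 0.60
    println(renormalize(afterMinP))
  }
}
```

With a small top-k and a non-zero min-p, only a handful of high-probability tokens remain eligible, which is what keeps the output close to deterministic even at a moderate temperature.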

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Sugar object for creating AiConfig when using Java. In Scala, "new AiConfig()" is exactly equivalent.

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
case class SimilarityConfig(normOrgLegalformWeight: Double = ..., normOrgCountryWeight: Double = ..., nameElementSimilarityForHit: Double = ..., matchSelectionMode: Int = ..., checkDateForSearchHit: Boolean = ..., dateComparisonMethod: Int = ..., maxDateYearDifferenceForHit: Int = ..., checkCountryForSearchHit: Boolean = ..., similarityValueForSearchHit: Double = ..., numberOfHitsForSearchHit: Int = ..., maxNumberOfCandidatesFromSearch: Int = ..., searchEntityGroupMode: Int = ..., allowOneLetterAbbreviation: Boolean = ..., oneLetterAbbreviationWeight: Double = ..., checkCountyForAdressSearch: Boolean = ..., numberOfHitsForAddressSearchHit: Int = ..., fuzzyScoreForAddressSearch: Double = ...)

Configuration class for normalization and similarity computation. Important: use the same configuration for indexing and for searching/comparing. Otherwise there may be unwanted side effects.

Value parameters

dateComparisonMethod

Defines which parts of a date are compared. Currently only 0 = year is supported. Default is 0.

allowOneLetterAbbreviation

Defines whether one-letter abbreviations are taken into account. With true, for example, Benjamin is a hit for B. Default is false.

checkCountryForSearchHit

Defines whether the country should be considered or not. Default is true.

checkCountyForAdressSearch

Defines whether the country should be considered in address search. Countries overrule stop or hit words. Default is true.

checkDateForSearchHit

Defines whether the date should be taken into account or not. Default is true.

fuzzyScoreForAddressSearch

Value of the fuzziness to identify individual elements of an address as hits. Value between 0.1 and 1. Default is 0.8.

matchSelectionMode

Defines how a match is determined: 0 = based on similarity, 1 = based on nofHits (number of hits). Default is 0.

maxDateYearDifferenceForHit

Defines the tolerance of the year comparison in number of years (+/-). Default is 2.

maxNumberOfCandidatesFromSearch

Defines the maximum number of candidates to be considered by the IR search, from which hits are then determined. Default is 10000.

nameElementSimilarityForHit

Minimum similarity to mark as hit. Default is 0.9.

normOrgCountryWeight

Weight (reduction) of a country match (recommended: < 1, default is 0.5).

normOrgLegalformWeight

Weight (reduction) of a legal form match (recommended: < 1, default is 0.25).

numberOfHitsForAddressSearchHit

Minimum number of elements that must be found for the address to be considered a hit. Default is 2.

numberOfHitsForSearchHit

Minimum nofHits (number of hits) at which the comparison is classified as a hit. Default is 2.

oneLetterAbbreviationWeight

If abbreviations are taken into account, this value defines the weight (reduction) of such a hit. Default is 0.5.

searchEntityGroupMode

Defines the field by which the hits are grouped. Choose the field whose values are unique: 0 = externalId, 1 = Id. Default is 0.

similarityValueForSearchHit

Minimum similarity at which the comparison is classified as a hit. Default is 0.9.
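The interplay of matchSelectionMode, the two hit thresholds, and the date tolerance described above could be pictured roughly as follows. This is a sketch under assumptions, not the library's matching logic: SimilarityConfigSketch is a hypothetical stand-in with a subset of the documented parameters and defaults, the helper functions are invented for illustration, and the thresholds are assumed to be inclusive.

```scala
// Hypothetical stand-in with a subset of SimilarityConfig's documented parameters.
final case class SimilarityConfigSketch(
    matchSelectionMode: Int = 0,
    similarityValueForSearchHit: Double = 0.9,
    numberOfHitsForSearchHit: Int = 2,
    checkDateForSearchHit: Boolean = true,
    maxDateYearDifferenceForHit: Int = 2
)

object HitDecisionSketch {
  // Mode 0 compares the similarity score, mode 1 the number of matching elements.
  // Assumption: both thresholds are inclusive.
  def passesThreshold(cfg: SimilarityConfigSketch, similarity: Double, nofHits: Int): Boolean =
    cfg.matchSelectionMode match {
      case 0 => similarity >= cfg.similarityValueForSearchHit
      case 1 => nofHits >= cfg.numberOfHitsForSearchHit
      case other =>
        throw new IllegalArgumentException(s"Unsupported matchSelectionMode: $other")
    }

  // Year comparison with a +/- tolerance; always passes when the date check is off.
  def yearWithinTolerance(cfg: SimilarityConfigSketch, queryYear: Int, candidateYear: Int): Boolean =
    !cfg.checkDateForSearchHit ||
      math.abs(queryYear - candidateYear) <= cfg.maxDateYearDifferenceForHit

  // A candidate counts as a hit only if both checks pass.
  def isHit(
      cfg: SimilarityConfigSketch,
      similarity: Double,
      nofHits: Int,
      queryYear: Int,
      candidateYear: Int
  ): Boolean =
    passesThreshold(cfg, similarity, nofHits) &&
      yearWithinTolerance(cfg, queryYear, candidateYear)

  def main(args: Array[String]): Unit = {
    val cfg = SimilarityConfigSketch()
    println(isHit(cfg, similarity = 0.93, nofHits = 1, queryYear = 1980, candidateYear = 1982))
    println(isHit(cfg.copy(matchSelectionMode = 1), 0.85, 3, 1980, 1980))
  }
}
```

Keeping one configuration value and passing it to both indexing and searching code is a simple way to follow the advice to use identical settings for both.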

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Sugar object for creating SimilarityConfig when using Java. In Scala, "new SimilarityConfig()" is exactly equivalent.

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type