Function std.numeric.gapWeightedSimilarityNormalized

The similarity computed by gapWeightedSimilarity has the drawback that it grows with the lengths of the two ranges, even when the ranges are not actually very similar. For example, the range ["Hello", "world"] becomes increasingly similar to the range ["Hello", "world", "world", "world",...] as more instances of "world" are appended. To prevent that, gapWeightedSimilarityNormalized computes a normalized version of the similarity, defined as gapWeightedSimilarity(s, t, lambda) / sqrt(gapWeightedSimilarity(s, s, lambda) * gapWeightedSimilarity(t, t, lambda)). The function gapWeightedSimilarityNormalized (a so-called normalized kernel) is bounded in [0, 1], reaches 0 only for ranges that don't match in any position, and 1 only for identical ranges.
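The growth effect described above can be observed with a small sketch. The helper helloWorlds is hypothetical (it is not part of std.numeric) and only builds the test range; lambda is set to 1 so that gaps are not penalized. The program prints the raw and normalized similarities side by side as more copies of "world" are appended.

```d
import std.numeric : gapWeightedSimilarity, gapWeightedSimilarityNormalized;
import std.stdio : writefln;

// Hypothetical helper (not part of Phobos): builds
// ["Hello", "world", "world", ...] with `copies` trailing "world"s.
string[] helloWorlds(size_t copies)
{
    string[] r = ["Hello"];
    foreach (_; 0 .. copies)
        r ~= "world";
    return r;
}

void main()
{
    string[] s = ["Hello", "world"];
    foreach (copies; 1 .. 5)
    {
        string[] t = helloWorlds(copies);
        // The raw similarity keeps growing with the length of t,
        // while the normalized value stays within [0, 1].
        writefln("copies=%s raw=%s normalized=%.3f", copies,
            gapWeightedSimilarity(s, t, 1),
            gapWeightedSimilarityNormalized(s, t, 1));
    }
}
```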

Select!(isFloatingPoint!F,F,double) gapWeightedSimilarityNormalized(alias comp, R1, R2, F) (
  R1 s,
  R2 t,
  F lambda,
  F sSelfSim = F.init,
  F tSelfSim = F.init
)
if (isRandomAccessRange!R1 && hasLength!R1 && isRandomAccessRange!R2 && hasLength!R2);

The optional parameters sSelfSim and tSelfSim exist to avoid duplicate computation. Many applications will already have computed gapWeightedSimilarity(s, s, lambda) and/or gapWeightedSimilarity(t, t, lambda). In that case, those values can be passed as sSelfSim and tSelfSim, respectively. If either parameter is left at its default, the corresponding self-similarity is computed by the function.
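A minimal sketch of reusing precomputed self-similarities follows. The value lambda = 0.5 and the variable names are chosen for illustration only; the check simply confirms that supplying the cached values gives the same result as letting the function compute them.

```d
import std.math.operations : isClose;
import std.numeric : gapWeightedSimilarity, gapWeightedSimilarityNormalized;

void main()
{
    string[] s = ["Hello", "brave", "new", "world"];
    string[] t = ["Hello", "new", "world"];
    double lambda = 0.5; // illustrative gap weight

    // Self-similarities that an application might already have cached.
    double sSelf = gapWeightedSimilarity(s, s, lambda);
    double tSelf = gapWeightedSimilarity(t, t, lambda);

    // Without cached values: both self-similarities are computed internally.
    double direct = gapWeightedSimilarityNormalized(s, t, lambda);

    // With cached values: only the cross similarity is computed.
    double cached = gapWeightedSimilarityNormalized(s, t, lambda, sSelf, tSelf);

    assert(isClose(direct, cached));
}
```

Passing the cached values is most useful when one range is compared against many others, since its self-similarity then needs to be computed only once.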

Example

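import std.math.algebraic : sqrt;
import std.math.operations : isClose;
import std.numeric : gapWeightedSimilarity, gapWeightedSimilarityNormalized;
import std.stdio : writeln;

string[] s = ["Hello", "brave", "new", "world"];
string[] t = ["Hello", "new", "world"];
writeln(gapWeightedSimilarity(s, s, 1)); // 15
writeln(gapWeightedSimilarity(t, t, 1)); // 7
writeln(gapWeightedSimilarity(s, t, 1)); // 7
// Normalized value: cross similarity over the geometric mean of the self-similarities.
assert(isClose(gapWeightedSimilarityNormalized(s, t, 1),
                7.0 / sqrt(15.0 * 7), 0.01));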

Authors

Andrei Alexandrescu, Don Clugston, Robert Jacques, Ilya Yaroshenko

License

Boost License 1.0.