<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-gb">
	<link rel="self" type="application/atom+xml" href="https://forum.eggheads.org/app.php/feed/topic/17044" />

	<title>egghelp/eggheads community</title>
	<subtitle>Discussion of eggdrop bots, shell accounts and tcl scripts.</subtitle>
	<link href="https://forum.eggheads.org/index.php" />
	<updated>2009-07-25T17:22:44-04:00</updated>

	<author><name><![CDATA[egghelp/eggheads community]]></name></author>
	<id>https://forum.eggheads.org/app.php/feed/topic/17044</id>

		<entry>
		<author><name><![CDATA[MenzAgitat]]></name></author>
		<updated>2009-07-25T17:22:44-04:00</updated>

		<published>2009-07-25T17:22:44-04:00</published>
		<id>https://forum.eggheads.org/viewtopic.php?p=89658#p89658</id>
		<link href="https://forum.eggheads.org/viewtopic.php?p=89658#p89658"/>
		<title type="html"><![CDATA[[script/library] Levenshtein's distance v1.0]]></title>

		
		<content type="html" xml:base="https://forum.eggheads.org/viewtopic.php?p=89658#p89658"><![CDATA[
This script provides the package <strong class="text-strong">Levenshtein</strong>:<div class="codebox"><p>Code: </p><pre><code>package provide Levenshtein 1.0</code></pre></div><br><span style="font-size:150%;line-height:116%"><span style="text-decoration:underline">Description</span>:</span><br><br>In information theory and computer science, the Levenshtein distance is a metric for measuring the difference between two sequences (the so-called edit distance). The Levenshtein distance between two strings is the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character. A generalization, the Damerau–Levenshtein distance, also allows the transposition of two adjacent characters as an operation.<br><br>(See <a href="http://en.wikipedia.org/wiki/Levenshtein_distance" class="postlink">http://en.wikipedia.org/wiki/Levenshtein_distance</a>.)<br><br><br><span style="font-size:150%;line-height:116%"><span style="text-decoration:underline">Uses</span>:</span><br><ul><li>Lets a spell checker suggest alternative words that have a low Levenshtein distance to the misspelled one.</li><li>Lets a pseudo-AI tolerate spelling mistakes.</li><li>...</li></ul><br><span style="font-size:150%;line-height:116%"><span style="text-decoration:underline">Syntax</span>:</span><br><br><em class="text-italics">levenshtein::distance &lt;string 1&gt; &lt;string 2&gt;</em><br><br>You can also use a public command if you want to test things:<br><br><em class="text-italics">!test_levenshtein &lt;string 1&gt; &lt;string 2&gt;</em><br><br><br><span style="font-size:150%;line-height:116%"><span style="text-decoration:underline">Examples (in French, sorry)</span>:</span><div class="codebox"><p>Code: </p><pre><code>levenshtein::distance "bonjour" "bougeoir" -&gt; 4</code></pre></div>Four single-character edits are needed to transform the word "bonjour" into the word "bougeoir":<ul><li><strong class="text-strong">BONJOUR</strong></li><li><strong class="text-strong">BOUJOUR</strong> -&gt; replace N with U</li><li><strong class="text-strong">BOUGOUR</strong> -&gt; replace J with G</li><li><strong class="text-strong">BOUGEOUR</strong> -&gt; insert E</li><li><strong class="text-strong">BOUGEOIR</strong> -&gt; replace U with I</li></ul><div class="codebox"><p>Code: </p><pre><code>levenshtein::distance "antiquaire" "antikaire" -&gt; 2</code></pre></div>With a distance of only 2, the 10-letter word "antiquaire" is very similar to "antikaire".<br>Keep in mind that a distance of 2 between two 10-letter words means they are very similar, while a distance of 2 between two 3-letter words means they are very different, as the following example shows:<div class="codebox"><p>Code: </p><pre><code>levenshtein::distance "pin" "pas" -&gt; 2</code></pre></div>A distance of 2 is a minor difference in a 10-letter word but a major change in a 3-letter one.<br>To keep the results meaningful, always scale the tolerance in proportion to the word length.<div class="codebox"><p>Code: </p><pre><code>levenshtein::distance "antiquaire" "dimanche" -&gt; 8</code></pre></div>In this last example, the distance between the first word and the second is 8. 
They are very different words.<br><br><br><span style="font-size:150%;line-height:116%"><span style="text-decoration:underline">Download</span>:</span><br><br><a href="http://www.boulets-roxx.com/buffer/Levenshtein.tcl" class="postlink"><span style="text-decoration:underline">Levenshtein's distance v1.0</span></a><p>Statistics: Posted by <a href="https://forum.eggheads.org/memberlist.php?mode=viewprofile&amp;u=7933">MenzAgitat</a> on Sat Jul 25, 2009 5:22 pm</p><hr />
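<br>The definition given in the description (insertions, deletions and substitutions, each counted once) and the advice to scale the tolerance with word length can be sketched in Tcl roughly as follows. This is only a minimal illustration and not the code shipped in Levenshtein.tcl: the <em class="text-italics">lev_sketch</em> namespace, the <em class="text-italics">similar</em> helper and its default ratio of 0.25 are all assumptions made for this example.<br><br>

```tcl
# Hypothetical namespace, chosen so it cannot clash with the real
# levenshtein::distance provided by the package.
namespace eval lev_sketch {}

# Two-row dynamic-programming edit distance:
# insertion, deletion and substitution each cost 1.
proc lev_sketch::distance {a b} {
    set la [string length $a]
    set lb [string length $b]
    # prev: distances from the empty prefix of a to every prefix of b
    set prev {}
    for {set j 0} {$j <= $lb} {incr j} {
        lappend prev $j
    }
    for {set i 1} {$i <= $la} {incr i} {
        set ca [string index $a [expr {$i - 1}]]
        # first cell: i deletions turn a's prefix into the empty string
        set cur [list $i]
        for {set j 1} {$j <= $lb} {incr j} {
            set cost [expr {$ca eq [string index $b [expr {$j - 1}]] ? 0 : 1}]
            set del [expr {[lindex $prev $j] + 1}]
            set ins [expr {[lindex $cur [expr {$j - 1}]] + 1}]
            set sub [expr {[lindex $prev [expr {$j - 1}]] + $cost}]
            set best $del
            if {$ins < $best} { set best $ins }
            if {$sub < $best} { set best $sub }
            lappend cur $best
        }
        set prev $cur
    }
    return [lindex $prev $lb]
}

# Assumed helper: two words count as "close" when the distance is at most
# a fraction (ratio) of the longer word's length. 0.25 is an arbitrary default.
proc lev_sketch::similar {a b {ratio 0.25}} {
    set longest [string length $a]
    if {[string length $b] > $longest} { set longest [string length $b] }
    return [expr {[::lev_sketch::distance $a $b] <= $ratio * $longest}]
}
```

<br>With these definitions, <em class="text-italics">lev_sketch::similar "antiquaire" "antikaire"</em> returns 1 (2 edits for 10 letters), while <em class="text-italics">lev_sketch::similar "pin" "pas"</em> returns 0 (2 edits for only 3 letters).<br>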
]]></content>
	</entry>
	</feed>
