Python: find closest string (from a list) to another string
Use difflib.get_close_matches.
>>> import difflib
>>> words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
>>> difflib.get_close_matches('Hello', words)
['hello', 'Hallo', 'hallo']
Please look at the documentation: by default the function returns at most 3 matches (n=3), best first, and only candidates whose similarity score is at least 0.6 (cutoff=0.6).
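To make those defaults concrete, here is a minimal sketch that exposes both parameters. The wrapper name `closest` and its default of a single result are my own choices for illustration, not part of difflib.

```python
import difflib

words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']

def closest(word, candidates, n=1, cutoff=0.6):
    """Return up to n candidates scoring at least cutoff, best match first.

    Hypothetical helper: a thin wrapper around difflib.get_close_matches.
    """
    return difflib.get_close_matches(word, candidates, n=n, cutoff=cutoff)

# Only the single best match.
best = closest('Hello', words)

# Allow more results but demand a higher similarity score.
strict = closest('Hello', words, n=5, cutoff=0.7)
```

Raising cutoff drops weaker candidates such as 'hallo', while n only caps how many results come back. Matching is case-sensitive, so lowercasing both sides first may be worthwhile if case should not count.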
Peter Norvig wrote an excellent article on spelling correction that includes the complete source code (21 lines).
http://norvig.com/spell-correct.html
The idea is to generate every string that is one edit away from your word:
hello -> helo (delete)
hello -> helol (transpose)
hello -> hallo (replace)
hello -> heallo (insert)
alphabet = 'abcdefghijklmnopqrstuvwxyz'  # not defined in the snippet; lowercase ASCII assumed

def edits1(word):
    # every way to cut the word into a left part a and a right part b
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in alphabet if b]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)
Now look up each of these edits in your list; every edit that appears there is a word one edit away from yours.
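The lookup step might be sketched as follows. This is only an illustration under a few assumptions: the helper name `one_edit_matches` is made up, the alphabet is lowercase ASCII, and only a single edit is considered (the article goes further than this).

```python
import string

def edits1(word, alphabet=string.ascii_lowercase):
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in alphabet if b]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def one_edit_matches(word, words):
    """Hypothetical helper: the word itself if known, else known words one edit away."""
    vocabulary = set(words)  # set membership keeps each lookup cheap
    if word in vocabulary:
        return {word}
    return edits1(word) & vocabulary

words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
matches = one_edit_matches('hollo', words)
```

With this alphabet, 'Hallo' is never produced as an edit because it contains an uppercase letter; normalising case beforehand would be one way around that.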
Peter's article is well worth reading.
Create a sorted list of your words and use the bisect module to find the position where your word would be inserted to keep the list sorted. From that position, take the k nearest neighbours above and below to get the 2k closest words. Note that "closest" here means closest in sort order, so the results are mostly words sharing a prefix with yours.