Fuzzy Matching: Smart Way of Finding Similar Names Using Fuzzywuzzy
Matching strings is probably one of the first natural language processing problems that people have encountered since we started using computers to handle data. Unlike numerical values, which can be compared with exact logic, it is very hard for a computer to say how alike two strings are. One may compare them character by character and get an idea of how many characters in the pair of strings are the same. Unfortunately, in most applications we need the computer to perceive strings as we do, and therefore we have to use fuzzy matching. Fuzzy matching on names is never straightforward, though: how “different” two names are really depends on the case. For example, with restaurant names, matches on words like “cafe”, “bar” and “restaurant” are considered less valuable than matches on other, less common words. Also, do we consider company names that match partly (like “Happy Unicorn company” and “Happy Unicorn co.”) to be the same?
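To make the ideas above concrete, here is a minimal sketch of a position-by-position comparison, a fuzzy similarity score, and the effect of discounting common words. It uses `difflib` from the Python standard library as a stand-in rather than Fuzzywuzzy itself (Fuzzywuzzy's `fuzz.ratio` gives a comparable score on a 0 to 100 scale). The `COMMON_WORDS` list and all function names are hypothetical, chosen only for illustration; a real list would depend on the data set.

```python
from difflib import SequenceMatcher

# Hypothetical list of low-value words; tune it for your own data.
COMMON_WORDS = {"cafe", "bar", "restaurant", "company", "co"}


def positional_matches(a: str, b: str) -> int:
    """Count positions at which both strings hold the same character."""
    return sum(x == y for x, y in zip(a, b))


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] based on matching blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def strip_common(name: str) -> str:
    """Lowercase the name, trim punctuation from tokens, drop common words."""
    tokens = (t.strip(".,") for t in name.lower().split())
    return " ".join(t for t in tokens if t not in COMMON_WORDS)


a, b = "Happy Unicorn company", "Happy Unicorn co."

# Naive view: how many characters line up position by position.
print(positional_matches(a, b))

# Fuzzy view: a score that tolerates the differing suffix.
print(round(similarity(a, b), 2))

# After discounting common words, the two names become identical.
print(similarity(strip_common(a), strip_common(b)))
```

Whether the last comparison should count as a true match is exactly the case-by-case decision described above: removing "company" and "co." treats the two names as the same business, which may or may not be what an application wants.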
See more of my talks on YouTube.
After a career in data science, Cheuk now brings her knowledge of data and passion for the tech community to her role as developer advocate for Anaconda. Cheuk constantly contributes to the open-source community by giving free talks and tutorials and organizing sprints to encourage diverse contributions.