%V 8645
%S Lecture Notes in Computer Science
%P 74-81
%D 2014
%C Cham, Switzerland
%K Phonetic Similarity, String Similarity, Fast Search
%N PART 2
%B Database and Expert Systems Applications
%T Fast phonetic similarity search over large repositories
%X Analysis of unstructured data may be inefficient in the presence of spelling errors. Existing approaches use string similarity methods to search for valid words within a text, with a supporting dictionary. However, they are not rich enough to encode phonetic information to assist the search. In this paper, we present a novel approach for efficiently performing phonetic similarity search over large data sources that uses a data structure called PhoneticMap to encode language-specific phonetic information. We validate our approach through an experiment over a data set using a Portuguese variant of a well-known repository, to automatically correct words with spelling errors.
%I Springer
%L discovery10065716
%A H Tissot
%A G Peschl
%A MD Del Fabro
%J Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
%O This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.