%0 Generic
%A Tissot, H
%A Peschl, G
%A Del Fabro, MD
%C Cham, Switzerland
%D 2014
%F discovery:10065716
%I Springer
%K Phonetic Similarity, String Similarity, Fast Search
%N PART 2
%P 74-81
%T Fast phonetic similarity search over large repositories
%U https://discovery.ucl.ac.uk/id/eprint/10065716/
%V 8645
%X Analysis of unstructured data may be inefficient in the presence of spelling errors. Existing approaches use string similarity methods to search for valid words within a text, with a supporting dictionary. However, they are not rich enough to encode phonetic information to assist the search. In this paper, we present a novel approach to efficiently performing phonetic similarity search over large data sources, which uses a data structure called PhoneticMap to encode language-specific phonetic information. We validate our approach through an experiment over a data set using a Portuguese variant of a well-known repository, to automatically correct words with spelling errors.
%Z This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.