Barker, Paul;
(1997)
Name Matching in an X.500 White Pages Directory.
Doctoral thesis (Ph.D), UCL (University College London).
Abstract
The expansion of data communications networks, the Internet in particular, has encouraged the development of a number of information retrieval services. One of the key services is a white pages directory service: a user provides the name and organisation of the person they are looking for, and the service returns details such as telephone and facsimile numbers, electronic and paper mail addresses, and so on. While many aspects of directory services have attracted a lot of research and design effort, relatively little attention has been focused on the problem of how best to use the name matching facilities provided by directory service systems to find the directory entries that users require.
There are essentially three components to the name matching problem. First, we need to know how users formulate their queries: for example, do users tend to use full name forms, abbreviations or sets of initials? How much of a problem is misspelled input? Second, we need to know the sort of names that directory administrators store in the directory: the directory's distributed management means that name formats vary from organisation to organisation. Third, a directory service provides a set of facilities for matching user input to directory names. Which are the most effective facilities? Can we devise strategies that deliver the correct results, and do so without returning many spurious results?
The core work in this thesis is empirical. My experimentation is based on the use of X.500, the international standard for directory services. I have gathered data from the NameFLOW-Paradise directory, a well-established X.500 directory. I use query and directory name data taken from this service in a series of experiments to test various name matching strategies. The experiments are based on the facilities provided by X.500.
The similarity between the name matching facilities provided by X.500 and other directory services means that the findings should be broadly applicable to non-X.500 directory services. This is particularly true of the study of approximate matching. Several directory services, including X.500, allow for approximate matching but do not specify which algorithm should be used to implement this type of matching. In practice, Soundex has been widely used. However, as many directory users and administrators have been dissatisfied with Soundex, I have investigated whether any alternative algorithms have better matching characteristics in the white pages paradigm.
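
For readers unfamiliar with Soundex, the sketch below shows the commonly described American variant in Python. It is illustrative only: the abstract does not say which Soundex variant the directory implementations used, and this is not the code evaluated in the thesis. The function name `soundex` and the table `CODES` are assumptions made for this example.

```python
# Minimal sketch of American Soundex (assumed variant; not taken from the thesis).
# Consonant groups share a digit; vowels and y are dropped but separate
# repeated codes; h and w are dropped and do NOT separate repeated codes.
CODES = {}
for digit, letters in {
    "1": "bfpv",
    "2": "cgjkqsxz",
    "3": "dt",
    "4": "l",
    "5": "mn",
    "6": "r",
}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex key for name, or "" if it has no ASCII letters."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    key = [letters[0].upper()]
    prev = CODES.get(letters[0], "")  # code of the first letter suppresses an immediate repeat
    for ch in letters[1:]:
        if ch in "hw":
            continue  # skipped without resetting prev
        code = CODES.get(ch, "")  # vowels and y map to ""
        if code and code != prev:
            key.append(code)
            if len(key) == 4:
                break
        prev = code
    return "".join(key).ljust(4, "0")


if __name__ == "__main__":
    for surname in ("Smith", "Smyth", "Robert", "Rupert"):
        print(surname, soundex(surname))
```

Under this scheme "Smith" and "Smyth" share a key, which is the intended behaviour, but so do "Robert" and "Rupert", which illustrates the kind of spurious match that can make Soundex unsatisfactory for name lookup.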
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Name Matching in an X.500 White Pages Directory |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Thesis digitised by ProQuest. |
URI: | https://discovery.ucl.ac.uk/id/eprint/10103837 |