UCL Discovery

Advanced learning algorithms for cross-language patent retrieval and classification

Li, Y; Shawe-Taylor, J (2007) Advanced learning algorithms for cross-language patent retrieval and classification. Information Processing & Management, 43 (5), 1183-1199. doi:10.1016/j.ipm.2006.11.005.

Full text not available from this repository.

Abstract

We study several machine learning algorithms for cross-language patent retrieval and classification. In contrast to most other studies involving machine learning for cross-language information retrieval, which mainly applied learning techniques to monolingual sub-tasks, our learning algorithms exploit bilingual training documents and learn a semantic representation from them. We study Japanese-English cross-language patent retrieval using Kernel Canonical Correlation Analysis (KCCA), a method of correlating linear relationships between two variables in kernel-defined feature spaces. The results are encouraging and significantly better than those obtained by other state-of-the-art methods. We also investigate learning algorithms for cross-language document classification. These learning algorithms are based on KCCA and Support Vector Machines (SVM). In particular, we study two ways of combining KCCA and SVM and find that one particular combination, called SVM_2k, achieves better results than the other learning algorithms for either bilingual or monolingual test documents. (c) 2006 Elsevier Ltd. All rights reserved.
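
As an illustration of the KCCA idea mentioned in the abstract, the following is a minimal sketch of one common regularized formulation, in which the dual weights alpha solve (Kx + reg I)^-1 Ky (Ky + reg I)^-1 Kx alpha = rho^2 alpha. It is not taken from the paper: the function names, the linear kernel, the regularization value, and the synthetic two-view data are all assumptions chosen for demonstration only.

```python
import numpy as np


def center_kernel(K):
    """Center a kernel (Gram) matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kcca(Kx, Ky, reg=0.1, n_components=1):
    """Regularized KCCA on two Gram matrices built from paired samples.

    Hypothetical helper (not the paper's code). Returns dual weights
    alpha, beta and the estimated canonical correlations rho.
    """
    n = Kx.shape[0]
    Kx, Ky = center_kernel(Kx), center_kernel(Ky)
    I = np.eye(n)
    Rx = np.linalg.solve(Kx + reg * I, Ky)  # (Kx + reg I)^-1 Ky
    Ry = np.linalg.solve(Ky + reg * I, Kx)  # (Ky + reg I)^-1 Kx
    eigvals, eigvecs = np.linalg.eig(Rx @ Ry)
    # Eigenvalues are real and in [0, 1) in exact arithmetic; drop round-off.
    order = np.argsort(-eigvals.real)[:n_components]
    rho = np.sqrt(np.clip(eigvals.real[order], 0.0, 1.0))
    alpha = eigvecs[:, order].real
    beta = (Ry @ alpha) / np.maximum(rho, 1e-12)
    return alpha, beta, rho


# Synthetic stand-in for paired bilingual documents: two "views" that
# share one latent factor z (assumed data, for illustration only).
rng = np.random.default_rng(0)
z = rng.normal(size=50)
X = np.outer(z, [1.0, -0.5, 2.0]) + 0.1 * rng.normal(size=(50, 3))
Y = np.outer(z, [0.7, 1.5]) + 0.1 * rng.normal(size=(50, 2))

Kx, Ky = X @ X.T, Y @ Y.T  # linear kernels
alpha, beta, rho = kcca(Kx, Ky, reg=0.1, n_components=1)

# Project the training samples of each view onto the first shared direction.
proj_x = center_kernel(Kx) @ alpha[:, 0]
proj_y = center_kernel(Ky) @ beta[:, 0]
print("estimated correlation:", rho[0])
print("empirical correlation:", np.corrcoef(proj_x, proj_y)[0, 1])
```

In a retrieval setting, one would presumably project a query in one language and candidate documents in the other into this shared space and rank them by similarity; that step, and the choice of kernel for text, are left out of this sketch.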

Type: Article
Title: Advanced learning algorithms for cross-language patent retrieval and classification
DOI: 10.1016/j.ipm.2006.11.005
Keywords: machine learning, cross-language patent retrieval, cross-language document classification
UCL classification: UCL > School of BEAMS > Faculty of Engineering Science > Computer Science