UCL logo

UCL Discovery

UCL home » Library Services » Electronic resources » UCL Discovery

Proceedings paper #79148

Saunders, C; Shawe-Taylor, J; Vinokourov, A; (2003) UNSPECIFIED In: (Proceedings) String Kernels, Fisher Kernels and Finite State Automata.

Full text not available from this repository.


In this paper we show how the generation of documents can be thought of as a k-stage Markov process, which leads to a Fisher kernel from which the n-gram and string kernels can be re-constructed. The Fisher kernel view gives a more flexible insight into the string kernel and suggests how it can be parametrised in a way that reflects the statistics of the training corpus. Furthermore, the probabilistic modelling approach suggests extending the Markov process to consider sub-sequences of varying length, rather than the standard fixed-length approach used in the string kernel. We give a procedure for determining which sub-sequences are informative features and hence generate a Finite State Machine model, which can again be used to obtain a Fisher kernel. By adjusting the parametrisation we can also influence the weighting received by the features. In this way we are able to obtain a logarithmic weighting in a Fisher kernel. Finally, experiments are reported comparing the different kernels using the standard Bag of Words kernel as a baseline

Type: Proceedings paper
Event: String Kernels, Fisher Kernels and Finite State Automata
Keywords: String Kernels
UCL classification: UCL > School of BEAMS
UCL > School of BEAMS > Faculty of Engineering Science
URI: http://discovery.ucl.ac.uk/id/eprint/79148
Downloads since deposit
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item