Bošnjak, M; Rocktäschel, T; Naradowsky, J; Riedel, S; (2017) Programming with a differentiable Forth interpreter. In: 5th International Conference on Learning Representations, ICLR 2017 - Workshop Track Proceedings.
International Conference on Learning Representations (ICLR): Toulon, France.
Abstract
There are families of neural networks that can learn to compute any function, provided sufficient training data. However, given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. Here we consider the case of prior procedural knowledge, such as knowing the overall recursive structure of a sequence transduction program or the fact that a program will likely use arithmetic operations on real numbers to solve a task. To this end we present a differentiable interpreter for the programming language Forth. Through a neural implementation of the dual stack machine that underlies Forth, programmers can write program sketches with slots that can be filled with behaviour trained from program input-output data. As the program interpreter is end-to-end differentiable, we can optimize this behaviour directly through gradient descent techniques on user specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and learn complex transduction tasks such as sequence sorting or addition with substantially less data and better generalisation over problem sizes. In addition, we introduce neural program optimisations based on symbolic computation and parallel branching that lead to significant speed improvements.
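To make the idea of a differentiable stack machine more concrete, here is a minimal sketch of a "soft" stack in plain Python. It is not the authors' implementation, and every name in it (`shift`, `read`, `write`, `push`, `pop`, `mix`) is hypothetical. The assumption is that the stack pointer is held as a probability distribution over cells, so that reads and writes become weighted sums and therefore smooth functions of the pointer. The `weight` argument of `mix` stands in for a quantity that a trained slot might produce; no training is shown.

```python
# Illustrative sketch only: a stack whose pointer is a probability
# distribution over cells rather than a single integer index.

def shift(pointer, step):
    """Circularly move the pointer distribution by `step` cells."""
    n = len(pointer)
    return [pointer[(i - step) % n] for i in range(n)]

def read(buffer, pointer):
    """Return the expected cell content under the pointer distribution."""
    width = len(buffer[0])
    return [sum(p * row[j] for p, row in zip(pointer, buffer))
            for j in range(width)]

def write(buffer, pointer, value):
    """Blend `value` into each cell in proportion to its pointer mass."""
    return [[(1 - p) * cell + p * v for cell, v in zip(row, value)]
            for p, row in zip(pointer, buffer)]

def push(buffer, pointer, value):
    """Advance the pointer by one cell, then write at the new position."""
    pointer = shift(pointer, 1)
    return write(buffer, pointer, value), pointer

def pop(buffer, pointer):
    """Read at the current position, then move the pointer back one cell."""
    return read(buffer, pointer), shift(pointer, -1)

def mix(weight, a, b):
    """Convex combination of two vectors (hypothetical stand-in for a slot)."""
    return [weight * x + (1 - weight) * y for x, y in zip(a, b)]

# With a one-hot pointer the soft stack behaves like an ordinary stack.
buffer = [[0.0, 0.0] for _ in range(4)]
pointer = [1.0, 0.0, 0.0, 0.0]
buffer, pointer = push(buffer, pointer, [1.0, 0.0])
buffer, pointer = push(buffer, pointer, [0.0, 1.0])
top, pointer = pop(buffer, pointer)   # top is the value pushed last

# With an uncertain pointer, a read blends the candidate cells.
soft_pointer = mix(0.5, [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0])
blended = read(buffer, soft_pointer)
```

Because each operation is built from sums and products, a gradient with respect to `weight` could in principle flow through `read` and `write`, which is the property the abstract relies on when it describes the interpreter as end-to-end differentiable.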
| Type: | Proceedings paper | 
|---|---|
| Title: | Programming with a differentiable Forth interpreter | 
| Event: | 5th International Conference on Learning Representations (ICLR 2017) | 
| Open access status: | An open access version is available from UCL Discovery | 
| Language: | English | 
| Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. | 
| UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science | 
| URI: | https://discovery.ucl.ac.uk/id/eprint/10081239 | 