UCL Discovery

Safe Value Functions: Learned Critics as Hard Safety Constraints

Tan, Daniel CH; Mccarthy, Robert; Acero, Fernando; Delfaki, Andromachi Maria; Li, Zhibin; Kanoulas, Dimitrios; (2024) Safe Value Functions: Learned Critics as Hard Safety Constraints. In: 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE). (pp. 2441-2448). IEEE: Bari, Italy. Green open access

Abstract

In the domain of safety-critical applications, there is a pressing need for control methods that are not only scalable but also verifiable. Traditional control strategies, which rely on certification processes, often struggle to adapt to the complexity inherent in these systems. Conversely, while reinforcement learning (RL) techniques show promise in scaling effectively, their verifiability remains a significant challenge. Our research introduces a novel approach that bridges this gap by offering strong guarantees on constraint satisfaction for general dynamical systems, diverging from previous works that primarily focus on certification. Our study delves into the prerequisites for the verification of learned Value Functions (VFs) through the lens of Control Barrier Function (CBF) attributes. We leverage the foundational principles of safe VFs (SVFs) to design a reward mechanism that inherently guides the optimal VF to embody a CBF. Our approach allows the resulting VF to restrict subsequent policy actions to safe trajectories, in the context of complex control problems. Furthermore, we investigate the feasibility of conducting formal verification of VFs by exploiting CBF properties. This research marks a significant advancement towards achieving control methods that are both scalable to complex systems and amenable to rigorous verification processes. Through the integration of learning-based control with traditional safety guarantees, we pave the way for more reliable and efficient solutions in safety-critical applications. The code and supplementary video can be found on our webpage.
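
The abstract's central idea, a value function that behaves like a control barrier function and is used to reject unsafe actions, can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D dynamics in `step`, the hand-written `value` function standing in for a learned critic, the decay rate `ALPHA`, and the discrete candidate action set are all hypothetical choices made only for illustration. The check used is the standard discrete-time barrier condition V(x') >= (1 - alpha) V(x).

```python
import numpy as np

DT = 0.1      # integration step (assumed for illustration)
ALPHA = 0.5   # barrier decay rate in (0, 1] (assumed for illustration)

def step(x, a):
    """Hypothetical 1-D single-integrator dynamics: x' = x + a * dt."""
    return x + a * DT

def value(x):
    """Stand-in for a learned critic; non-negative exactly when |x| <= 1."""
    return 1.0 - x ** 2

def is_safe_action(x, a):
    """Discrete-time barrier condition: V(x') >= (1 - alpha) * V(x)."""
    return value(step(x, a)) >= (1.0 - ALPHA) * value(x)

def safety_filter(x, nominal_action, candidates):
    """Return the candidate closest to the nominal action that passes the check."""
    safe = [a for a in candidates if is_safe_action(x, a)]
    if not safe:
        raise ValueError("no candidate action satisfies the barrier condition")
    return min(safe, key=lambda a: abs(a - nominal_action))

# A nominal policy that always pushes towards the boundary at x = 1.
candidates = np.linspace(-1.0, 1.0, 21)
x = 0.9
for _ in range(50):
    a = safety_filter(x, nominal_action=1.0, candidates=candidates)
    x = step(x, a)
```

In this sketch the nominal action of 1.0 is rejected near the boundary and replaced by a smaller one, so the state approaches but does not leave the set where `value` is non-negative. A learned critic would replace `value`, and whether it actually satisfies the barrier condition is the verification question the abstract raises.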

Type: Proceedings paper
Title: Safe Value Functions: Learned Critics as Hard Safety Constraints
Event: 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE)
Dates: 28 Aug 2024 - 1 Sep 2024
ISBN-13: 979-8-3503-5851-3
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/CASE59546.2024.10711661
Publisher version: http://dx.doi.org/10.1109/case59546.2024.10711661
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Robot control; Process control; Reinforcement learning; Self-supervised learning; Robot sensing systems; Safety; Trajectory; Reliability; Certification; Formal verification
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10201038
