Tan, Daniel CH;
Mccarthy, Robert;
Acero, Fernando;
Delfaki, Andromachi Maria;
Li, Zhibin;
Kanoulas, Dimitrios;
(2024)
Safe Value Functions: Learned Critics as Hard Safety Constraints.
In:
2024 IEEE 20th International Conference on Automation Science and Engineering (CASE).
(pp. 2441-2448).
IEEE: Bari, Italy.
Abstract
In the domain of safety-critical applications, there is a pressing need for control methods that are not only scalable but also verifiable. Traditional control strategies, which rely on certification processes, often struggle to adapt to the complexity inherent in these systems. Conversely, while reinforcement learning (RL) techniques show promise in scaling effectively, their verifiability remains a significant challenge. Our research introduces a novel approach that bridges this gap by offering strong guarantees on constraint satisfaction for general dynamical systems, diverging from previous works that primarily focus on certification. Our study delves into the prerequisites for the verification of learned Value Functions (VFs) through the lens of Control Barrier Function (CBF) attributes. We leverage the foundational principles of safe VFs (SVFs) to design a reward mechanism that inherently guides the optimal VF to embody a CBF. Our approach allows the resulting VF to restrict subsequent policy actions to safe trajectories in the context of complex control problems. Furthermore, we investigate the feasibility of conducting formal verification of VFs by exploiting CBF properties. This research marks a significant advancement towards achieving control methods that are both scalable to complex systems and amenable to rigorous verification processes. Through the integration of learning-based control with traditional safety guarantees, we pave the way for more reliable and efficient solutions in safety-critical applications. The code and supplementary video can be found on our webpage.
Type: | Proceedings paper |
---|---|
Title: | Safe Value Functions: Learned Critics as Hard Safety Constraints |
Event: | 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE) |
Dates: | 28 Aug 2024 - 1 Sep 2024 |
ISBN-13: | 979-8-3503-5851-3 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1109/CASE59546.2024.10711661 |
Publisher version: | http://dx.doi.org/10.1109/case59546.2024.10711661 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Robot control; Process control; Reinforcement learning; Self-supervised learning; Robot sensing systems; Safety; Trajectory; Reliability; Certification; Formal verification |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10201038 |