eprintid: 10194860
rev_number: 7
eprint_status: archive
userid: 699
dir: disk0/10/19/48/60
datestamp: 2024-07-19 13:46:58
lastmod: 2024-07-19 13:46:58
status_changed: 2024-07-19 13:46:58
type: proceedings_section
metadata_visibility: show
sword_depositor: 699
creators_name: Shi, Z
creators_name: Fang, M
creators_name: Chen, L
creators_name: Du, Y
creators_name: Wang, J
title: Human-Guided Moral Decision Making in Text-Based Games
ispublished: pub
divisions: UCL
divisions: B04
divisions: F48
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: Training reinforcement learning (RL) agents to achieve desired goals while also acting morally is a challenging problem. Transformer-based language models (LMs) have shown some promise in moral awareness, but their use in different contexts is problematic because of the complexity and implicitness of human morality. In this paper, we build on text-based games, which are challenging environments for current RL agents, and propose the HuMAL (Human-guided Morality Awareness Learning) algorithm, which adaptively learns personal values through human-agent collaboration with minimal manual feedback. We evaluate HuMAL on the Jiminy Cricket benchmark, a set of text-based games with various scenes and dense morality annotations, using both simulated and actual human feedback. The experimental results demonstrate that with a small amount of human feedback, HuMAL can improve task performance and reduce immoral behavior in a variety of games, and is adaptable to different personal values.
date: 2024-03-25
date_type: published
publisher: Association for the Advancement of Artificial Intelligence (AAAI)
official_url: http://dx.doi.org/10.1609/aaai.v38i19.30155
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2268263
doi: 10.1609/aaai.v38i19.30155
lyricists_name: Wang, Jun
lyricists_id: JWANG00
actors_name: Wang, Jun
actors_id: JWANG00
actors_role: owner
full_text_status: public
pres_type: paper
publication: Proceedings of the AAAI Conference on Artificial Intelligence
volume: 38
number: 19
pagerange: 21574-21582
event_title: The 38th Annual AAAI Conference on Artificial Intelligence
issn: 2159-5399
book_title: Proceedings of the AAAI Conference on Artificial Intelligence
citation: Shi, Z; Fang, M; Chen, L; Du, Y; Wang, J; (2024) Human-Guided Moral Decision Making in Text-Based Games. In: Proceedings of the AAAI Conference on Artificial Intelligence. (pp. 21574-21582). Association for the Advancement of Artificial Intelligence (AAAI) Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10194860/1/30155-Article%20Text-34209-1-2-20240324.pdf