?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Human-Guided+Moral+Decision+Making+in+Text-Based+Games&rft.creator=Shi%2C+Z&rft.creator=Fang%2C+M&rft.creator=Chen%2C+L&rft.creator=Du%2C+Y&rft.creator=Wang%2C+J&rft.description=Training+reinforcement+learning+(RL)+agents+to+achieve+desired+goals+while+also+acting+morally+is+a+challenging+problem.+Transformer-based+language+models+(LMs)+have+shown+some+promise+in+moral+awareness%2C+but+their+use+in+different+contexts+is+problematic+because+of+the+complexity+and+implicitness+of+human+morality.+In+this+paper%2C+we+build+on+text-based+games%2C+which+are+challenging+environments+for+current+RL+agents%2C+and+propose+the+HuMAL+(Human-guided+Morality+Awareness+Learning)+algorithm%2C+which+adaptively+learns+personal+values+through+human-agent+collaboration+with+minimal+manual+feedback.+We+evaluate+HuMAL+on+the+Jiminy+Cricket+benchmark%2C+a+set+of+text-based+games+with+various+scenes+and+dense+morality+annotations%2C+using+both+simulated+and+actual+human+feedback.+The+experimental+results+demonstrate+that+with+a+small+amount+of+human+feedback%2C+HuMAL+can+improve+task+performance+and+reduce+immoral+behavior+in+a+variety+of+games%2C+and+is+adaptable+to+different+personal+values.&rft.publisher=Association+for+the+Advancement+of+Artificial+Intelligence+(AAAI)&rft.date=2024-03-25&rft.type=Proceedings+paper&rft.language=eng&rft.source=In%3A+Proceedings+of+the+AAAI+Conference+on+Artificial+Intelligence.+(pp.+21574-21582).+Association+for+the+Advancement+of+Artificial+Intelligence+(AAAI)+(2024)&rft.format=text&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10194860%2F1%2F30155-Article%2520Text-34209-1-2-20240324.pdf&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10194860%2F&rft.rights=open