?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Prioritized+Level+Replay&rft.creator=Jiang%2C+M&rft.creator=Grefenstette%2C+E&rft.creator=Rockt%C3%A4schel%2C+T&rft.description=Environments+with+procedurally+generated+content+serve+as+important+benchmarks+for+testing+systematic+generalization+in+deep+reinforcement+learning.+In+this+setting%2C+each+level+is+an+algorithmically+created+environment+instance+with+a+unique+configuration+of+its+factors+of+variation.+Training+on+a+prespecified+subset+of+levels+allows+for+testing+generalization+to+unseen+levels.+What+can+be+learned+from+a+level+depends+on+the+current+policy%2C+yet+prior+work+defaults+to+uniform+sampling+of+training+levels+independently+of+the+policy.+We+introduce+Prioritized+Level+Replay+(PLR)%2C+a+general+framework+for+selectively+sampling+the+next+training+level+by+prioritizing+those+with+higher+estimated+learning+potential+when+revisited+in+the+future.+We+show+TD-errors+effectively+estimate+a+level%E2%80%99s+future+learning+potential+and%2C+when+used+to+guide+the+sampling+procedure%2C+induce+an+emergent+curriculum+of+increasingly+difficult+levels.+By+adapting+the+sampling+of+training+levels%2C+PLR+significantly+improves+sample-efficiency+and+generalization+on+Procgen+Benchmark%E2%80%94matching+the+previous+state-of-the-art+in+test+return%E2%80%94and+readily+combines+with+other+methods.+Combined+with+the+previous+leading+method%2C+PLR+raises+the+state-of-the-art+to+over+76%25+improvement+in+test+return+relative+to+standard+RL+baselines.&rft.publisher=PMLR%3A+Proceedings+of+Machine+Learning+Research&rft.contributor=Meila%2C+M&rft.contributor=Zhang%2C+T&rft.date=2021&rft.type=Proceedings+paper&rft.language=eng&rft.source=In%3A+Meila%2C+M+and+Zhang%2C+T%2C+(eds.)+Proceedings+of+the+38th+International+Conference+on+Machine+Learning.+(pp.+4940-4950).+PMLR%3A+Proceedings+of+Machine+Learning+Research%3A+Online+Only.+(2021)&rft.format=text&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10132236%2F1%2Fjiang21b.pdf&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10132236%2F&rft.rights=open