<> <http://www.w3.org/2000/01/rdf-schema#comment> "The repository administrator has not yet configured an RDF license."^^<http://www.w3.org/2001/XMLSchema#string> . <> <http://xmlns.com/foaf/0.1/primaryTopic> <https://discovery.ucl.ac.uk/id/eprint/10132236> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/ontology/bibo/Article> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/title> "Prioritized Level Replay"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/abstract> "Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning. In this setting, each level is an algorithmically created environment instance with a unique configuration of its factors of variation. Training on a prespecified subset of levels allows for testing generalization to unseen levels. What can be learned from a level depends on the current policy, yet prior work defaults to uniform sampling of training levels independently of the policy. We introduce Prioritized Level Replay (PLR), a general framework for selectively sampling the next training level by prioritizing those with higher estimated learning potential when revisited in the future. We show TD-errors effectively estimate a level’s future learning potential and, when used to guide the sampling procedure, induce an emergent curriculum of increasingly difficult levels. By adapting the sampling of training levels, PLR significantly improves sample-efficiency and generalization on Procgen Benchmark—matching the previous state-of-the-art in test return—and readily combines with other methods. Combined with the previous leading method, PLR raises the state-of-the-art to over 76% improvement in test return relative to standard RL baselines."^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/date> "2021" . <https://discovery.ucl.ac.uk/id/document/1350697> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/ontology/bibo/Document> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/volume> "139" . <https://discovery.ucl.ac.uk/id/org/ext-7847d4d9e80a0eac828e0aaf09f24b1c> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Organization> . <https://discovery.ucl.ac.uk/id/org/ext-7847d4d9e80a0eac828e0aaf09f24b1c> <http://xmlns.com/foaf/0.1/name> "PMLR: Proceedings of Machine Learning Research"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/publisher> <https://discovery.ucl.ac.uk/id/org/ext-7847d4d9e80a0eac828e0aaf09f24b1c> . <https://discovery.ucl.ac.uk/id/publication/ext-097a6c5f7bb750d74ba5f2486cd46cc8> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/ontology/bibo/Collection> . <https://discovery.ucl.ac.uk/id/publication/ext-097a6c5f7bb750d74ba5f2486cd46cc8> <http://xmlns.com/foaf/0.1/name> "ICML"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/isPartOf> <https://discovery.ucl.ac.uk/id/publication/ext-097a6c5f7bb750d74ba5f2486cd46cc8> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/status> <http://purl.org/ontology/bibo/status/published> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/creator> <https://discovery.ucl.ac.uk/id/person/ext-c49eca0d6116e1ca4a9d1327aeb61781> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/authorList> <https://discovery.ucl.ac.uk/id/eprint/10132236#authors> . <https://discovery.ucl.ac.uk/id/eprint/10132236#authors> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> <https://discovery.ucl.ac.uk/id/person/ext-c49eca0d6116e1ca4a9d1327aeb61781> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/creator> <https://discovery.ucl.ac.uk/id/person/ext-928c3e594b025f0ad4b65ac9583b4540> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/authorList> <https://discovery.ucl.ac.uk/id/eprint/10132236#authors> . <https://discovery.ucl.ac.uk/id/eprint/10132236#authors> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> <https://discovery.ucl.ac.uk/id/person/ext-928c3e594b025f0ad4b65ac9583b4540> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/creator> <https://discovery.ucl.ac.uk/id/person/ext-f62653d603b1209acc1813d1830e9700> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/authorList> <https://discovery.ucl.ac.uk/id/eprint/10132236#authors> . <https://discovery.ucl.ac.uk/id/eprint/10132236#authors> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3> <https://discovery.ucl.ac.uk/id/person/ext-f62653d603b1209acc1813d1830e9700> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://www.loc.gov/loc.terms/relators/EDT> <https://discovery.ucl.ac.uk/id/person/ext-17d46fbb8369d7c9d6519207e70a1782> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/editorList> <https://discovery.ucl.ac.uk/id/eprint/10132236#editors> . <https://discovery.ucl.ac.uk/id/eprint/10132236#editors> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> <https://discovery.ucl.ac.uk/id/person/ext-17d46fbb8369d7c9d6519207e70a1782> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://www.loc.gov/loc.terms/relators/EDT> <https://discovery.ucl.ac.uk/id/person/ext-b4b8595c85b96756cdd440831b2199a2> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/ontology/bibo/editorList> <https://discovery.ucl.ac.uk/id/eprint/10132236#editors> . <https://discovery.ucl.ac.uk/id/eprint/10132236#editors> <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> <https://discovery.ucl.ac.uk/id/person/ext-b4b8595c85b96756cdd440831b2199a2> . <https://discovery.ucl.ac.uk/id/person/ext-b4b8595c85b96756cdd440831b2199a2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> . <https://discovery.ucl.ac.uk/id/person/ext-b4b8595c85b96756cdd440831b2199a2> <http://xmlns.com/foaf/0.1/givenName> "T"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-b4b8595c85b96756cdd440831b2199a2> <http://xmlns.com/foaf/0.1/familyName> "Zhang"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-b4b8595c85b96756cdd440831b2199a2> <http://xmlns.com/foaf/0.1/name> "T Zhang"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-17d46fbb8369d7c9d6519207e70a1782> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> . <https://discovery.ucl.ac.uk/id/person/ext-17d46fbb8369d7c9d6519207e70a1782> <http://xmlns.com/foaf/0.1/givenName> "M"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-17d46fbb8369d7c9d6519207e70a1782> <http://xmlns.com/foaf/0.1/familyName> "Meila"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-17d46fbb8369d7c9d6519207e70a1782> <http://xmlns.com/foaf/0.1/name> "M Meila"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-928c3e594b025f0ad4b65ac9583b4540> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> . <https://discovery.ucl.ac.uk/id/person/ext-928c3e594b025f0ad4b65ac9583b4540> <http://xmlns.com/foaf/0.1/givenName> "E"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-928c3e594b025f0ad4b65ac9583b4540> <http://xmlns.com/foaf/0.1/familyName> "Grefenstette"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-928c3e594b025f0ad4b65ac9583b4540> <http://xmlns.com/foaf/0.1/name> "E Grefenstette"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-c49eca0d6116e1ca4a9d1327aeb61781> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> . <https://discovery.ucl.ac.uk/id/person/ext-c49eca0d6116e1ca4a9d1327aeb61781> <http://xmlns.com/foaf/0.1/givenName> "M"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-c49eca0d6116e1ca4a9d1327aeb61781> <http://xmlns.com/foaf/0.1/familyName> "Jiang"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-c49eca0d6116e1ca4a9d1327aeb61781> <http://xmlns.com/foaf/0.1/name> "M Jiang"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-f62653d603b1209acc1813d1830e9700> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> . <https://discovery.ucl.ac.uk/id/person/ext-f62653d603b1209acc1813d1830e9700> <http://xmlns.com/foaf/0.1/givenName> "T"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-f62653d603b1209acc1813d1830e9700> <http://xmlns.com/foaf/0.1/familyName> "Rocktäschel"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/person/ext-f62653d603b1209acc1813d1830e9700> <http://xmlns.com/foaf/0.1/name> "T Rocktäschel"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/EPrint> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/ProceedingsSectionEPrint> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/terms/isPartOf> <https://discovery.ucl.ac.uk/id/repository> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://eprints.org/ontology/hasDocument> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350697> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/Document> . <https://discovery.ucl.ac.uk/id/document/1350697> <http://www.w3.org/2000/01/rdf-schema#label> "Prioritized Level Replay (Text)"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://purl.org/dc/elements/1.1/hasVersion> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://eprints.org/ontology/hasPublished> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350697> <http://eprints.org/ontology/hasFile> <https://discovery.ucl.ac.uk/id/eprint/10132236/1/jiang21b.pdf> . <https://discovery.ucl.ac.uk/id/document/1350697> <http://purl.org/dc/terms/hasPart> <https://discovery.ucl.ac.uk/id/eprint/10132236/1/jiang21b.pdf> . <https://discovery.ucl.ac.uk/id/eprint/10132236/1/jiang21b.pdf> <http://www.w3.org/2000/01/rdf-schema#label> "jiang21b.pdf"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://eprints.org/ontology/hasDocument> <https://discovery.ucl.ac.uk/id/document/1350699> . <https://discovery.ucl.ac.uk/id/document/1350699> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/Document> . <https://discovery.ucl.ac.uk/id/document/1350699> <http://www.w3.org/2000/01/rdf-schema#label> "Prioritized Level Replay (Other)"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/document/1350699> <http://eprints.org/relation/isVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350699> <http://eprints.org/relation/isVolatileVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350699> <http://eprints.org/relation/ispreviewThumbnailVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350699> <http://eprints.org/ontology/hasFile> <https://discovery.ucl.ac.uk/id/eprint/10132236/3/preview.jpg> . <https://discovery.ucl.ac.uk/id/document/1350699> <http://purl.org/dc/terms/hasPart> <https://discovery.ucl.ac.uk/id/eprint/10132236/3/preview.jpg> . <https://discovery.ucl.ac.uk/id/eprint/10132236/3/preview.jpg> <http://www.w3.org/2000/01/rdf-schema#label> "preview.jpg"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://eprints.org/ontology/hasDocument> <https://discovery.ucl.ac.uk/id/document/1350700> . <https://discovery.ucl.ac.uk/id/document/1350700> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/Document> . <https://discovery.ucl.ac.uk/id/document/1350700> <http://www.w3.org/2000/01/rdf-schema#label> "Prioritized Level Replay (Other)"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/document/1350700> <http://eprints.org/relation/isVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350700> <http://eprints.org/relation/isVolatileVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350700> <http://eprints.org/relation/ismediumThumbnailVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350700> <http://eprints.org/ontology/hasFile> <https://discovery.ucl.ac.uk/id/eprint/10132236/4/medium.jpg> . <https://discovery.ucl.ac.uk/id/document/1350700> <http://purl.org/dc/terms/hasPart> <https://discovery.ucl.ac.uk/id/eprint/10132236/4/medium.jpg> . <https://discovery.ucl.ac.uk/id/eprint/10132236/4/medium.jpg> <http://www.w3.org/2000/01/rdf-schema#label> "medium.jpg"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://eprints.org/ontology/hasDocument> <https://discovery.ucl.ac.uk/id/document/1350701> . <https://discovery.ucl.ac.uk/id/document/1350701> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/Document> . <https://discovery.ucl.ac.uk/id/document/1350701> <http://www.w3.org/2000/01/rdf-schema#label> "Prioritized Level Replay (Other)"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/document/1350701> <http://eprints.org/relation/isVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350701> <http://eprints.org/relation/isVolatileVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350701> <http://eprints.org/relation/issmallThumbnailVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350701> <http://eprints.org/ontology/hasFile> <https://discovery.ucl.ac.uk/id/eprint/10132236/5/small.jpg> . <https://discovery.ucl.ac.uk/id/document/1350701> <http://purl.org/dc/terms/hasPart> <https://discovery.ucl.ac.uk/id/eprint/10132236/5/small.jpg> . <https://discovery.ucl.ac.uk/id/eprint/10132236/5/small.jpg> <http://www.w3.org/2000/01/rdf-schema#label> "small.jpg"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://eprints.org/ontology/hasDocument> <https://discovery.ucl.ac.uk/id/document/1350702> . <https://discovery.ucl.ac.uk/id/document/1350702> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/Document> . <https://discovery.ucl.ac.uk/id/document/1350702> <http://www.w3.org/2000/01/rdf-schema#label> "Prioritized Level Replay (Other)"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/document/1350702> <http://eprints.org/relation/isVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350702> <http://eprints.org/relation/isVolatileVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350702> <http://eprints.org/relation/islightboxThumbnailVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350702> <http://eprints.org/ontology/hasFile> <https://discovery.ucl.ac.uk/id/eprint/10132236/6/lightbox.jpg> . <https://discovery.ucl.ac.uk/id/document/1350702> <http://purl.org/dc/terms/hasPart> <https://discovery.ucl.ac.uk/id/eprint/10132236/6/lightbox.jpg> . <https://discovery.ucl.ac.uk/id/eprint/10132236/6/lightbox.jpg> <http://www.w3.org/2000/01/rdf-schema#label> "lightbox.jpg"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://eprints.org/ontology/hasDocument> <https://discovery.ucl.ac.uk/id/document/1350703> . <https://discovery.ucl.ac.uk/id/document/1350703> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://eprints.org/ontology/Document> . <https://discovery.ucl.ac.uk/id/document/1350703> <http://www.w3.org/2000/01/rdf-schema#label> "Prioritized Level Replay (Other)"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/document/1350703> <http://eprints.org/relation/isVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350703> <http://eprints.org/relation/isVolatileVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350703> <http://eprints.org/relation/isIndexCodesVersionOf> <https://discovery.ucl.ac.uk/id/document/1350697> . <https://discovery.ucl.ac.uk/id/document/1350703> <http://eprints.org/ontology/hasFile> <https://discovery.ucl.ac.uk/id/eprint/10132236/7/indexcodes.txt> . <https://discovery.ucl.ac.uk/id/document/1350703> <http://purl.org/dc/terms/hasPart> <https://discovery.ucl.ac.uk/id/eprint/10132236/7/indexcodes.txt> . <https://discovery.ucl.ac.uk/id/eprint/10132236/7/indexcodes.txt> <http://www.w3.org/2000/01/rdf-schema#label> "indexcodes.txt"^^<http://www.w3.org/2001/XMLSchema#string> . <https://discovery.ucl.ac.uk/id/eprint/10132236> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <https://discovery.ucl.ac.uk/id/eprint/10132236/> . <https://discovery.ucl.ac.uk/id/eprint/10132236/> <http://purl.org/dc/elements/1.1/title> "HTML Summary of #10132236 \n\nPrioritized Level Replay\n\n" . <https://discovery.ucl.ac.uk/id/eprint/10132236/> <http://purl.org/dc/elements/1.1/format> "text/html" . <https://discovery.ucl.ac.uk/id/eprint/10132236/> <http://xmlns.com/foaf/0.1/primaryTopic> <https://discovery.ucl.ac.uk/id/eprint/10132236> .