?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=How+Decoding+Strategies+Affect+the+Verifiability+of+Generated+Text&rft.creator=Massarelli%2C+L&rft.creator=Petroni%2C+F&rft.creator=Piktus%2C+A&rft.creator=Ott%2C+M&rft.creator=Rockt%C3%A4schel%2C+T&rft.creator=Plachouras%2C+V&rft.creator=Silvestri%2C+F&rft.creator=Riedel%2C+S&rft.description=Language+models+are+of+considerable+importance.+They+are+used+for+pretraining%2C+finetuning%2C+and+rescoring+in+downstream+applications%2C+and+as+a+test-bed+and+benchmark+for+progress+in+natural+language+understanding.+One+fundamental+question+regards+the+way+we+should+generate+text+from+a+language+model.+It+is+well+known+that+different+decoding+strategies+can+have+dramatic+impact+on+the+quality+of+the+generated+text+and+using+the+most+likely+sequence+under+the+model+distribution%2C+e.g.%2C+via+beam+search%2C+generally+leads+to+degenerate+and+repetitive+outputs.%0D%0AWhile+generation+strategies+such+as+top-k+and+nucleus+sampling+lead+to+more+natural+and+less+repetitive+generations%2C+the+true+cost+of+avoiding+the+highest+scoring+solution+is+hard+to+quantify.+In+this+paper%2C+we+argue+that+verifiability%2C+i.e.%2C+the+consistency+of+the+generated+text+with+factual+knowledge%2C+is+a+suitable+metric+for+measuring+this+cost.+We+use+an+automatic+fact-checking+system+to+calculate+new+metrics+as+a+function+of+the+number+of+supported+claims+per+sentence+and+find+that+sampling-based+generation+strategies%2C+such+as+top-k%2C+indeed+lead+to+less+verifiable+text.+This+finding+holds+across+various+dimensions%2C+such+as+model+size%2C+training+data+size+and+parameters+of+the+generation+strategy.+Based+on+this+finding%2C+we+introduce+a+simple+and+effective+generation+strategy+for+producing+non-repetitive+and+more+verifiable+(in+comparison+to+other+methods)+text.&rft.publisher=Association+for+Computational+Linguistics&rft.date=2019-11-20&rft.type=Proceedings+paper&rft.language=eng&rft.source=In%3A+Proceedings+of+the+Findings+of+the+Association+for+Computational+Linguistics%3A+EMNLP+2020.+(pp.+223-235).+Association+for+Computational+Linguistics+(2019)&rft.format=text&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10086496%2F1%2FRocktaschel_2020.findings-emnlp.22.pdf&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10086496%2F&rft.rights=open