CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation.
Tong Chen, Akari Asai, Niloofar Mireshghallah, Sewon Min, James Grimmelmann, Yejin Choi, Hannaneh Hajishirzi, Luke Zettlemoyer, Pang Wei Koh
Published in: CoRR (2024)
Keyphrases
- language generation
- computational linguistics
- text to speech synthesis
- human language
- english text
- information retrieval
- english language
- programming language
- language learning
- text mining
- textual data
- natural language
- database
- intellectual property rights
- keywords
- free text
- natural language processing
- specification language
- information extraction
- text understanding
- relational databases
- semantic representations
- language processing
- native language
- key concepts
- text retrieval