Katy Gero, Lecturer in Computer Science at the University of Sydney, has been investigating the implications of large language models in the creative writing field. During her time as a Fellow with the Library Innovation Lab, she conducted interviews within literary communities and their adjacent fields to understand what circumstances, if any, would lead literary writers to want their own work included as training data in a large language model. How would they want to be credited and/or compensated? Would they require restrictions on uses of the model? Are such restrictions feasible? What notion of consent is appropriate in this context?

She found that while most writers were opposed to the nonconsensual use of their work as training data for large language models, they were more concerned with (1) a lack of respect towards themselves and their work, (2) industry impacts rather than individual impacts, and (3) the power imbalances that resulted in a lack of agency over how their writing is being used. The resulting paper, “Creative Writers’ Attitudes on Writing as Training Data for Large Language Models,” won a Best Paper Award at the ACM Conference on Human Factors in Computing Systems.

Katy, along with other writers and technologists interested in this area, has been hosting workshops on how writers might regain agency in this situation by building their own language models for their own use. If you are interested in speaking with Katy or hearing more about her research, visit Katy’s website or reach out to katy.gero@sydney.edu.au.