The LIL team is excited to welcome Katy Gero, who joins us to investigate ethical language models for creative writing. Katy is a post-doc at the Variation Lab at Harvard SEAS, which is led by friend of LIL Elena Glassman.

As part of her work, Katy will be investigating under what circumstances, if any, literary writers would want their own work included as training data in a language model. Through interviews within literary communities and their adjacent fields, she intends to understand what kinds of data collection processes and notions of consent are appropriate in these communities.

Secondarily, Katy hopes to collect and release an open-source dataset, gathered in a manner those communities consider appropriate and drawing on connections made during the interviews. Time permitting, we would then train a Transformer model and begin investigating its utility compared to other available models. All findings and any resulting dataset will be made publicly available upon completion.

Katy’s work is part of our ongoing investigations into corners of the emerging AI landscape. Her particular interest in the creative writing world is of course a key point of overlap with the interests of the library world, but themes like copyright, consent, and communal knowledge also echo our values as a lab. To learn more about our AI work, you can visit our website.