A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians
arxiv.org
We interviewed twenty professional comedians who perform live shows in front of audiences and who use artificial intelligence in their artistic process as part of 3-hour workshops on “AI x Comedy” conducted at the Edinburgh Festival Fringe in August 2023 and online. The workshop consisted of a comedy writing session with large language models (LLMs), a human-computer interaction questionnaire to assess the Creativity Support Index of AI as a writing tool, and a focus group interrogating the comedians' motivations for and processes of using AI, as well as their ethical concerns about bias, censorship and copyright. Participants noted that existing moderation strategies used in safety filtering and instruction-tuned LLMs reinforced hegemonic viewpoints by erasing minority groups and their perspectives, and qualified this as a form of censorship. At the same time, most participants felt the LLMs did not succeed as a creativity support tool, by producing bland and biased comedy tropes, akin to “cruise ship comedy material from the 1950s, but a bit less racist”. Our work extends scholarship about the subtle difference between, on the one hand, harmful speech, and on the other hand, “offensive” language as a practice of resistance, satire and “punching up”. We also interrogate the global value alignment behind such language models, and discuss the importance of community-based value alignment and data ownership to build AI tools that better suit artists' needs.
I’ve been experimenting on creative writing tools with a bunch of writer friends, and the setup described in this paper is notoriously shit. I mean they come up to ChatGPT on v3.5 (or Bard lmao) and expect it to write comedy ? Jeez talk about setting yourself up for failure. That’s like walking up to a junior screenwriter and yelling “GIVE ME A JOKE” to them. I don’t understand why people keep repeating that mistake, they design experiments where they expect the model to be the source of creativity but that’s just stupid.
If you want to get output that is not entirely mediocre, you need something like a Dramatron architecture where you decouple various tasks (fleshing out characters, outlining at the episode level, outlining at the scene level, writing dialogues etc…) and maintain internal memory of what is being worked on. It is non-trivial to set up but it gets there sometimes - even the authors of this paper recognize that this would probably have produced better results. You also need a user able to provide good ideas that the model can work with, you can’t expect the good creative stuff to come from the robot.
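To make the "decouple the tasks, keep a memory" idea concrete, here is a rough sketch. It is not Dramatron's actual code, just the general shape: each stage gets its own prompt, and every stage sees the premise plus whatever earlier stages produced. The `generate` callable, the stage instructions and `echo_model` are all made-up placeholders, so it runs without any model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Generate = Callable[[str], str]


@dataclass
class StoryMemory:
    """Shared notes that every stage can read and that grow as stages finish."""
    premise: str  # the human-supplied idea the model has to work with
    notes: Dict[str, str] = field(default_factory=dict)

    def context(self) -> str:
        lines = [f"PREMISE: {self.premise}"]
        lines += [f"{name.upper()}: {text}" for name, text in self.notes.items()]
        return "\n".join(lines)


# Stage names follow the tasks listed above; the instructions are invented.
STAGES: List[Tuple[str, str]] = [
    ("characters", "Flesh out the main characters."),
    ("episode_outline", "Outline the episode."),
    ("scene_outline", "Break the episode into scenes."),
    ("dialogue", "Write dialogue for the first scene."),
]


def run_pipeline(premise: str, generate: Generate) -> StoryMemory:
    """Run each stage in order, feeding it everything written so far."""
    memory = StoryMemory(premise=premise)
    for name, instruction in STAGES:
        prompt = f"{memory.context()}\n\nTASK: {instruction}"
        memory.notes[name] = generate(prompt)
    return memory


def echo_model(prompt: str) -> str:
    """Dummy generator so the sketch runs without any model."""
    task = prompt.rsplit("TASK: ", 1)[-1]
    return f"[draft for: {task}]"


if __name__ == "__main__":
    result = run_pipeline("A robot walks into a bar.", echo_model)
    print(result.context())
```

In a real setup you would swap `echo_model` for an actual model call and let the user edit `memory.notes` between stages, which is where the human ideas come in.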
Instinctively i’d say you have to treat the model like your own junior writer, and how do you make a junior writer useful ? By teaching them to “yes, and” in a writing room with better writers (in this case, the user). In that context, with a good experienced user at the helm, it can definitely bring value. Nothing groundbreaking but i can see how a refined version of this could help, notably with consistency, story beats, pacing, the boring stuff. GPTs are better critics than they are writers anyway.
That being said i never really pursued “pure comedy” on LLMs as it sounds like a lost battle. In my mind it’s kind of like tickling : if a machine pokes your ribs you don’t get the tickles, that only works when a human does it. I doubt they can fix that in the short or mid term.
That’s a lot of words to say, “You’re holding it wrong.”
More like “you’re trying to paint with a hammer AND you’re holding it wrong”
Yeah, these things are supposed to be good at writing, aren’t they?
Define “good at writing”. Good comedy is very difficult to attain and none of the models are anywhere near it, including the more recent ones.
I don’t want to.
I concur.
a true conversationalist lmao you’re doing great buddy
Dunno what you want me to say. Define the vague concept of “good writing”?
The linked study finds that ChatGPT 3.5 and Bard suck at writing comedy. You claim in so many words that this should be obvious (along with a really dubious claim that machines can’t tickle people for some reason). I’m also not surprised that these models are terrible at writing comedy, because even at the best of times I find their output bland, trite and crudely stripped of anything potentially divisive.
However, lots of people seem to think that LLMs are good at writing-related tasks, so I don’t think it’s inherently obvious that these tools suck at writing comedy in particular.
All these words make this reply much less fun to write.
oh, you were looking for the lmao conversations room? you missed the turn: it’s the last door inside clown school. you’re not even in the right building atm!
hey you’re the creepy guy who reads comment history before replying to a conversation aren’t you ?
I thought the point of posting your ideas on a public forum was to have people read them.
nah, there’s nothing creepier than giving some shithead the common courtesy of checking their post history to see if they’re somehow like this all the time or if they’re just having a particularly bad night
gonna be honest, I didn’t give this one that common courtesy cause once they get to the stage where they creepjacket other posters for looking at their previous terrible posts, whatever Reddit has done to their brain is severe and irreversible
this isn’t Oprah Ft Holographic Dr Phil, stop projecting
I merely looked after you went 3 for 3 on idiotic posts
oh yeah silly me. That’s definitely not creepy 👍
I don’t understand why you’re getting downvoted. You should read the room, delete your posts, and leave forever. Then you wouldn’t be getting downvoted.
Am i getting downvoted ? It says 3 upvotes / 0 downvotes on my end.
Here, have another invisible downvote.
it had such strong “well all the people in my town don’t seem to have a problem with me” energy
(I can see it happening if they ignore offsite downvotes on that lemmy, but yeah)
Idk if downvotes don’t federate at all or if it’s homegrown jank, but I’ve never seen a downvote on another instance’s post.
I like how you lose faith in your argument the longer your post goes on. Maybe start with the last sentence next time.
No i’m saying comedy (as in writing your jokes for you) is not something you should expect from language models. As a general rule, there is no tool that will make you a good writer, only (potentially) tools that can help you do more with your qualities as a writer. But it will never be funnier or more talented than you are.
That’s why i personally experiment with writing tools. Writing standup is one thing, but imagine you’re writing a sitcom or any form of serialized work. That’s a lot of fucking work and obviously if you’re starting out you can’t exactly afford to pay for assistant writers to do the menial labour that comes with it. Language models can come in handy in that scenario, but again you can’t expect them to be the genius in the room. If you want a good show, you have to bring the good ideas and the funnies. It’s a power tool, and power tools don’t draw the plans for the house, they just grind where you need grinding.
Give this promptfucker the props they deserve: usually they don’t just come out and say it.
comment history also includes simulation hypothesis and some very eagleflavoured political analysis
I have a prediction!
What’s “eagleflavoured” ?
you would think someone who experiments with creative writing “tools” might understand imagery, but when those “tools” are in fact just 3 GPTs in a trenchcoat, it’s not surprising when they don’t
I legit don’t get it. Is it about the US ? I mostly speak about France in my political comments so i’m not sure where they are going with that.
look, maybe the old adage “takes one to know one” could be disproven by this lack of recognition of tools. there might be a paper here!
Hey, want some comedy advice? Read the room.
I want to point out that this interminable motherfucker introduced themselves as someone who supposedly does creative writing