Lara Martin is a Computing Innovation Fellow at the University of Pennsylvania. As part of her work teaching computers how to tell stories, a practice known as automated story generation, she has been investigating how computers could be taught to play Dungeons & Dragons. More generally, Martin’s work looks at natural language within human-centred artificial intelligence, focusing on how humans communicate in order to help design improved forms of AI.
Interview by Masoud Golsorkhi
Portrait by Lamont Abrams
Masoud Golsorkhi What was the thinking behind and methodology of your research? Your name is associated with Dungeons & Dragons and AI, but why did you pursue that particular route in using AI to tell stories? We’re familiar with playing more restrictive, linear, rule-based games, but you’re trying for something quite different.
Lara Martin I come more from a background in speech technologies, but I got into story generation because I thought, “Wouldn’t it be cool to add speech into these AI storytellers?” Then I realised we’re not quite ready for speech yet, so I started with text first and dove in. I didn’t think of it as Dungeons & Dragons at first; that came later. It was just about automated story generation, getting these computers to tell stories in a reasonable way. What’s cool about story generation as opposed to other natural language processing (NLP) areas of research is that it’s everywhere. And it encompasses all these challenging problems that we see elsewhere in NLP. It’s a really cool, rich area of research to which everyone can relate.
MG Computers are allegedly able to compose poetry quite successfully. So what is it specifically about storytelling that is such a challenge for learning machines?
LM One of the most challenging parts of storytelling in particular is that you need to convey long-term coherence. When you’re just having a conversation, there are a lot of dialogue systems, like, “Help me book a flight to London from Philadelphia.” And you talk to this agent and it helps you through that. Even though that might contain some long-term knowledge – it has to remember we’re working from Philadelphia – it’s still very constrained. We’re only talking about flights; it’s a very specific domain. It’s cool to think, “How can I get a system to tell a story about anything I want?” That’s really challenging and it’s something I’ve been trying to work towards. There are other NLP tasks, like “machine translation” – which is really just translation. That’s like, “How can I take this text and translate it into another language?” You don’t need much history for that. You don’t need to remember everything else that has been translated in the past, you just have to figure out what’s going on with this pair. Storytelling makes the problem much larger.
MG As I understand it, the strength of good translating machines is their ability to translate bigger and bigger blocks of text, and make sense of them by comparing them. Or by getting a human umpire to quality-check the accuracy, because that’s how they learn and improve. Is it about the relative bandwidth of experience that the machines can draw on or something else?
LM It’s a little more complicated than that. These more modern models can work with much bigger chunks of text. You would think that would be great for storytelling, and it works for short stories, but for the length of a chapter or a whole novel, it’s not going to keep that train of thought. If you just let them run, after a while these models end up producing streams of consciousness. The things they produce will make sense in the immediate history, but it’s not one cohesive text.
MG I interviewed Carlo Rovelli, the Italian physicist who wrote The Order of Time, and he convinced me that the idea of time is based on the fact that we are mortal, biological beings. We imagined this thing to exist, though physics can’t demonstrate it; it’s just that we are terrible, biological sacks of material that deteriorate over time. I wondered, is it always going to be the human who determines whether something makes sense or not?
LM There could be things that the network is picking up on and putting together that we might not appreciate. That could absolutely happen, but – and this is what makes computational creativity difficult – you can’t have a creative object in a vacuum. There has to be an experiencer. Maybe if you had another AI that learned to appreciate these things… but even then, I can’t imagine it not being eventually consumed by a human, because we’re the ones with emotions and real-life experiences. We can look at these stories and relate to them, even though they might not have had anything to do with our real lives. We can make those connections that wouldn’t have been there otherwise.
MG When you involve human beings, you inevitably involve culture. So the question is: how do you separate your work from a cultural bias that determines your judgements as a human being?
LM You don’t. Even if I have this giant corpus of data, that still contains this culture. It might be more of an international internet culture; that’s still a culture, and that’s often what these networks are trained on. The way to get around it is not to avoid culture, but to try to make it as harmless as possible.
MG The reason you think there’s usefulness in computers telling stories is because we can have longer conversations with them, right?
LM Not just longer, but richer conversations and so have a deeper understanding. One of the examples that I like to give is if you’re planning a birthday party. You talk to this agent and it walks you through the steps of this birthday-party scenario, a story. You might interject and say, “I can’t have balloons because little Susie is allergic to latex.” The agent then learns to adapt its story and do different things. It’s a very natural way we have of communicating and by tapping into that we can improve communication with these agents, for want of a better term.
MG A friend of mine has tried to explain to me the idea of a neural network as a series of quite basic algorithms, where you put in variables and they give you an answer. The difference is that when you put in lots of them, occasionally something sort of magical happens – an unexpected consequence. Do we understand exactly what happens?
LM No, we do not understand neural networks. We understand how they’re made and what’s going through them and how they’re calculating things, but we don’t understand exactly what they’re learning. We make analogies to our own intelligence – like you can add “attention” to a neural network, where you tell it to pay more attention to the nouns in a sentence.
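[A minimal illustration of the “attention” idea Martin describes: the model assigns each word in a sentence a weight, and words with higher weights contribute more to the output. The Python sketch below uses toy random vectors rather than a real trained network; all names and values are purely illustrative.]

```python
import numpy as np

def softmax(scores):
    # Turn raw scores into weights that sum to 1.
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()

# Toy "embeddings": one small vector per word (learned in a real model).
words = ["the", "dragon", "guards", "the", "treasure"]
embeddings = np.random.rand(len(words), 4)

# A query vector representing what the model is currently "looking for".
query = np.random.rand(4)

# Attention scores: similarity between the query and each word's vector.
scores = embeddings @ query
weights = softmax(scores)

# The output is a weighted blend of the word vectors; higher-weighted
# words (in Martin's example, perhaps the nouns) influence it more.
context = weights @ embeddings

for word, weight in zip(words, weights):
    print(f"{word:10s} attention weight: {weight:.2f}")
```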
MG Even though we can nudge the action, we don’t know the mechanics of how the action comes about. Is that right?
LM You can think about it by going back to the brain analogy that it came from originally. There are these clusters of neurons that we have in our heads, and there’s no possible way of poking in someone’s brain and saying, “This is where their concept of ‘cat’ is.” That’s just not a thing that you can do. You can’t really do that in neural networks either, partly because the information is so distributed.
MG I get the impression that for someone who works in the field of AI, you’re quite hesitant about its potential, at the same time as being excited by it. Is that fair?
LM I’m excited to see where the field is going, but at the same time, it’s very challenging. I can’t imagine ever getting to what’s called “general intelligence”. We’re not going to be able to create something that’s like a real living thing – at least not with the types of methods that we have right now. And all tools can be used for evil depending on whose hands they’re in. A lot of people who have good intentions might still be creating things that don’t turn out well – like Microsoft’s Tay.
MG But like God, you eventually have to let Adam and Eve go off into the world and see what they come up with. Maybe you just have to take the risk – or do you think that the risks are too high?
LM If you have that kind of mentality, you could have a somewhat working self-driving car and you put it out “in the wild”, and it’ll just go around hitting people and things. You don’t want that kind of behaviour. It’s not like a living thing where it’s trying to do something. It has no needs or wants or anything; it’s just following what you told it to do. Since what we’re telling these systems to do is getting more and more complicated, there can be more and more room for error. We might not see all the places where it can mess up.
MG I bow to your better judgement! Do you think this wave of interest in AI and its potential applications over the past few years will recede? Are people going to lose interest?
LM It’s possible, but I don’t think that’s where we’re headed. We’re not getting less technology as time goes on; things are getting more and more sophisticated, and as things get more sophisticated, they need more sophisticated software. Will conversational agents like Siri or Alexa go out of style? Possibly. Siri was a really great marketing campaign; the way it was set up really got people interested in using it. It started out as a novelty, but then all these systems – especially Echo and Google Home – gave us more reason to use them. But we’ve had this vision for a while. Just think back to the 1950s idea of the “home of the future”, where you talk to your oven. We’re getting close to that, but there’s also the question of whether people will want that. That’s kind of what happened with Google Glass. It was a really cool piece of technology, but not enough people really wanted it and it was ahead of its time. That’s the tricky part with technology – not only does it have to be useful, it also has to be understandable to people.
MG Do you think the invention of neural networks has improved cognitive science in a fundamental way?
LM That’s a good question. I would say it’s definitely improved cognitive science. AI is one of the six or so sub-fields of cognitive science. Even though neural networks are more of a metaphorical way of looking at brains, I still think they’re valuable because they give us another way of looking at how people think, and how information is learned and distributed.
MG It offers a parallel paradigm.
LM Yes, but a way more simplistic one. ◉