“A letter may be coded, and a word may be coded. A theatrical performance may be coded, and a sonnet may be coded, and there are times when it seems the entire world is in code.”
This piece of philosophy comes from one of my favourite childhood authors, and it’s one which can often provide some comfort when the world feels mysterious and unreasonable: It’s not that the world doesn’t make sense, it’s that the sense it makes is obscured by a layer of puzzles and codes just waiting for you to figure them out.
That’s if you read the quote as referring to the more general meaning of the word ‘code’. For my exam in computational linguistics, I decided to take a more literal approach to it. Specifically, I’ve given a new meaning to the line “A sonnet may be coded”: I have written a piece of computer code capable of writing poetry of its own.
For those who know programming, this will seem relatively trivial. There are already lots of computer programmers who are teaching computers to create poetry with far more nuance and skill than anything I can do after my few months of playing around with Python (which in this context is the name of a coding language, making it only slightly more intimidating than an actual python snake). But if you’re a language nerd with no coding experience, and if you’re wondering what computer programming even has to do with linguistics, I’d like to show you my project as an example of how to make computers Do Things With Words.
A quick note before I begin describing my code: I’m writing this mainly for those who know nothing about programming (e.g. myself a year ago), so I won’t get into the specifics of coding in Python. So if you’re afraid that I’ll use terms like ‘integer’ or ‘coding environment’, don’t worry – I’m writing this specifically for you!
When I said that my code knows how to write poetry, I may have exaggerated slightly. In my exam paper, I define a poem as “a text which consists of generally accepted lexical items and scans to an established poetic meter”. Critically minded readers might notice that this definition makes no promise of either syntax or meaning, which, as any linguist can tell you, are fairly important parts of language. That’s because my code does in fact leave those parts out completely, producing poems such as:
Descend parade this parrot rat declare,
if prance profound compare two made is flare
Most of the poems are in a similar vein – texts which might as well have come from the nightmares of first-year morphosyntax students (imagine having to draw a sentence diagram of this!). But while the poems are complete nonsense, they do follow particular patterns – in this case, the pattern of a couplet, a poem consisting of two lines with a rhyme at the end.
For a text to fit into a pattern like that, it needs three things. First, it needs to have the correct number of syllables. Second, it needs to have stress on the right syllables. And third, it needs to rhyme in exactly the right places (in this case, at the end of each line). What I set out to do was to write a piece of code – a so-called function – to create texts that do those three things; and frankly, I knew from the beginning that also having the texts make sense was way above my level of experience.
So how was I going to do that? My function is made out of three parts. The first part contains information about all the different kinds of poems I wanted the function to know. I’ve taught it five different types of poems, each with different patterns of syllables, stress and rhymes.
The second part is the vocabulary. For this part, I basically just made a big box and filled it with words. But to each of these words, I attached a few pieces of information: How many syllables does the word have, which syllable is stressed, and what does it rhyme with?
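For readers who do want a peek behind the curtain, the first two parts could be sketched roughly like this. It is an illustration under assumptions, not the code from the exam paper: the names Word, COUPLET and VOCABULARY, the rhyme labels and the ten-syllable lines are all invented for the example. A small x marks an unstressed syllable and a capital X a stressed one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One entry in the 'big box' of words (hypothetical layout)."""
    text: str
    stress: str  # one character per syllable: "x" unstressed, "X" stressed
    rhyme: str   # hand-assigned label; words sharing a label rhyme

    @property
    def syllables(self) -> int:
        # The syllable count falls out of the stress pattern's length.
        return len(self.stress)


# Part one: a poem form, described by a stress pattern per line and a
# rhyme scheme. Assumed here: a couplet of two ten-syllable lines.
COUPLET = {
    "lines": ["xXxXxXxXxX", "xXxXxXxXxX"],
    "rhyme_scheme": "AA",
}

# Part two: the vocabulary, with the three pieces of information attached.
VOCABULARY = [
    Word("astound", "xX", "ound"),
    Word("profound", "xX", "ound"),
    Word("subdue", "xX", "oo"),
    Word("achoo", "xX", "oo"),
    Word("carrot", "Xx", "arrot"),
    Word("parrot", "Xx", "arrot"),
]

for word in VOCABULARY:
    print(word.text, word.syllables, word.stress, word.rhyme)
```

Storing the stress pattern as a string means the syllable count never has to be tagged separately, which is one less thing to get wrong by hand.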
The third part is the function itself. The function starts by asking a question: Which kind of poem should I write? The user answers this question by selecting one of the five kinds of poems I taught it. The function then looks through every single word in the vocabulary, and for each word it asks: Does this word fit in this kind of poem?
After the function has discarded all the words that don’t fit, it grabs random words from what’s left and fits them into the poem. And on a very basic level, that’s how the function works.
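The discard-then-grab step could look something like the sketch below. It is a simplified guess at the idea rather than the actual function: fill_line and VOCAB are hypothetical names, the stress tags are invented, and rhyme is ignored entirely here.

```python
import random

# Hypothetical mini-vocabulary: each word maps to its stress pattern,
# one character per syllable ("x" unstressed, "X" stressed).
VOCAB = {
    "the": "x", "rat": "X", "astound": "xX", "profound": "xX",
    "subdue": "xX", "carrot": "Xx", "parrot": "Xx", "contemplate": "XxX",
}


def fill_line(pattern, vocab, rng):
    """Fill a stress pattern from left to right with randomly chosen words."""
    words = []
    position = 0
    while position < len(pattern):
        # Discard every word whose stresses do not match the next slots.
        candidates = [w for w, stress in vocab.items()
                      if pattern.startswith(stress, position)]
        if not candidates:
            raise ValueError(f"no word fits at syllable {position}")
        # Grab a random word from what is left and move past its syllables.
        word = rng.choice(candidates)
        words.append(word)
        position += len(vocab[word])
    return words


line = fill_line("xXxXxXxXxX", VOCAB, random.Random())
print(" ".join(line))
```

Because this toy vocabulary contains a one-syllable word of each kind ("the" and "rat"), there is always something that fits the next slot. With a sparser vocabulary the sketch would simply raise an error.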
I say “on a very basic level”, because after I’d done this, I did run into a few issues. The first poem I printed looked like this:
As struck with poetic awe as you are right now, you’d probably agree that this poem would look better if there were spaces between the words. But even after adding that, there are still a few issues with the poem:
‘astound achoo achoo achoo astound /
subdue achoo profound profound astound’
Apart from the fact that the function seems to have suffered an allergic reaction in the middle of the first line, the biggest problem with this poem is that when the function had to pick a word to rhyme with ‘astound’, it just picked ‘astound’ again. While that’s technically a rhyme, it’s not a particularly impressive one, so I decided to add a bit of code to the function to prevent the function from rhyming a word with itself unless absolutely necessary (as it turns out, the function interprets ‘not unless absolutely necessary’ as ‘sometimes, but only in limericks’).
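One way to express "don't rhyme a word with itself unless absolutely necessary" is sketched below. This is an assumption about how such a rule might be written, not the code I actually added: pick_rhyme and the RHYMES table are made-up names, and the sketch ignores syllable count and stress.

```python
import random

# Hypothetical table: each word maps to a hand-assigned rhyme label.
RHYMES = {
    "astound": "ound", "profound": "ound",
    "subdue": "oo", "achoo": "oo",
    "lament": "ent",
}


def pick_rhyme(target, rhymes, rng):
    """Pick a word that rhymes with target, avoiding target itself if possible."""
    sound = rhymes[target]
    # First choice: any *other* word with the same rhyme label.
    others = [w for w, s in rhymes.items() if s == sound and w != target]
    if others:
        return rng.choice(others)
    # Absolutely necessary: nothing else rhymes, so repeat the word.
    return target


print(pick_rhyme("astound", RHYMES, random.Random()))  # the only option is "profound"
print(pick_rhyme("lament", RHYMES, random.Random()))   # falls back to "lament"
```

In a real poem the rhyming word also has to fit the line's stress pattern, which shrinks the list of other candidates and makes the fallback more likely.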
The sneeze attack happened because I hadn’t taught the function very many different words at this point, so it did the best it could to fill the poem by repeating the same words several times. But the limited vocabulary also showed in other ways. When the function couldn’t find any words to fit in a poem, I had programmed it to print the letter X for each syllable of that word – a small x for an unstressed syllable, and a capital X for a stressed one. So for example, this alleged limerick:
‘xXxxXxachoo / xXxxXxsubdue / xXxprofound / xXxastound / xXxxXxsubdue’
taught me that I was severely lacking in words of the xXx variety. At this point, briefly forgetting that my improvised way of encoding syllable structure was not universally embraced, I decided to find more words of this type by making an online search for ‘syllable structure xXx’. A word of advice: Be careful when searching for anything involving a sequence of three X’s, because as it turns out, search engines have a particular way of interpreting that.
Eventually, I managed to tweak the function to look for different syllable structures (the trick was to make it look for two short words rather than one long), bypassing the need for xXx words. This is just one instance of the kind of issue you might run into while coding, and a good example of my experience with writing my own function.
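The X placeholder and the two-short-words trick can be combined in a sketch like the following. It is one possible reading of the idea, with hypothetical names (fill_slot, VOCAB) and invented stress tags, and it is not lifted from the exam code.

```python
import random

# Hypothetical vocabulary with no single word of the xXx variety.
VOCAB = {
    "a": "x", "the": "x", "rat": "X",
    "achoo": "xX", "subdue": "xX",
    "carrot": "Xx", "parrot": "Xx",
}


def fill_slot(pattern, vocab, rng):
    """Fill one slot such as "xXx" with a word, two words, or a placeholder."""
    # Best case: a single word with exactly this stress pattern.
    exact = [w for w, s in vocab.items() if s == pattern]
    if exact:
        return rng.choice(exact)
    # Otherwise try every way of cutting the slot into two shorter pieces.
    for cut in range(1, len(pattern)):
        left = [w for w, s in vocab.items() if s == pattern[:cut]]
        right = [w for w, s in vocab.items() if s == pattern[cut:]]
        if left and right:
            return rng.choice(left) + " " + rng.choice(right)
    # Nothing fits: print the pattern itself, x for unstressed, X for stressed.
    return pattern


print(fill_slot("xXx", VOCAB, random.Random()))  # e.g. "the carrot"
print(fill_slot("xXx", {}, random.Random()))     # prints "xXx"
```

With this toy vocabulary, every xXx slot ends up as "a" or "the" followed by "carrot" or "parrot", which may go some way towards explaining the vegetables in the poems below.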
With the function finally finished, I just want to share some of its work with you. You can follow this link to try out the function yourself, but let me show you a few examples of the kinds of poems it can make. Apart from couplets, the function can write beautiful Shakespearean verses:
forbade conflate finance finance jubee!
hot carrot front-gate front-gate parrot plea
was where unpacked allot unpacked the snacked
the compact decent carrot carrot stacked
…somewhat disturbing limericks:
extract musketeer concentrate
aware contemplate laminate
a rat circumvent
dessert of lament
not carrot content here inflate
…high-tempo iambic octameters (these can be sung to the tune of Gilbert & Sullivan’s Major-General’s Song):
demand enhance cannot of carrot steer if true delayed portrayed
jubee! conflate descent instate subdue parade jubee! instate
…and deceptively simple haiku:
aware and profound
contemplate quacked au contraire
Turns out my function really likes carrots!
So what good does all of this do for linguistics as a science? None. But as someone who loves playing with language, it’s been fun doing so in an indirect way, teaching a computer to solve simple language ‘puzzles’ like this.
Besides, I’m also fascinated by the way humans tend to react to these poems. There is demonstrably no specific meaning encoded into these poems, and yet the human mind insists on trying to find it anyway. Our pattern-seeking brains can accept incorrect conjugations and nonsensical scenarios if it’s in service of filling out the blanks and getting some kind – any kind – of meaning from a text. A friend of mine used the function to generate the haiku: “young frequent enough, counteract robot contract, millionaire this clicked,” and interpreted from it an entire story about the career struggles of a poor young robot. This isn’t a function of my code, of course, but a function of the human mind. I’m just excited that my code manages to bring out that experience!
Professionally, there are linguists who use programming to analyse large collections of data, or who make functions that can recognise spoken language – for example, computational linguists are responsible for the voice recognition software that lets you use voice commands on your phone.
Professional programmers have also been able to make much cooler versions of the kind of program I’ve made here. For example, take Hafez, a program created in 2016 (Ghazvininejad et al. 2016). Hafez can write a poem about any topic you give it, and these poems will also mostly be syntactically coherent – and, like mine, they fit into particular poetic meters. This is just one example of how the field of teaching language to machines is developing, including language ‘games’ such as poetry, and it’s something I think any linguist would benefit from dipping their toes into – if not for the tools it lets you create, then just to experience a new direction from which to approach language.
Ghazvininejad, Marjan, Xing Shi, Yejin Choi & Kevin Knight. 2016. Generating Topical Poetry. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1183–1191.
Gustav Styrbjørn Johannessen is an MA student of Linguistics at Aarhus University. Between this project and his participation in Lingoslam 2022, this semester has been a particularly poetic one for him – not that he’s complaining! His professional long-term goal is cracking the code to showing the general public just how exciting language and linguistics can be.
2 thoughts on “Rhyme and Reason”
You might be interested in a package called “pronouncing”, which apparently could save you a lot of tagging in the future: https://pypi.org/project/pronouncing/#description
Thank you for the suggestion – that looks very helpful!
In my exam paper I argued that I couldn’t just use an existing library because I needed my function’s vocabulary to have syllable count and rhymes encoded as well as the words themselves. But now that I’ve received my grade for the course, it’s probably safe to admit that I stuck with the manual tagging mostly just because, as someone new to programming, doing it in a more hands-on manner seemed more fun. If I develop this any further, I’ll definitely try to incorporate Pronouncing to save time.