Music Score Representation of Poetry Reading: Can Prosody Be Studied by Analyzing the Author’s Voice?

I propose to study the prosody of modern poems by analyzing recordings of the authors reciting their own poems and rendering the result in a form similar to a music score. An example is shown in Figure 1. In the following, such a notation will simply be called the music score of a poetry reading.

Figure 1.

First I will review developments in three partly overlapping areas related to this study: the relationship of poetry recitation to prosody, studies of recorded voice, and the prosody of free verse. Then I will describe the computer-assisted method for producing the music score by analyzing recorded voice. Some influential papers have asserted that prosody cannot be studied by analyzing the voice. I will discuss their arguments and show some results indicating that, for many modern poets, analyzing their voice can be an effective way of studying their prosody.

Until about half a century ago, linguists, prosodists, psychologists, and sometimes poets actively participated in prosodic studies using human reading of poems. In the mid-1910s, Patterson developed a device called a ‘voice recorder’ that seems to have been a voice amplitude recorder, and poet Amy Lowell used this equipment to measure, in effect, time between stressed syllables in readings of her own and others’ poems. In 1956, Seymour Chatman analyzed recordings of eight people, including the poet himself, reading Frost’s ‘Mowing’. However, those studies were severely attacked by critics W. K. Wimsatt Jr. and Monroe C. Beardsley (1959). They believed that there is an ‘abstract metrical pattern’ in a poem that cannot be studied by analyzing a performance or a set of performances: ‘There are many performances of the same poem—differing among themselves in many ways. A performance is an event, but the poem itself, if there is any poem, must be some kind of enduring object’ (587). Shortly afterwards, linguist Roman Jakobson (1960) concurred, stating that ‘meter, or in more explicit terms, verse design, underlies any single line or, in logical terminology, any single verse instance’ and ‘A variation of verse instances within a given poem must be strictly distinguished from the variable delivery instances’ (365–66). After this, according to Paul Kiparsky (2010), ‘Generative metrics has turned Jakobson’s distinction into an exclusion: Delivery (recitation and text-setting) is not considered to be in the province of metrics’ (slide 6). Among literary prosodists, too, not many have ventured into serious analysis of poets’ recorded voice.

In the 1970s, however, Reuven Tsur started to radically change the notion of ‘prosody’. He has formulated an extensive ‘Cognitive Poetics’, encompassing phonology, prosody, and philosophy. More recently, Paul Kiparsky has proposed to expand metrics into ‘broad metrics’, which includes ‘delivery’ (2010). Among literary prosodists, Harvey Gross (1964) has used T. S. Eliot’s recordings for the analysis of his prosody. Poet and scholar Charles Bernstein edited the important book Close Listening (1998a), wherein Bernstein stressed the importance of performance (vocal and print) in prosody. More recently, Marjorie Perloff and Craig Dworkin edited another important book, The Sound of Poetry / The Poetry of Sound (2009).

In the related area of sound studies, Jason Camlot (2003), Derek Furr (2010), and Matthew Rubery (2011) have been studying audiobooks and other collections of recordings. Recently, an interdisciplinary area of ‘voice studies’ seems to be forming.

Although Wimsatt and Beardsley expressed hope that traditional metrical analysis would be applicable to modernist and later poems, it turned out to be difficult. Several prosodists, including Derek Attridge (1995) and Alan Holder (1995), have been developing prosody for non-metrical verse. Literary critics Helen Vendler (2007) and Marjorie Perloff (2009) have also analyzed prosody in several modernist and, in the case of Perloff, postmodernist poems with non-metrical rhythms, sometimes listening to the poet’s recorded voice.

I hope the method described in the following will be usable in the above fields of study. First, the recorded voice is analyzed using a combination of two speech-analysis software packages, Wavesurfer and PRAAT. The reason for using two rather similar packages is to take advantage of the user-friendly interface of Wavesurfer and the superior pitch analysis and scripting capabilities of PRAAT. The human operator first segments the recorded voice data into syllables and silent/breathing segments, and enters the segment boundaries into Wavesurfer. The data are then imported into PRAAT, where the operator enters the stress pattern (whether each syllable is heard as stressed or not) and the number of the line of the poem to which each syllable belongs. After this the analysis is almost automatic, and the following data are obtained from PRAAT: the starting time and duration of each syllable or silence, and the vocal power (loudness) over short time segments. A Perl program was written to convert these data into an input file for LilyPond, a music-engraving program. The fundamental frequency f0, averaged over the duration of the syllable, is taken as the pitch, which determines the position of the note. The duration is converted to the note length (assuming MM=120). The long-term average of the power is converted into dynamics. A stressed syllable can be represented by an accent mark next to the note, by a colored note, or both. The resultant music score of poetry recitation will be shown, synchronized with the voice, at the conference.
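The conversion step described above can be illustrated with a minimal Python sketch. It is not the Perl program used in this study: the function names, the rounding to the nearest equal-tempered semitone (A4 = 440 Hz), the reading of MM=120 as 120 quarter notes per minute, the restriction to undotted note values, and the 6 dB thresholds for dynamics are all assumptions made for illustration.

```python
import math

# LilyPond's default (Dutch) note names for the 12 semitones above C, sharps only.
NOTE_NAMES = ["c", "cis", "d", "dis", "e", "f", "fis", "g", "gis", "a", "ais", "b"]


def hz_to_lilypond_pitch(f0_hz, a4_hz=440.0):
    """Nearest equal-tempered pitch, in LilyPond absolute octave notation."""
    midi = round(69 + 12 * math.log2(f0_hz / a4_hz))  # MIDI note number, A4 = 69
    octave = midi // 12 - 1                           # MIDI 60 is C4 (middle C)
    marks = octave - 3                                # LilyPond: c is C3, c' is C4
    suffix = "'" * marks if marks >= 0 else "," * (-marks)
    return NOTE_NAMES[midi % 12] + suffix


def seconds_to_lilypond_duration(seconds, bpm=120):
    """Nearest simple note value (1 = whole note ... 32 = thirty-second note)."""
    whole = 4 * 60.0 / bpm  # assumes the metronome mark counts quarter notes
    return min((1, 2, 4, 8, 16, 32),
               key=lambda n: abs(math.log2(seconds / (whole / n))))


def db_to_dynamic(level_db, mean_db):
    """Hypothetical thresholds: 6 dB steps around the recording's mean level."""
    diff = level_db - mean_db
    if diff < -6:
        return "\\p"
    if diff < 0:
        return "\\mp"
    if diff < 6:
        return "\\mf"
    return "\\f"


def syllable_to_note(f0_hz, seconds, stressed):
    """One syllable as a LilyPond note; '->' is LilyPond's accent shorthand."""
    note = hz_to_lilypond_pitch(f0_hz) + str(seconds_to_lilypond_duration(seconds))
    return note + "->" if stressed else note


def silence_to_rest(seconds):
    """A silent or breathing segment as a LilyPond rest."""
    return "r" + str(seconds_to_lilypond_duration(seconds))


# Made-up values for three syllables and a pause, not measured data.
print(" ".join([
    syllable_to_note(220.0, 0.25, False),
    syllable_to_note(247.0, 0.50, True),
    silence_to_rest(0.50),
    syllable_to_note(196.0, 1.00, False),
]))
```

Quantizing on a logarithmic scale treats a syllable that is twice too long and one that is half too long as equally distant from a note value, which seems a reasonable choice for durations but is only one of several possibilities.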

The music score obtained as above is, of course, only an approximation of the original recorded voice. It is known that ‘pitch’ is a psychological quantity and does not always correspond to f0. Also, the f0 of a speaking or reciting voice usually changes continuously, so its average over a time segment is not an accurate representation. The perception of stress on a particular syllable is also known to be a psychological phenomenon that is difficult to measure physically; its presence or absence is sometimes quite difficult to judge and can be controversial. However, by using music score notation to represent the human voice reciting poetry, much relevant information (not only the stress pattern) is conveniently concentrated in a small space, facilitating discussion.
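The loss of information caused by averaging f0 can be made concrete with a small Python sketch. The two contours below are invented for illustration, and the convention that a value of 0 marks an unvoiced frame is an assumption of this sketch, not a description of PRAAT output.

```python
import math


def voiced_frames(contour_hz):
    """Keep only voiced frames; 0 is assumed here to mark an unvoiced frame."""
    return [f for f in contour_hz if f > 0]


def mean_f0(contour_hz):
    """Average f0 over the voiced part of a syllable, in Hz."""
    voiced = voiced_frames(contour_hz)
    return sum(voiced) / len(voiced)


def span_in_semitones(contour_hz):
    """Distance between the lowest and highest voiced f0, in semitones."""
    voiced = voiced_frames(contour_hz)
    return 12 * math.log2(max(voiced) / min(voiced))


# Two invented syllables: a steep rising glide and a level tone.
rising = [150.0, 175.0, 200.0, 225.0, 250.0]
level = [200.0, 200.0, 200.0, 200.0, 200.0]

for name, contour in (("rising", rising), ("level", level)):
    print(name, mean_f0(contour), round(span_in_semitones(contour), 2))
```

Both syllables have the same mean f0 and would therefore be placed on the same note, although the first glides through more than eight semitones and the second does not move at all.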

The above-mentioned arguments by Wimsatt and Beardsley (1959) and by Jakobson (1960) rely on two assumptions. First, they assume that poetry reading is a performance, and a performance can change arbitrarily and cannot be trusted, even if the performer is the author. Second, they assume that for a poem there is one correct ‘absolute meter’ or ‘verse instance’, which is, in Wimsatt and Beardsley’s terms, ‘something that can be read and studied with the help of grammars and dictionaries and other linguistic guides . . . The meter is something which for the most part inheres in language precisely at that level of linguistic organization which grammars and dictionaries and elementary rhetoric can successfully cope with’ (588). So it can and should be elucidated easily, without going so far as to study any ‘performance’.

To see if the first assumption holds, I have studied stress patterns in multiple recordings of the same poem by poets such as W. B. Yeats, Robert Frost, William Carlos Williams, Ezra Pound, T. S. Eliot, Kenneth Rexroth, W. H. Auden, Allen Ginsberg, and Galway Kinnell. I also analyzed the stress patterns of non-authors reading works of those poets and compared them with the author’s reading. The results indicate that the author’s stress pattern is quite stable over the years (though there are exceptions), while non-author readers (including poets) tend to read the poem with stress patterns quite different from the author’s and from each other’s, especially in modern poems that do not have classical metrical patterns. That stands to reason, as the stress pattern is perhaps one of the important considerations in composing a poem, even when (or perhaps all the more when) the poem does not have classical meter. In other words, for many authors, the stress pattern may be too important to change arbitrarily (though it may sometimes change when the author has a change of heart). The non-author reader, on the other hand, often fails to guess the author’s intended rhythm (or to find another rhythm) in a modern poem. In short, the first assumption does not seem to hold, at least for many poets reading their own poems.

The second point, the assumption of the presence of one ‘absolute meter’ per poem, is difficult to apply to modern poems such as free verse and syllabic verse. For them, modern prosody and sound/voice studies will be able to contribute much, although as Bernstein notes, quoting Peter Quartermain, ‘The poet’s voicing of a poem should not be allowed to eliminate ambiguous voicings in the text; nor should the author’s performance of a poem be absolutely privileged over that of other readers and performers’ (11).

The method of analyzing the voice of the poet and of other readers presented in this paper will, I hope, contribute to those studies. Relevant software developed in this study (Perl programs and PRAAT scripts) will be made available to researchers in the near future.

References

  1. Attridge, D. (1995). Poetic Rhythm: An Introduction. Cambridge University Press, New York.
  2. Bernstein, C. (ed.). (1998a). Close Listening: Poetry and the Performed Word. Oxford University Press, Oxford.
  3. Bernstein, C. (1998b). Introduction. In Bernstein, C. (ed.), Close Listening: Poetry and the Performed Word. Oxford: Oxford University Press, pp. 3–26.
  4. Bernstein, C. (1999). My Way: Speeches and Poems. University of Chicago Press, Chicago.
  5. Boersma, P. and Weenink, D. (n.d.). PRAAT: Doing Phonetics by Computer,
  6. Camlot, J. (2003). Early Talking Books: Spoken Recordings and Recitation Anthologies, 1880–1920. Book History, 6: 147–73.
  7. Camlot, J. (2012). The Sound of Canadian Modernisms: The Sir George Williams University Poetry Series, 1966–74. Journal of Canadian Studies / Revue d’études canadiennes, 46(3): 28–59.
  8. Chatman, S. (1956). Robert Frost’s ‘Mowing’: An Inquiry into Prosodic Structure. Kenyon Review, 18(3): 421–38.
  9. Clement, T. E. (2008). ‘A Thing Not Beginning and Not Ending’: Using Digital Tools to Distant-Read Gertrude Stein’s The Making of Americans. Literary and Linguistic Computing, 23(3): 361–81.
  10. Clement, T., et al. (2013). Sounding for Meaning: Using Theories of Knowledge Representation to Analyze Aural Patterns in Texts. Digital Humanities Quarterly, 7(1),
  11. Furr, D. (2010). Recorded Poetry and Poetic Reception from Edna Millay to the Circle of Robert Lowell. Palgrave, New York.
  12. Gross, H. (1964). Sound and Form in Modern Poetry: A Study of Prosody from Thomas Hardy to Robert Lowell. University of Michigan Press, Ann Arbor.
  13. Holder, A. (1995). Rethinking Meter: A New Approach to the Verse Line. Yale University Press, Cranbury, NJ.
  14. Jakobson, R. (1960). Closing Statement: Linguistics and Poetics. In Sebeok, T. A. (ed.), Style in Language. New York: Wiley, pp. 350–77.
  15. Kiparsky, P. (2010). Meter and Performance. Proc. LSA Metrics Symposium, 8 January,
  16. Lowell, A. (1920). Some Musical Analogies in Modern Poetry. Musical Quarterly, 6(1): 127–57.
  17. Moretti, F. (2000). Conjectures on World Literature. New Left Review, 1 (January–February): 54–68.
  18. Nienhuys, H.-W. and Nieuwenhuizen, J. (n.d.). LilyPond: Music Notation for Everyone,
  19. Perloff, M. (2009). Introduction: The Sound of Poetry. In Perloff, M. and Dworkin, C. (eds), The Sound of Poetry / The Poetry of Sound. Chicago: University of Chicago Press, pp. 1–9.
  20. Perloff, M. and Dworkin, C. (eds). (2009). The Sound of Poetry / The Poetry of Sound. University of Chicago Press, Chicago.
  21. Quartermain, P. (1998). Sound Reading. In Bernstein, C. (ed.), Close Listening: Poetry and the Performed Word. Oxford: Oxford University Press, pp. 217–30.
  22. Rubery, M. (ed. and intro.). (2011). Audiobooks, Literature, and Sound Studies. Routledge, New York.
  23. Sjölander, K. and Beskow, J. (2000). Wavesurfer—An Open Source Speech Tool. Proceedings of ICSLP 2000, 6th International Conference on Spoken Language Processing, Beijing.
  24. Tsur, R. (1977). A Perception-Oriented Theory of Metre. Porter Israeli Institute for Poetics and Semiotics, Tel Aviv.
  25. Tsur, R. (1992a). Toward a Theory of Cognitive Poetics. North-Holland, Amsterdam.
  26. Tsur, R. (1992b). What Makes Sound Patterns Expressive? The Poetic Mode of Speech Perception. Duke University Press, Durham, NC.
  27. Tsur, R. (2012a). Playing by Ear and the Tip of the Tongue: Precategorial Information in Poetry. John Benjamins, Amsterdam.
  28. Tsur, R. (2012b). Poetic Rhythm: Structure and Performance, an Empirical Study in Cognitive Poetics. Sussex Academic Press, Brighton.
  29. Vendler, H. (2007). Our Secret Discipline: Yeats and Lyric Form. Oxford University Press, Oxford.
  30. Wimsatt, W. K., Jr., and Beardsley, M. C. (1959). The Concept of Meter: An Exercise in Abstraction. PMLA, 74(5): 585–98.
Takeo Yamamoto, Graduate School of Literature and Sociology, The University of Tokyo, Japan