Tuesday, 5 October 2010

Poetic Machine Translation

Posted by Dmitriy Genzel, Software Engineer

Once upon a midnight dreary, long we pondered weak and weary,
Over many a quaint and curious volume of translation lore.
When our system does translation, lifeless prose is its creation;
Making verse with inspiration no machine has done before.
So we want to boldly go where no machine has gone before.
Quoth now Google, "Nevermore!"

Robert Frost once said, “Poetry is what gets lost in translation.” Translating poetry is a very hard task even for humans, and is clearly beyond the capability of current machine translation systems. We therefore, out of academic curiosity, set about testing the limits of translating poetry and were pleasantly surprised with the results!

We are going to present a paper on poetry translation at the EMNLP conference this year. In this paper, we investigate the purely technical challenges around generating translations with fixed rhyme and meter schemes.

The value of preserving meter and rhyme in poetic translation has been highly debated. Vladimir Nabokov famously claimed that, since it is impossible to preserve both the meaning and the form of a poem in translation, one must abandon the form altogether. Another authority (and for us computer scientists, perhaps the more familiar one), Douglas Hofstadter, argues that preserving the form is very important to maintaining the feeling and the sound of a poem. It is in this spirit that we decided to experiment with translating not only poetic meaning, but form as well.

A Statistical Machine Translation system, like Google Translate, typically performs translations by searching through a multitude of possible translations, guided by a statistical model of accuracy. To translate poetry, however, we considered not only translation accuracy but also meter and rhyme schemes. Our paper describes in more detail how we altered our translation model, but in general we chose to sacrifice a little of the translation’s accuracy to get the poetic form right.
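To make that trade-off concrete, here is a minimal Python sketch. It is not the system from our paper: it simply reranks a fixed list of candidate translations, whereas the real work changes the translation model itself. The stress lexicon, the candidate sentences, their model scores, and the names `stress_pattern`, `meter_penalty`, and `rerank` are all hypothetical, chosen only for illustration. Rhyme is left out; it could be handled with a similar penalty term.

```python
# Hypothetical stress lexicon: "1" = stressed syllable, "0" = unstressed.
# A real system would need a pronunciation dictionary; this is a toy.
STRESS = {
    "the": "0", "cat": "1", "was": "0", "has": "0", "sat": "1",
    "sitting": "10", "on": "0", "upon": "01", "mat": "1",
}

# Iambic tetrameter: four feet of unstressed-stressed.
IAMBIC_TETRAMETER = "01" * 4


def stress_pattern(line):
    """Concatenate the stress marks of every word in the line."""
    return "".join(STRESS[word] for word in line.lower().split())


def meter_penalty(pattern, target):
    """Count mismatched syllables plus any difference in syllable count."""
    mismatches = sum(a != b for a, b in zip(pattern, target))
    return mismatches + abs(len(pattern) - len(target))


def rerank(candidates, target, weight=1.0):
    """Pick the candidate with the best model score minus weighted meter penalty.

    Each candidate is a (text, model_score) pair, where a higher
    model_score stands in for a more accurate translation.
    """
    def combined(candidate):
        text, model_score = candidate
        return model_score - weight * meter_penalty(stress_pattern(text), target)

    return max(candidates, key=combined)


# Made-up candidates with made-up accuracy scores (log-probability style).
CANDIDATES = [
    ("the cat sat on the mat", -0.8),
    ("the cat was sitting on the mat", -1.0),
    ("the cat has sat upon the mat", -1.5),
]

best_text, best_score = rerank(CANDIDATES, IAMBIC_TETRAMETER, weight=1.0)
print(best_text)
```

With `weight=0.0` the sketch ignores form and returns the candidate with the highest model score; raising the weight lets a slightly less accurate candidate win because it scans correctly, which is the compromise described above.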

As a pleasant side-effect, the system is also able to translate anything into poetry, allowing us to specify the genre (say, limericks or haikus), or letting the system pick the one it thinks fits best. At the moment, the system is too slow to be made publicly accessible, but we thought we’d share some excerpts:

A stanza from Essai monographique sur les Dianthus des Pyrénées françaises by Edouard Timbal-Lagrave and Eugène Bucquoy, translated to English as a pair of couplets in iambic tetrameter:
So here's the dear child under land,
will not reflect her beauty and
besides the Great, no alter dark,
the pure ray, fronts elected mark.

Voltaire’s La Henriade, translated as a couplet in dactylic tetrameter:
These words compassion forced the small to lift her head
gently and tell him to whisper: “I'm not dead.”

Le Miroir des simples âmes, an Old French poem by Marguerite Porete, translated to Modern French by M. de Corberon, and then to haiku by us:
“Well, gentle soul”, said
Love, “say whatever you please,
for I want to hear.”

More examples and technical details can be found in our research paper (as well as clever commentary).