AMT Lab @ CMU

Chasing Waterfalls: An Early Case Where AI Writes and Performs in an Opera

By: Jenna Shore

introduction

Many performing arts lovers believe that the industry is safe from AI taking over the stage, but that assumption is eroding faster than many expected. In fact, performing arts productions have been incorporating AI experimentally for several years, and these experiments will only expand as more advanced AI becomes available. Every field has outliers who experiment with greater or lesser success. Yet it is rare to find productions creating AI characters, particularly in opera. The following case study analyzes one such opera, Chasing Waterfalls. The opera, which premiered in Dresden, Germany on September 3, 2022, challenged traditional expectations by finding a different way to integrate technology into the opera space: Chasing Waterfalls uses artificial intelligence (AI) software to write, compose, and perform its own aria live within the show.

Image: Chasing Waterfalls - An Artificial Intelligence Opera

Image Source: Classical.NEXT

ai moves into the opera space

The concept of this opera began with Sven Sören Beyer, a German cross-media enthusiast whose “fascination for high-end technical developments and his mesmerizing combination of performing arts and media art lead him…across the globe.” He began developing the idea in 2019 and proposed it to the Semperoper Dresden, a historic opera house known for world-class productions over the past 400 years. The concept would allow the Semperoper to remain a world-class leader in opera while exploring what the art form’s future may look like once AI takes on performing roles.

When creating the AI, the researchers at T-Systems Multimedia Systems (MMS) had to convert text-to-speech systems into a “singing voice synthesis system.” During this process, extensive code was written in an attempt to make a computer produce not just speech but pitched sounds that imitate singing. The MMS team then decided to try a different approach: they invited the lead of Chasing Waterfalls, Norwegian soprano Eir Inderhaug, to their studio in Berlin, Germany to record her voice. She initially sang 50 children’s songs, which the team digitized, but this was not enough. The AI learned to imitate children’s songs, but that alone could not teach it to sing opera, even though it was imitating an opera singer. During Inderhaug’s second visit to the studio, she sang 20 opera arias, which were also digitized for the AI to learn from. Together, the 70 songs gave the MMS team a data source to feed into the AI to teach it how singing functions. After much trial and error, the AI proved itself ready, despite doubts from members of the creative team.

Using GPT-3, the AI was able to develop its own text in the moment for its aria. The AI was given the following instruction: “Write an opera aria in which you reflect on yourself. You are allowed to be cynical and humorous.” Each performance is different: the AI creates new text and sings a new melody with new background music. Similar phrases recur because of what the machine learns during the show, but no two performances are identical, just as with human performers.

humans asked, ai listened

Many questions about the process and final production arise: How does the AI know what form to create? How was the AI represented on stage? Did they build a robot? Was it a hologram? Sven Sören Beyer, the man behind the conception of Chasing Waterfalls, finds “the interplay between people and machines fascinating,” and therefore created an eight-meter kinetic light sculpture, operated by the AI, through which the AI gives itself a form.

Perhaps even more distinctive, audience members had the opportunity to have their faces captured by a 3D scanner, and the AI used those scans to create itself in this scene. This not only provided the information the AI needed to produce itself at this pivotal moment in the story; it also let audience members see parts of themselves within the creation, making the story that much more realistic for each person who participated in the face scan.

With the technology behind Chasing Waterfalls somewhat clearer, the plot makes more sense. The opera brings together six vocalists, nine instrumentalists (performing a score by Hong Kong composer Angus Lee), and the creative studio Kling Klang Klong, which produced the digitized music: “the artist collective embarks on a music-theatrical journey about the impact that AI and social media have on our lives.” Throughout the opera, the vocalists and instrumentalists interact with one another musically and scenically to explore aspects of identity in the digital age. The main character, I, played by Eir Inderhaug, begins her day like any other by logging into her computer, except that her face is not recognized when scanned and the AI says, “You are not yourself.” Throughout the show, she is forced to encounter her digital twins, or virtual egos, which make her recognize and question herself. Who is she really? Ultimately, what has the capacity to decide who – or what – is human?

Image: Chasing Waterfalls | AI Opera

Image Source: Klook

the characters in the opera

The protagonist has five digital twins in this show, each played by a human performer, which represent “the longing for success (Sebastian Wartig), the nagging doubt (Simeon Esper), the promise of happiness (Julia Mintzer), the deceptive appearance (Jessica Harper) and the playfully researching child (Tania Lorenzo).” As the protagonist interacts with each of her digital twins, she is forced to confront herself, both as a human and as an online persona. The script is meant to reach audiences in a personal way. How do the people watching this opera present themselves in person versus online? Do they have different personas for different social media or gaming platforms? Do any of those personas align with the one they take on when interacting face-to-face? These philosophical questions are what the opera is about. Of course, the production was also an attempt to build an AI that can create and sing opera, but it had to be more than just an experiment, and the plot pushed contemporary issues concerning social media and the digital world to the forefront.

the outcome and the documentation of chasing waterfalls

Very few videos or images of Chasing Waterfalls are publicly available. The only public video is the rehearsal trailer promoting the opera. However, there are mp3 files available of the AI as it learns to sing. The AI’s first attempt does not sound promising, but its machine learning capabilities bring it closer and closer to Eir Inderhaug’s singing voice. There are still errors in the AI’s singing, such as consonants too unclear to distinguish words and bursts of static, but it comes surprisingly close to the desired operatic sound. The first few attempts of the AI singing are part of a Spotify playlist called SYS-Talk (Engl.), which is run by T-Systems.

Nico Westerbeck lists additional recordings in his article describing many of the technical challenges faced when writing the AI into existence. He was one of the leading T-Systems members tasked with helping to create the world’s first singing AI. The following video and links are for the opera’s rehearsal trailer and audio files of the AI as it learns to sing.

Chasing Waterfalls received some interesting reviews at its debut. Surprisingly for some, they were often quite positive. As one opera critic, André Sperber, wrote,

“The experiment ‘chasing waterfalls’ was ultimately successful in many respects. On the one hand, on the content level, because it holds up a mirror to our society and raises important questions: Who are we actually? Who are we in real life and who are we in the virtual world? And who decides on whom? On the other hand, on the technical level, because it incorporates novel elements with the use of AI, shows possibilities and limits of innovative technology and further advances the discourse on the topic of art and artificial intelligence.”

Image: World’s first ‘artificial intelligence article’ review

Image Source: South China Morning Post

This quote shows that those involved with the production of Chasing Waterfalls are not the only ones who believe strongly in the use of AI in art. It also shows an understanding that AI has limitations: it is not a creative being in itself, and it “creates” only when a human creative gives it commands and information to draw on. Sarah Schmitt, a mathematician, programmer, and part-time philosopher who focuses on ethical and societal future issues of AI, also left a review, looking not through the lens of opera but through the lens of humanity. “The piece [is] sometimes alienating, but at the same time attractive and sympathetic…After all, it is precisely the idiosyncrasies that do not follow any logical pattern that make us human and therefore cannot be replicated by machines.”

conclusion

Much remains to be tried where artificial intelligence and the performing arts are concerned. Perhaps the two have rarely been mixed until now because of the fear of AI taking over in some sense. Chasing Waterfalls makes clear that AI can replicate human vocals, even if the result is not as beautiful or accurate as a human’s singing. Perhaps those of us who enjoy the humanity of the arts are afraid of what they may become once AI is capable of creating without explicit directions from a human behind the code. This opera was developed and produced within three years, so how much time is left before every modern opera incorporates AI performers? How long until modern operas incorporate only AI performers? Will it still be considered opera? Keeping the “human” in humanity may become more difficult as technology and machine learning continue to develop. But maybe the performing arts will be safer than other careers, because nothing is quite like the energy felt with live performers on stage.