How AI is Transforming Music Creation in Web3

If the viral Drake deep-fake brought you to this article, welcome to the fascinating (and admittedly spooky) side of how the Web3 music world is embracing artificial intelligence.

Jun 6, 2023 at 5:25 p.m. UTC
Updated Jul 24, 2023 at 9:28 p.m. UTC

In April 2023, Warner Music Group chief digital officer and executive vice president of business development Oana Ruxandra told CoinDesk’s The Hash that she expects music tools driven by artificial intelligence (AI) to “open up the world like we haven’t before,” inspiring “new forms of creativity and sub-genres” across the music and entertainment industries.

While Ruxandra’s outlook is optimistic, she also acknowledged the concerns of many musicians: “We have to be very vigilant,” she said, noting the importance of protecting the creativity and rights of artists. Just days before Ruxandra’s appearance on The Hash, an AI-generated deep-fake track titled “Heart On My Sleeve” gained traction by mimicking the voices of Drake and the Weeknd – even though neither artist had participated in its creation. Instead, the song’s creators trained an artificial intelligence model on the artists’ music, angering their label owner, Universal Music Group.

Other musicians have been more welcoming of the new technology. Less than a week later, electropop musician Grimes invited her fans to create their own AI-dubbed songs using her voice, offering to split royalties 50/50 – one creative solution to the AI deep-fake conundrum.

Intellectual property challenges aside, there’s no doubt that AI music tools can place new forms of expression at an artist’s fingertips. AI can even enhance music production by filling in technical or intellectual gaps in an artist’s abilities, helping them bring ambitious concepts to life in a matter of clicks. These tools can also perform sound engineering tasks more efficiently, lowering barriers and shortening the time it takes to release music.

As we look toward Web3, companies and artists are taking AI even further by pairing music with immersive, interactive and user-generated experiences in the metaverse and beyond.

AI music tools in Web3

A number of crypto-native musicians and platforms have already found creative ways to integrate AI tools into their practice.

Take VNCCII, for instance, the metaverse-first alter ego of Sydney-based female producer Samantha Tauber. Using Unreal Engine, the industry-leading real-time 3D creation tool, Tauber dons her avatar to stream live interviews from the metaverse, in addition to performing in virtual concerts and shows. Like any set or costume change, the digital component of VNCCII’s artistic identity is expanding the borders of her artistry.

Web3 music company PIXELNYX combines augmented reality (AR) experiences with metaverse gaming and is focused on helping artists build memorable experiences for fans. Co-founded by electronic music producer Deadmau5, who is known for sending fans on quests through The Sandbox and hosting shows in Decentraland, PIXELNYX aims to evolve traditional notions of fandom through AI, Web3 and user-generated content (UGC).

In April, PIXELNYX released Korus, a tool that allows users to create AI-powered music companions using officially licensed artist content.

When used in this spirit, AI music tools can aid, augment or enhance an artist’s creative style. While the tools are not yet good enough to replace artists, they are impressive and constantly “learning” through continued human interaction. Replacing musicians with AI has never been a popular take, as proven by the pushback Spotify received after testing its own version of artificial music curation. Yet despite the controversy surrounding AI, today’s musical artists may be able to benefit from AI-assisted music production in ways that respect the craft.

Ideation and collaboration

WarpSound, an adaptive AI music platform, has found several ways to integrate blockchain-based collectibles and digital avatars into its business offerings. The company, which produces music content, non-fungible tokens (NFTs) and social experiences, will soon release a software API that composes original music note-by-note in a range of styles.

Founder and CEO Chris McGarry, an entrepreneur and media executive who previously served as the music lead at Facebook’s virtual reality unit Oculus, says WarpSound’s tools help artists find new inspiration and source material that invigorates their creative processes. The company is a recipient of The Sandbox’s Game Maker Fund, which supports game designers in The Sandbox metaverse, and plans to build a home venue inside the platform where artists can experiment with generative music.

WarpSound also worked with Mastercard as the AI music partner for its Artist Accelerator program, where McGarry says he’s observed new benefits to the creative process.

“Last week, I was in a set of virtual studio sessions with artists participating in the program,” said McGarry. “We were working with our generative AI music interface to present a set of musical ideas, then having the artist shape those and iterate until they landed on something, the essence of which resonated with them, that they were motivated to work with.”

WarpSound has also partnered with the Tribeca Film Festival and YouTube to create interactive and playful music experiences between artists and audiences.

Composition and arrangement

If your music project is less about live performance and more about the finished product – maybe you’re composing original music for a podcast, metaverse event, YouTube channel, Web3 video game or educational content – you can use AI to speed up the process of composition and arrangement. Of course, the world’s most talented virtuosos can likely play musical scales in their sleep, but with so many elements to sound and video production, it’s becoming standard practice to use AI to add quick scales, arpeggios, runs and harmonies to original music.

Tools like Riffusion allow users to provide text prompts that are transformed into music. Soundful is another AI platform that allows people to generate and download royalty-free tracks.
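For the technically curious, the workflow behind a tool like Riffusion can be sketched in a few lines of Python. The open-source Riffusion model is a Stable Diffusion fine-tune that renders a text prompt as a spectrogram image, which is then inverted back into audio. The snippet below is a minimal sketch, not the official Riffusion pipeline; the model checkpoint, prompt and hardware are assumptions, and the final spectrogram-to-audio step (typically a Griffin-Lim reconstruction, as in the open-source Riffusion code) is noted but not implemented.

```python
# A minimal sketch of Riffusion-style text-to-music generation with the
# Hugging Face diffusers library. Assumptions: the public
# "riffusion/riffusion-model-v1" checkpoint and a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "riffusion/riffusion-model-v1",
    torch_dtype=torch.float16,
).to("cuda")

# The prompt describes the music you want, just like an image prompt.
image = pipe("lo-fi hip-hop beat with warm vinyl crackle").images[0]

# The output is a spectrogram image; turning it into a playable clip is a
# separate step (e.g., Griffin-Lim phase reconstruction in the Riffusion repo).
image.save("spectrogram.png")
```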

If you want to go one step further and add lyrics, the popular do-it-all tool ChatGPT can write a two-verse song with a pre-chorus, chorus, bridge and outro in just under 30 seconds with minimal prompting. Of course, the lyrics may be a touch simplistic or cheesy — but aren’t some of the best songs?
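As a rough illustration, here is what that minimal prompting might look like through OpenAI’s Python SDK (v1.x) rather than the chat interface. The model name and prompt below are illustrative assumptions, not a recommendation:

```python
# A minimal sketch of generating song lyrics with OpenAI's Python SDK.
# Assumes the OPENAI_API_KEY environment variable is set.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative model choice
    messages=[{
        "role": "user",
        "content": "Write a two-verse song with a pre-chorus, chorus, "
                   "bridge and outro about leaving a small town.",
    }],
)
print(response.choices[0].message.content)
```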

In most cases, songs generated by AI can be reused without paying licensing fees, since works created by machines rather than humans are not protected under U.S. intellectual property law. Most platforms, however, charge a subscription fee.

These sounds can then be minted as NFTs and sold on marketplaces like OpenSea. Platforms like Royal.io also allow artists to join the site and sell their songs as fractionalized NFTs that pay royalties out to fans.
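Minting itself boils down to a standard ERC-721 transaction. Below is a minimal sketch using the web3.py library, assuming you have already deployed a contract exposing a mint(address, string) function and pinned the track’s metadata to IPFS; the RPC URL, key, contract address and CID are all placeholders, not real values:

```python
# A minimal sketch of minting an audio NFT with web3.py (v6-style names).
# Every YOUR_* value below is a placeholder to fill in yourself.
from web3 import Web3

# ABI fragment for an assumed mint(address to, string tokenURI) function.
MINT_ABI = [{
    "name": "mint", "type": "function", "stateMutability": "nonpayable",
    "inputs": [{"name": "to", "type": "address"},
               {"name": "tokenURI", "type": "string"}],
    "outputs": [],
}]

w3 = Web3(Web3.HTTPProvider("https://mainnet.infura.io/v3/YOUR_PROJECT_ID"))
acct = w3.eth.account.from_key("YOUR_PRIVATE_KEY")
nft = w3.eth.contract(address="0xYOUR_CONTRACT_ADDRESS", abi=MINT_ABI)

# Build, sign and broadcast the mint transaction; the token URI points at
# IPFS metadata describing (and linking to) the audio file.
tx = nft.functions.mint(acct.address, "ipfs://YOUR_METADATA_CID").build_transaction({
    "from": acct.address,
    "nonce": w3.eth.get_transaction_count(acct.address),
})
signed = acct.sign_transaction(tx)
tx_hash = w3.eth.send_raw_transaction(signed.rawTransaction)
print(tx_hash.hex())
```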

The limits of AI music production

You may have already heard that musical AI tools aren’t yet that sophisticated, especially when compared to the latest AI text-to-image generators (which have already been used to spin out whole comic book collections) and OpenAI’s chatbot, ChatGPT (which reportedly passed the bar exam).

Audio production requires more computing power than static text and image outputs, and is therefore lagging behind, according to experts in the field. Alexander Flores, head of tech and strategy at the music research network Water & Music, says that tech innovation generally travels from the least data-intensive formats to the richest – which helps explain why chatbots have developed faster than AI audio and video generation.

In one online discussion thread, a Reddit user pointed out these limitations, emphasizing that while a writer can proofread and edit an AI chatbot’s outputs in seconds, it takes several minutes to listen to a song and sometimes hours to edit it. Models are also slower to learn from audio datasets, since the files that feed them rarely carry comprehensive text descriptions of attributes like genre, tempo, key and instrumentation. Text- and image-based AIs, meanwhile, can swiftly trawl through thousands of labeled words and visuals.

“How long it takes to consume the content matters a lot,” said Flores. “With a song, you're locked in for three minutes. You can't speed it up because then you're not experiencing the actual song as it was written.”

In addition, images are static, while songs are more dynamic: “Audio is just much higher dimensional,” said Stefan Lattner, managing researcher at Sony CSL, a creative technology lab, in a panel at Water & Music’s inaugural Wavelengths Summit. “While images have a fixed number of pixels, in audio you have a variable number of seconds that you want to generate.”
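A back-of-the-envelope comparison makes the point concrete. The formats below – a 512x512 RGB image versus three minutes of CD-quality stereo audio – are illustrative assumptions, not figures from Lattner:

```python
# Rough size comparison of one image versus one song, as raw numbers.
image_values = 512 * 512 * 3        # ~786,000 values for a 512x512 RGB image
audio_values = 180 * 44_100 * 2     # ~15.9 million samples for a 3-minute stereo song

print(audio_values / image_values)  # the song is roughly 20x larger, and grows with length
```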

Nonetheless, Water & Music calls creative AI the most disruptive technology for the music business since Napster, the peer-to-peer file-sharing application that made music distribution virtually free, as well as borderless and permissionless – a concept familiar to crypto-natives.

Edited by Rosie Perper.

Disclosure


CoinDesk is an award-winning media outlet that covers the cryptocurrency industry. Its journalists abide by a strict set of editorial policies. In November 2023, CoinDesk was acquired by the Bullish group, owner of Bullish, a regulated, digital assets exchange. The Bullish group is majority-owned by Block.one; both companies have interests in a variety of blockchain and digital asset businesses and significant holdings of digital assets, including bitcoin. CoinDesk operates as an independent subsidiary with an editorial committee to protect journalistic independence. CoinDesk employees, including journalists, may receive options in the Bullish group as part of their compensation.

Megan DeMatteo

Megan DeMatteo is a service journalist currently based in New York City. In 2020, she helped launch CNBC Select, and she now writes for publications like CoinDesk, NextAdvisor, MoneyMade, and others. She is a contributing writer for CoinDesk’s Crypto for Advisors newsletter.

