The Truth About Artificial Intelligence and Creativity
Artificial intelligence can help creators be more creative, but even sophisticated AIs are really just an advanced form of copying, says David Z. Morris. This feature is part of CoinDesk's Culture Week.
The past year has been a very successful coming-out party for advanced versions of two types of artificial intelligence. We’ve been blown away by large language models, or LLMs, that can mimic a friendly conversation or a high schooler’s history homework. And we’ve been equally impressed by image generators including Midjourney and Stable Diffusion.
The models are fun toys, and some may become reliable enough to have serious commercial applications. But LLMs and image processors face a major, even defining limit: Both LLMs and their graphical equivalents operate entirely by copying existing work.
That has quickly led to lawsuits against AI firms that have trained their models without paying to properly license training images or other data. Those suits are an ironic follow-up to the wave of intellectual property (IP) lockdowns of the 1990s, which began with tougher controls on music sampling and culminated in 1998’s Digital Millennium Copyright Act (DMCA). Those rules were pushed by the same kind of huge conglomerates that are now creating AIs – and fighting for their right to, at least according to critics, steal whatever they want.
Artists feel rightly threatened by the exploitative drive of the Googles and Microsofts of the world – but they also see promise in AI. Producer and vocalist Holly Herndon is among those who view AI as a potential tool for her creativity rather than strictly a threat to her ability to make a living from it. But that future, friendly to both artists and AI, can only exist under the right regime of IP laws and practices. Early efforts to develop such a framework include experiments with non-fungible tokens (NFTs) and other blockchain tools.
The claim that AI models are effectively doing nothing but copying will surprise many, especially those who’ve enjoyed their often uncannily good output but don’t know how the models actually work.
The output is so impressive because what AIs do is an advanced, nuanced kind of copying. Rather than lifting specific phrases or images, they extract and regurgitate the underlying patterns of human expression. This is why some have called LLM AIs “autocomplete on steroids.”
“At a literal level, these neural nets are trained to do one thing and one thing alone, which is predict the next word,” says Beerud Sheth, founder of Elance (now Upwork) and now CEO of chatbot developer Gupshup. “The miracle is that autocomplete on steroids actually is a pretty good description of a lot of human language … It creates all sorts of amazing things.”
Those amazing results can mask the fundamental simplicity of what’s going on under the hood. While they’re really just juggling numbers and taking probabilistic guesses based on pre-existing work, chatbots and art AIs have shown they can mislead some into believing they’re creative, or even self-aware.
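That “autocomplete on steroids” mechanism can be made concrete with a toy sketch (a hypothetical illustration, vastly simpler than any real LLM): a “model” that has merely memorized which words follow which in a training corpus, and “generates” text by repeatedly emitting the most frequent next word. Every word it produces is lifted directly from its training data.

```python
from collections import defaultdict, Counter

# Tiny stand-in corpus for the copyrighted text a real model trains on.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each other word (a bigram table).
next_words = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_words[prev][nxt] += 1

def generate(seed, length=5):
    """'Generate' text by always picking the most frequent next word."""
    out = [seed]
    for _ in range(length):
        followers = next_words[out[-1]]
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

print(generate("the"))  # every word in the output comes straight from the corpus
```

A real LLM replaces the word-pair table with billions of learned statistical weights and adds probabilistic sampling, but the basic move is the same: predict the next token from patterns extracted from existing human writing.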
Large larceny models
The creators of AIs benefit from that misperception, and some have been subtly fostering it. Even the increasingly standard term “generative AI” implies more originality than these bots actually have. Such overselling helps foster unrealistic perceptions of what the technology is capable of – most importantly, among investors.
That’s not so new: Tech companies from Facebook to Uber spent much of the last decade promising AI would help them scale and reach profitability. Those promises remain largely unfulfilled, and in less than a year the serious limitations of even the new wave of advanced language models have made themselves known.
But perhaps even more useful than misleading Big Tech investors, the mystification of “expressive” AI helps distract from how it often works: via copyright infringement on a massive scale.
“Everything an LLM knows comes from copyrighted expression,” argues intellectual property lawyer Matthew Butterick. “There’s no moment where it becomes a generator of underlying ideas.”
Butterick is pursuing what he says are the first two major lawsuits aimed at enshrining that argument in case law, and he expects similar cases to multiply in coming years. Such litigation could pose a major obstacle to the profitability of some AI applications, which might be forced to pay licensing fees for all of the millions of pieces of human-crafted expression used to train them.
Read more: Megan DeMatteo - How AI Is Changing Artistic Creation and Challenging IP Laws
“When you go down that path of arguing, ‘It’s some data derived from the image in a certain way,’ well, that’s just a copy,” says Butterick. “It’s just a fancy way of figuring out how to store the data and regenerate it later.” Renowned science fiction writer Ted Chiang recently made much the same point in the New Yorker magazine: Chatbots and other AI generators, as he put it, are best understood as “a blurry JPEG of the web.”
Butterick is hoping to make that view of chatbots and image generators part of legal doctrine. Butterick represents a trio of artists in a class action suit filed in January against AI image generators Stability AI, Midjourney and DeviantArt. The suit argues their AI image models were trained using unlicensed images. Getty Images has made similar claims in a subsequent infringement suit. Butterick is also leading a class-action suit against Copilot, a Microsoft-backed effort to use AI to craft computer code – which it accomplishes by nicking human-crafted code, in part from open-source repositories on GitHub.
Copilot in particular, Butterick says, has been caught simply cutting-and-pasting large blocks of code from human developers into its supposedly “generative” output. It also offers its clients a half-million-dollar indemnification against copyright infringement suits for code they deploy, which strongly suggests Microsoft understands that it’s playing with fire.
All of this, Butterick argues, clearly violates well-established copyright standards. One major moment in the development of those standards revolved around an earlier technological innovation: digital sampling tools.
Many important legal precedents in IP stem from the use of samples in hip hop music, a practice that produced staggering creative innovation and masterworks like Public Enemy’s “It Takes a Nation of Millions to Hold Us Back” and the Beastie Boys/Dust Brothers’ “Paul’s Boutique.”
But when sample-based records started making money, record labels took artists to court. By the mid-1990s, they had built an unwieldy and expensive sample-clearance regime that effectively killed the kind of richly layered sample-based tracks that gave the practice its cultural currency, and economic value, in the first place.
One sterling example of what this restrictiveness costs everyone is “The Grey Album,” a circa 2004 mashup of Beatles and Jay-Z songs that launched the career of producer Danger Mouse. Though widely admired (and not THAT hard to find), it has never seen a commercial release, and probably never will. That doesn’t just make it harder for fans to listen – Danger Mouse, Jay-Z and the Beatles are all missing out on revenue, too.
Now, Butterick says, many of the same kinds of large IP holders that fought for these restrictive controls want to have their cake and eat it too.
“Microsoft [is] one of the biggest owners of IP in America. And now they’re like, we want access to all the code, forever, for free. The irony: You could be digging for days.”
Artists in control
“I think we get to redo how IP works,” says music producer and cybertheorist Mat Dryhurst. “Some artists will be more permissive, some more protective, and that is fine. Ultimately it should be up to the artist.”
Dryhurst is part of the creative team behind producer, singer (and Stanford University PhD) Holly Herndon. While a generalized processing AI can rip off any artist’s style for free, Herndon and her team are trying to beat the deepfakers at their own game by creating their own AI copycat, Holly+. Holly+ is an AI filter trained on Herndon’s own voice. It can be used by other singers who want to sound uncannily like Herndon. It can also be used to create tracks in Herndon’s style from scratch, as with a recent cover of Dolly Parton’s classic “Jolene.”
Dryhurst and Herndon are also part of the team at Spawning.ai, an organization building “AI tools for artists, by artists.” Their first project, now in beta, is haveibeentrained.com, which aims to help artists control their work’s use by large AI models.
Read more: Caitlin Burns - AI and Crypto Are Combining to Create Web3's 'Multiplayer Era'
But the goal is definitely not to stop that use altogether. Instead, Dryhurst, in an email exchange, wrote that he wants every artist to create their own AI, rather than having everyone’s work dumped into a single uniform model.
“My ideal would be for the base substrate large models to be stripped of as much data as possible that may encroach on an artist’s ability to make their data economically productive,” he says, “And to move towards … a relatively clean base model that supports more specific models that are owned and controlled by artists themselves.”
If artists can build and control their own models, they can also both control collaboration via, and make money from, those models. By contrast, big AI organizations such as Microsoft and OpenAI are in essence pushing for the right to ingest all cultural output into large, centralized models that artists don’t have any control over at all.
The Herndon team has used Holly+ as a testing ground for that improved AI future. Herndon exerts creative control, and a degree of financial control, over who can use the Holly+ models, including screening recordings made with Holly+ before approving public releases (though Dryhurst says they approve almost anything that’s “kind of cool”).
They and others have also experimented with NFTs as a way to allow access or generate revenue for Holly+ productions. And while it would depend on changes to the broader IP legal regime, it’s not hard to imagine a future of decentralized IP management, including for AI training, that builds on blockchain tools.
With their particular balance of broad access and granular control, blockchains are if nothing else a better conceptual model of how artists would like to manage their rights than the big centralized models currently threatening to devour the world – and leave artists out in the cold.
Correction 3.20.23: Corrected details of a class action suit against Stability AI, Midjourney and DeviantArt led by Matthew Butterick.
The leader in news and information on cryptocurrency, digital assets and the future of money, CoinDesk is a media outlet that strives for the highest journalistic standards and abides by a strict set of editorial policies. CoinDesk is an independent operating subsidiary of Digital Currency Group, which invests in cryptocurrencies and blockchain startups. As part of their compensation, certain CoinDesk employees, including editorial employees, may receive exposure to DCG equity in the form of stock appreciation rights, which vest over a multi-year period. CoinDesk journalists are not allowed to purchase stock outright in DCG.