OpenAI clearly knew what it had. In late 2021, a small research team at OpenAI's San Francisco office discussed an idea and went on to build a new version of the company's text-to-image model, DALL-E.
This is an AI model that can turn short text descriptions into pictures, so you can have it generate a fox painted by Van Gogh, or a corgi made of pizza.
OpenAI co-founder and CEO Sam Altman told MIT Technology Review: "Almost always, when we create something new, we all have to live with it for a while and try to figure out what it will look like and what it will be used for."
But not this time. As they worked on the model, everyone involved realized it was something special. "Obviously, this was our product," Altman said. "There was no debate about it. We never even had a meeting to discuss it."
But no one could have predicted how big a splash the product would make. "This is the first AI technology that has caught on with ordinary users," Altman said.
DALL-E 2 was released in April 2022. In May, Google officially announced (but did not release) two text-to-image models of its own, Imagen and Parti.
Then came Midjourney, a company whose text-to-image model is aimed at artists. In August, the British startup Stability AI released Stable Diffusion, an open-source model, to the public for free.
Early adopters flocked to these tools. OpenAI's DALL-E 2 attracted a million users in just two and a half months. More than a million people use Stable Diffusion through the paid service DreamStudio; many more access it through third-party applications or install the free version on their own computers.
Emad Mostaque, the founder of Stability AI, says his goal is a billion users.
October 2022 brought a second wave: companies including Google and Meta released text-to-video models that can create short video clips, animations, and 3D images.
The speed of this development has been astonishing. In just a few months, the technology made headlines and magazine covers and filled social media feeds. It remains a hot topic, and it has also sparked a backlash.
"The technology is amazing, and it's fun; it's what new technology should be," said Mike Cook, an artificial intelligence researcher at King's College London who studies computational creativity. "But it's developing so fast that your understanding just can't keep up with it. I think it will take society a while to digest it."
Artists are caught in the middle of one of the greatest upheavals of our time. Some will lose their jobs; some will find new opportunities. Some have opted to pursue legal action, arguing that the images used to train the models were misused.
"For someone with technical training like me, it's pretty scary," said Don Allen Stevenson III, a digital artist who has worked at visual effects studios such as DreamWorks. "I thought, God, this is my whole job. I had an existential crisis within the first month of using DALL-E."
While some are still in shock, many, including Stevenson, are finding ways to use these tools and to anticipate what comes next. The exciting truth is that we don't know what that will be. While the creative industries, from entertainment and media to fashion, architecture, and marketing, will feel the impact first, this technology will put creativity in the hands of everyone. In the long run, it could be used to generate designs for almost anything, from new drugs to clothing and buildings. The generative revolution has begun.
For Chad Nelson, a digital creator who has worked on video games and TV shows, text-to-image models are a once-in-a-lifetime breakthrough.
"This technology lets you take a spark of an idea in your head and turn it into a prototype in seconds," he said. "The speed at which you can create and explore is revolutionary, beyond anything I've experienced in 30 years."
Within weeks of these models' release, people were using them to prototype and brainstorm everything from magazine illustrations and marketing layouts to video game environments and movie concepts.
People have created fan art, even entire comic books, and shared them online. Altman even used DALL-E to design sneakers; after he tweeted the design, someone made him a pair.
Tattoo artist and computer scientist Amy Smith has been using DALL-E to design tattoos. "You can sit down with the client and design together," she said. "We're in the midst of a revolution."
Digital and video artist Paul Trillo believes the technology will make brainstorming about visual effects easier and faster.
"People are saying this is the end of visual effects artists or fashion designers," he said. "I don't think it's the end of any profession. Rather, I think it means we don't have to work nights and weekends."
Stock picture libraries have taken different positions. Getty Images has banned AI-generated imagery, while Shutterstock has signed a deal with OpenAI to embed DALL-E in its website and says it will create a fund to compensate artists whose work is used to train the models.