13/09/2024
In the second part of our exploration into the world of AI content creation, we delve into the new frontier of AI image generation (see Part 1 on AI copywriting here). Tools like Midjourney, Stable Diffusion, and Adobe Firefly have revolutionised how we create visual content. They open up endless and rapid creative possibilities, with exponential leaps in quality and accuracy from month to month. However, as with all technological innovations that have come before, Generative AI also brings with it many ethical considerations, and perhaps even a certain degree of existential dread.
The Rapid Evolution of AI Image Generation
AI image generation has come a long way in a very short time. Tools like Midjourney and Stable Diffusion offer users the ability to create stunning images from textual prompts. These tools have become incredibly sophisticated, allowing for the creation of realistic and detailed visuals that were once the domain of skilled artists and photographers. The speed at which these tools are evolving is astonishing: within two years we’ve gone from child-like attempts at painting to entirely photorealistic images that are hard to distinguish from reality.
As an example, here’s a collage of what each successive version of Midjourney was able to generate, from version 1 to 6, for the prompt “A church in the English countryside at sunset”:
Midjourney V1 (the top left image) was released in February 2022, and V6 in December 2023.
Such rapid evolution is being matched by widespread adoption. Machine learning is being adopted at all levels of society at a much faster rate than earlier innovations such as mobile phones or the World Wide Web: from school children using ChatGPT to write their essays, to advertising agencies generating endless bespoke imagery for targeted social media, to Hollywood implementing AI in its VFX workflows.
The speed at which AI is becoming part of our lives can be daunting, but the solution is simply to try our best to keep up with the advances and learn as much as we can. Above all, we need to understand its advantages as well as its shortcomings. One advantage is that generative AI is a useful brainstorming tool, which can greatly and rapidly expand the creative possibilities at hand. We are effectively trading paint brushes for the art of “prompting”: translating imagery into careful instruction.
Getting Accurate Results: The Art of Prompting
One of the keys to leveraging these AI tools effectively is understanding how to craft your prompts. The more specific you are, the better the results. Here are some tips to get the most out of your AI-generated images:
Composition: Specify the arrangement of elements in your image. Use terms like "rule of thirds," "central composition," or "asymmetrical balance" to guide the AI.
Lighting: Describe the lighting conditions you want. Words like "soft lighting," "dramatic shadows," or "golden hour" can significantly affect the image's mood.
Style: Indicate the style you are aiming for. This could be "impressionist," "photorealistic," or "minimalist."
Camera and Lens: To generate photorealistic imagery, mention the type of camera and lens you want the AI to emulate. Use terms like "wide-angle lens" or "telephoto lens," or name a film stock to emulate, such as "Kodak Portra 400" for warm tones or "Fuji Velvia" for vibrant colours.
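If you write a lot of prompts, it can help to keep these four ingredients separate and only join them at the end. The snippet below is a minimal sketch of that idea in Python. The `ImagePrompt` class and its field names are made up for illustration and are not part of Midjourney, Stable Diffusion or Adobe Firefly; all it does is join the pieces into one comma-separated line of text that you could paste into whichever tool you use.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical container for the four kinds of detail listed above."""
    subject: str
    composition: str = ""
    lighting: str = ""
    style: str = ""
    camera: str = ""

    def render(self) -> str:
        # Keep only the fields that were filled in, in a fixed order,
        # and join them into a single comma-separated prompt string.
        parts = [self.subject, self.composition, self.lighting, self.style, self.camera]
        return ", ".join(p.strip() for p in parts if p.strip())


prompt = ImagePrompt(
    subject="A church in the English countryside at sunset",
    composition="rule of thirds",
    lighting="golden hour",
    style="photorealistic",
    camera="wide-angle lens, Kodak Portra 400",
)
print(prompt.render())
```

Keeping the pieces apart like this makes it easy to change one thing at a time, for example swapping "golden hour" for "dramatic shadows", and compare the results.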
By using specific terminology, you can guide the AI to produce images that closely match your vision. This also means that the best results are generated by experts in their fields. Ironically, the most photorealistic AI images are generated by photographers, who know exactly what terminology will produce the images they have in mind. As an avid photographer myself, I know what sort of camera and lens combination will produce a specific look. For example, the image below is the result of me prompting for a “Portrait of a young woman with blue eyes and ginger hair facing us in a Parisian bar. Photoreal, GFX100s, Canon 50mm f1.2 LTM”.
With this next prompt, I generated a variety of clergymen in various London-based settings.
A reminder: none of these people are real, nor are they based on any existing people… however, given that the AI is basically analysing similar styles of portraiture from throughout the internet, whether it could be accused of plagiarism is a much murkier topic…
Ethical Concerns and Plagiarism
The way machine learning works is that the AI is trained on vast amounts of data so that it can recognise what a prompt is asking for. Where this data is sourced from has varied over time and between the different Generative AI providers. Recently, for example, Adobe faced a huge backlash when its updated terms and conditions suggested that all the work created by users in software such as Photoshop, Lightroom or Illustrator could be used by Adobe to train its AI models.
Following a huge outcry, they were forced to make it clear this would not be so. However, tools like Midjourney are clearly drawing their training data from pretty much everywhere. And the best results are often generated when one specifically asks the tool to create art “In the style of (insert famous artist name)”.
For example: “The Mona Lisa wearing Aviator sunglasses”:
When an AI imitates the style of a specific artist too closely, it can blur the line between inspiration and copying. While the generated content is technically new, it can still resemble a known artist’s work so strongly that it leads to accusations of plagiarism. The counterargument is that all art is inspired by what came before.
But there’s a huge moral difference between learning to draw or paint by copying work from artists you admire and an AI producing near-identical work in seconds. Not least because in an age where content is king and massive multinational corporations simply want to create content as quickly and as cheaply as possible, it makes being an artist even harder than it was before.
When advertising agencies must choose between paying an artist a day rate for work that will take weeks and simply writing a prompt to get endless content in seconds for free, the threat to artists everywhere is obvious. The moral dilemma is clear.
Churches need to be mindful of this when using AI-generated images. It’s important to strike a balance between drawing inspiration from various sources and creating something that is genuinely unique. It also means knowing when to use AI for convenience and when to ask a human being to do the work, even if that costs more time or money. Giving credit where it’s due and being transparent about the use of AI in your content can help mitigate these concerns.
As AI-generated imagery becomes more realistic, it also becomes harder to distinguish from real images. This poses moral and safety concerns. For instance, realistic AI images can be used to spread misinformation or create deepfakes (images or videos that convincingly mimic a real person using digital manipulation), which can undermine trust and cause harm.
Recognising AI Imagery
To avoid being misled by AI-generated images, here are some tips to recognise them:
Look for Inconsistencies: AI-generated images often have subtle inconsistencies or ‘artifacts’, such as unnatural textures or lighting.
Examine the Details: Pay attention to small details. AI can struggle with complex patterns and realistic human features.
Use Detection Tools: There are tools available that can help detect AI-generated images. Stay updated on the latest technology to aid in verification.
In particular, hands still seem to be a human feature that generative AI struggles with, and they are often a good indicator to look for. Here are two failed attempts of mine to generate “hands in prayer”:
In conclusion, AI image generation is a rapidly evolving field with immense potential. By understanding its capabilities and limitations, and by using it ethically and responsibly, churches can enhance their content creation and engage their communities in new and exciting ways.
However, it is also a Pandora’s box of ethical concerns. If we move into a world where the impetus to create art from scratch gives way to generative art, does it not devalue the meaning of our existence? Are these AI models simply making a mockery of what it means to be human? I enjoy creating art as a means to express myself and I earn a living largely to allow myself to keep on creating art. But in a world where content is king and where the priority for businesses is to create content as rapidly and cheaply as possible, where does it leave artists? And is it not soul-destroying for them to know that years of learning and honing their craft become meaningless when AI can generate the same thing in seconds?
The other issue is how we will recognise what is real and what is not, particularly when this technology can now also be applied to video…
Stay tuned for our next blog where we explore the frontier of AI video generation and its implications.
- Chris Rowe, Filmmaker and former Lead Content Producer on the CofE Digital Team