Feed a prompt to an AI image generator and you’re bound to encounter an insidious pattern: Do the people look … too stunning? Perhaps even wanton?
Gender, race, body type, nationality, religion—you’re almost guaranteed to get prejudiced and outdated stereotypes when using these descriptors in prompts. And “wanton” is a deliberate adjective; it’s mostly used pejoratively toward women, and AI tends to oversexualize female images. These glaring imbalances showcase a recurring problem with AI outputs: the replication of societal biases, which can be harmful to actual people and communities.
I wrestled with this firsthand while helping develop Sir Martian, one of our key AI demos featured at Cannes earlier this year. Sir Martian, playfully named after Sir Martin Sorrell, is an AI-powered robot in the form of an alien caricaturist. Throughout the festival, he invited attendees to sit down for a quick chat and a sketched portrait, based on their appearance and tastes.
I’m proud that the demo was a success because, as you can imagine, this interaction was more than a simple conversation. And it taught me a lot about the privileges and responsibilities of shaping a new technology. Here’s what I learned.
Words matter—your data sets the tone
Most AI tools available to the general public are trained on datasets that aren’t accessible or visible to users, so I feel particularly fortunate to work at a company that creates and trains its own models. It really is a “great power, great responsibility” scenario.
The foundation of any generative AI model should be diverse and comprehensive. By expanding the range of base images and training materials, developers can create AI systems that represent a broader spectrum of human experiences. This enriches outputs and helps combat entrenched biases.
With Sir Martian, specificity was essential for aligning user inputs with desired outputs. After some trial and error, we found that we had to train the model by combining visual input with very precise text prompts to get it to represent people accurately.
When given a picture of a Black woman and the prompt “woman with braids,” the AI model defaulted to a woman with German-style braids. We had to train and fine-tune it using specific terms like “cornrows” and “box braids” to get it to create accurate drawings. Giving the system a wider variety of terms to connect to visual references was crucial to getting more diverse depictions.
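To make that concrete, here is a minimal sketch of what pairing training images with more specific caption terms might look like when assembling fine-tuning data. The file names, captions, and JSONL manifest format below are illustrative assumptions, not the actual Sir Martian training pipeline.

```python
import json
from pathlib import Path

# Hypothetical caption data: each training image is paired with a caption that
# names the specific hairstyle instead of a generic umbrella term like "braids".
# File names and captions are illustrative, not the real Sir Martian dataset.
specific_captions = {
    "portrait_001.jpg": "woman with cornrows, smiling, studio lighting",
    "portrait_002.jpg": "woman with box braids, three-quarter view",
    "portrait_003.jpg": "woman with German-style braids, outdoor portrait",
}

def build_finetune_manifest(captions: dict, out_path: str) -> None:
    """Write image-caption pairs as JSON Lines, one record per training example."""
    with Path(out_path).open("w", encoding="utf-8") as f:
        for file_name, caption in captions.items():
            record = {"file_name": file_name, "text": caption}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

build_finetune_manifest(specific_captions, "metadata.jsonl")
```

The point of the sketch is the vocabulary, not the file format: the more precisely each image is labeled, the more reliably the model can connect a user’s words to the right visual reference.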