OpenAI has made waves in the tech world again with its latest innovation: GPT-4 with Vision, or GPT-4V.
GPT-4 vision for writing code given a design: https://t.co/xJy9yFNvKG
— Greg Brockman (@gdb) September 27, 2023
GPT-4V builds on GPT-4 and incorporates visual capabilities, allowing the model to analyze images provided by ChatGPT Plus and Enterprise subscribers.
Rolling out to paid users over the next two weeks!
— Greg Brockman (@gdb) September 27, 2023
The new feature has great potential but also carries some risks for businesses.
GPT-4 With Vision Examples
As more users gain access to the new feature, they are sharing examples of how GPT-4 with Vision works.
GPT-4 with Vision can analyze handwriting.
Pretty cool. AI is better at deciphering handwriting than I am.
Prof. Breen asked if GPT-4 with vision can read Robert Boyle’s handwritten manuscript. It does well!
Likely going to be a big deal for a number of academic fields, especially as the AI can “reason” about the text. https://t.co/n9jUjqeEw3 pic.twitter.com/78jYWfIhCY
— Ethan Mollick (@emollick) September 27, 2023
It can create code for a website using a napkin drawing.
This is incredible how people use GPT-4 with Vision.
pic.twitter.com/1O3AW7DJDr— Hushen Savani (@hushensavani) September 27, 2023
It can analyze memes.
This is absolutely wild. I am completely speechless. pic.twitter.com/wGTAx1hFgS
— Pietro Schirano (@skirano) September 27, 2023
In addition to these examples, I ran a few simple tests.
GPT-4 with Vision can write product descriptions for your sales pages and Amazon listings.
It can help you get started with basic coding for a particular website design based on a screenshot.
It can write creative Instagram captions with hashtag suggestions.
It can write an article based on data from a website or ebook, such as the State of SEO 2024.
As with all AI-generated content, it’s essential to review output from GPT-4 with Vision for accuracy. It still hallucinates and poses other risks.
OpenAI Reveals Potential Risks Of GPT-4V
OpenAI released a paper outlining potential risks associated with the use of GPT-4V, which include:
- Privacy risks from identifying people in images or determining their location, potentially impacting companies’ data practices and compliance. The paper notes that GPT-4V has some ability to identify public figures and geolocate images.
- Potential biases during image analysis and interpretation could negatively impact different demographic groups.
- Safety risks from providing inaccurate or unreliable medical advice, specific directions for dangerous tasks, or hateful/violent content.
- Cybersecurity vulnerabilities such as solving CAPTCHAs or multimodal jailbreaks.
Risks posed by the model have resulted in limitations, such as its refusal to offer analysis of images with people.
Overall, brands interested in leveraging GPT-4V for marketing must assess and mitigate these and other generative AI usage risks to use the technology responsibly and avoid negative impacts on consumers and brand reputation.
OpenAI’s First Partner To Prepare Image Input For “Wider Availability”
OpenAI announced that the GPT-4 with Vision model will power Be My Eyes Virtual Volunteer, a digital visual assistant designed for the visually impaired.
Although the tech is still in beta, the possibilities are tantalizing. For example, this technology could assist businesses in elevating accessibility in customer service.
Be My Eyes plans to beta-test the feature with corporate clients, emphasizing its commercial potential beyond its primary audience.
The Future Of GPT-4 With Vision
The potential applications of GPT-4 With Vision for businesses, marketers, and SEO professionals could be groundbreaking.
However, all users should remain cautious due to the potential privacy, fairness, and cybersecurity issues posed by GPT-4 with Vision and other AI models.
In addition to image input capability, OpenAI reenabled the Browse with Bing feature for web browsing through ChatGPT.
Featured image: Tada Images/Shutterstock