Advertisement
  1. SEJ
  2.  ⋅ 
  3. Generative AI

Google Updates Privacy Policy To Collect Public Data For AI Training

Google's updated privacy policy allows the company to scrape public data to improve its AI models.

  • Google can now collect and use public data to improve its AI systems.
  • Users should be thoughtful about what information they share online.
  • Balancing AI progress and privacy protection will require effort from both tech companies and users.
Google Updates Privacy Policy To Collect Public Data For AI Training

Over the weekend, Google updated its privacy policy to allow the company to collect and analyze information people share online to train its AI models.

Google says it will use this information to improve its services and develop new AI-powered products.

The update to Google’s privacy policy reads:

“Google uses information to improve our services and to develop new products, features, and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”

Here’s more about the shift in policy and what it could mean for internet users.

A Shift from “Language” Models To “AI” Models

The updated policy marks a clear shift from Google’s previous terms of service.

Before this weekend’s update, Google’s policy said it used people’s data to improve “language” models.

Now, Google reserves the right to use people’s data to improve all its “AI” models and products, including translation systems, systems that generate text, and cloud AI services.

Google highlights the changes in its privacy policy archive page (green represents newly added information):

Screenshot from: policies.google.com/privacy/, July 2022.

Typically, privacy policies restrict companies to collecting data that users provide directly. With Google’s new policy, the company can use any information people post publicly online.

Privacy Concerns

Using AI systems to analyze people’s online posts raises privacy concerns.

AI technologies such as Google’s Bard and OpenAI’s ChatGPT may be taking in and reusing people’s posts, reviews, and other online content.

Although anything posted publicly online can be seen by anyone, how that information might be used is changing. The main concern is shifting from who can access the data to how it could be utilized.

Moreover, the legality of this data collection method is still up in the air.

As we move forward, expect courts to grapple with complex copyright issues.

Web Scraping

The issue of web scraping has caught the attention of high-profile tech figures like Elon Musk, who has been vocal about his concerns, even blaming several recent Twitter mishaps on the platform’s efforts to prevent data extraction.

Over the weekend, Twitter curtailed the number of tweets users could view per day, rendering the service nearly unusable. Musk attributed this to an essential response to “data scraping” and “system manipulation.”

The prevalence of tech giants’ web scraping practices is now a pivotal discussion in consumer data use and privacy debate.

How To Protect Your Data

If you’re concerned about the changes to Google’s privacy policy, here are some steps you can take to prevent Google from using your data to train its AI systems:

  • Only post information publicly that you are comfortable with anyone, including Google, accessing and using.
  • Use Google’s privacy controls. Go to your Google Account and review your privacy settings. You can choose to opt out of options like “Web & App Activity,” “Location History,” and “Voice & Audio Activity.”
  • Use alternative services. Instead of using Google services like Search, Gmail, YouTube, Chrome, etc., you can switch to alternative providers with stricter privacy policies. Options include DuckDuckGo for search, ProtonMail for email, Vimeo for video sharing, and Brave for web browsing.
  • When using Google services, enable incognito or private browsing mode.
  • Read the privacy policies for websites, mobile apps, or other services before using them. Be careful with those stating they share your data with Google.
  • Contact Google directly to express your concerns over how your data may be used to train its AI models.

In Summary

Google’s update allowing the company to collect and analyze public data to train its AI systems highlights important issues.

First, as AI technologies become more advanced, tech companies have an increasing appetite for data. However, this data collection should be done legally and ethically, with users’ consent and knowledge about how their information is used.

Second, people should thoughtfully decide what they share online and be aware that public posts may be utilized in ways that are hard to foresee.

While AI promises many benefits, it introduces new challenges we must overcome to build a responsible future with AI.


Featured Image: Ian Dewar Photography/Shutterstock

Category News Generative AI
ADVERTISEMENT
SEJ STAFF Matt G. Southern Senior News Writer at Search Engine Journal

Matt G. Southern, Senior News Writer, has been with Search Engine Journal since 2013. With a bachelor’s degree in communications, ...