AI chatbots are getting 'aggressively' data hungry

Popular AI chatbot apps - including Meta AI, Google Gemini, and ChatGPT - collect a wide range of user information such as contact details, search and browsing history, and other user content. Meanwhile, 70% of popular AI chatbots now collect user location data, up from 40% last year, according to an analysis by Surfshark.

Surfshark identified the 10 most popular AI chatbots and analysed their privacy details on the Apple App Store. The comparison was based on how many types of data each app collects, whether any data is linked to users, and whether the app includes third-party advertisers. Surfshark also reviewed the privacy policies of DeepSeek and ChatGPT to better understand what data is retained on servers and for how long.

According to the analysis, Meta AI continues to collect the most user data among the apps examined, gathering 33 out of 35 possible data types, or about 94% of the total. It remains the only app that collects data in the financial information category.

Meta AI, alongside Google Gemini, also collects sensitive information, which includes racial or ethnic data, sexual orientation, pregnancy or childbirth information, disability, religious or philosophical beliefs, trade union membership, political opinion, genetic information, and biometric data.

Google Gemini collects 23 out of 35 possible data types. These span a wide range of categories, including users' names, email addresses, and phone numbers, as well as user content, contacts, search history, browsing history, precise location, and several other types of data. This extensive data collection may be seen as excessive and intrusive by those concerned about data privacy and security.

According to the developer's disclosures on the Apple App Store, ChatGPT may now collect 17 out of 35 data types. This represents a 70% increase from the 10 data types identified in last year's AI chatbots review, indicating a notable broadening in the extent of user data collection. The additional data types now collected include coarse location, health and fitness, search history, audio data, advertising data, and customer support.
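For readers who want to check the figures, the short Python sketch below reproduces the arithmetic behind the reported percentages. The function names are illustrative and not part of Surfshark's analysis; the inputs are the counts quoted in this article.

```python
# Illustrative helpers (hypothetical names, not from Surfshark's methodology).
POSSIBLE_DATA_TYPES = 35  # total data types listed per the article


def share_of_total(collected: int, possible: int = POSSIBLE_DATA_TYPES) -> float:
    """Percentage of the possible data types that an app collects."""
    return 100 * collected / possible


def percent_increase(old: int, new: int) -> float:
    """Relative growth from old to new, expressed as a percentage."""
    return 100 * (new - old) / old


# Counts quoted in the article.
print(f"Meta AI share: {share_of_total(33):.1f}%")            # 94.3%
print(f"ChatGPT share: {share_of_total(17):.1f}%")            # 48.6%
print(f"ChatGPT year-on-year: {percent_increase(10, 17):.0f}%")  # 70%
```

Note that the 70% figure for ChatGPT is growth relative to last year's count of 10 data types, not a share of the 35 possible types.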

Most of the data types collected by ChatGPT (14) are intended for app functionality. However, the user information may also be used for other purposes, including analytics (7), product personalisation (4), developers’ advertising or marketing (3), and third-party advertising (2). Notably, health and fitness data, as well as advertising data, are not required for app functionality.

Claude, the fourth most data-hungry chatbot, collects 13 out of 35 data types, each of which is crucial for app functionality. These data types support activities such as authenticating users, enabling features, preventing fraud, implementing security measures, maintaining server uptime, reducing app crashes, improving scalability and performance, and delivering customer support.

However, many of the data types collected by Claude may also be used for other purposes, such as analytics and developers’ advertising or marketing, indicating fairly extensive use of user data. These include users' coarse location and content such as photos or videos.

DeepSeek, the fifth most data-hungry chatbot, collects 13 unique types of data, including user input and chat history. It claims to retain information for as long as necessary and stores it on servers located in China.

Unlike other AI chatbots, which operate under US federal law and collaborate with regulatory bodies, DeepSeek is not subject to comparable legal frameworks such as the General Data Protection Regulation (GDPR), according to Tomas Stamulis, chief security officer at Surfshark. He added that this lack of oversight further increases concerns about accountability and data protection.

"Chatbots are becoming increasingly aggressive with user data. Our research shows that 70% of popular AI apps now collect location data, a sharp rise from just 40% last year. This surge in data hunger is also evident in platforms such as ChatGPT, which recently increased its collection by 70% to include everything from health and fitness metrics to search history and audio data," explained Stamulis.

"Unlike traditional search engines, these bots now handle highly sensitive uploads such as tax documents and medical records, which can be shared across massive third-party networks for targeted ads. To protect your privacy, you must treat every prompt as a public record: audit your settings, disable chat history, and never share what you wouldn't want to be publicly known," he added. 
