Is anyone actually surprised by this?
Not excusing Chinese companies but everyone does the same shit. I bet a lot of US companies that behave the same or worse will be looking for trade barriers to protect their business so their interests will be stoking fear of Chinese competitors. I don’t really give a shit which country is doing it, I am not buying what they are selling.
US companies have a stranglehold on government, education and business and are getting access to my family’s data despite my personal objections. Far more concerned about that than a Chinese service I have no intention of using.
Deepseek can at least be self hosted if you want AI in your life. I can happily live without it.
Anyone using DeepSeek as a service the same way proprietary LLMs like ChatGPT are used is missing the point. The game-changer isn’t that the Chinese company DeepSeek can compete with OpenAI and its ilk as an AI service provider—it’s that now any organization with a few million dollars to train and host their own model can compete with OpenAI.
On-prem vs. Cloud, basically. On-prem just magically got cheaper.
Onprem has always been cheaper. Cloud compute was the most successful marketing campaign I can think of.
Or open source groups can make a fully open repro of it: https://github.com/huggingface/open-r1
I’d like to look into that, how can I train an existing model further?
I’m only playing around with ollama, but I’d like to do a bit more - mostly just to satisfy my need to understand things - but I have no idea where to start.
You’re going to have to learn python.
Here’s a good overview: https://huggingface.co/docs/transformers/training
Python is not a problem
SW Dev is my job. Just never had real contact with AI before, besides playing around a bit. Thank you very much for the link!!
Edit: thank you very much again, that was pretty much exactly what I was looking for.
Don’t know how I missed checking out Hugging Face. Always thought of it as just a GitHub for models and didn’t bother checking for docs…
But that’s a great intro with simple tools/tutorials to get a grip on it, thanks!
This is probably only a problem with the online version. In contrast to google and openAI they, like meta, let you download the model and run it offline, where they can’t access any of this data I presume.
Right, the offline version (if you have the hardware to run it) is completely under your control, and no one can take that away from you. Honestly nice to see that happen, I thought it would take several years.
I’ve been running it locally using ollama, works completely offline, no keystroke data for anyone!
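For anyone wondering what "running it locally using ollama" looks like from code, here is a minimal sketch. It assumes Ollama is running on its default local port (11434) and that a DeepSeek model has already been pulled; the model tag `deepseek-r1:7b` and the helper names are assumptions for illustration, so swap in whatever `ollama list` shows on your machine.

```python
import json
import urllib.request

# Ollama's default local endpoint; the request never leaves this machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="deepseek-r1:7b"):
    """Build a POST request for the local Ollama server.

    The model tag is an assumption; use one you have actually pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt, model="deepseek-r1:7b"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=300) as resp:
        return json.loads(resp.read())["response"]


# Usage (needs a running Ollama instance):
# print(ask_local("Give me a sweet potato recipe."))
```

Since the URL points at localhost, the prompt only goes to the process on your own box, which is the whole point being made here.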
Yeah I scan logs and so far nothing… I still don’t trust them but I can’t tell shit either
Just use Little Snitch, OpenSnitch or simplewall depending on your operating system and block the outbound connection if one ever occurs.
portmaster?
I heard about little snitch, is there any benefit to it v portmaster in your opinion, off the cuff type thing?
Oh, Little Snitch was just what I used when macOS was my main operating system. When I switched to Windows I started using simplewall, and I just recently was poking around for a Linux solution and found OpenSnitch.
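If you would rather eyeball connections yourself than rely on a firewall prompt, the check boils down to "is this remote address outside my machine and my LAN?". A rough sketch of that test, assuming you already have a list of (process, remote address) pairs from whatever tool you use; the function names and sample data are hypothetical, and this is not how any of the firewalls above are implemented.

```python
import ipaddress


def is_external(address):
    """True if the address is not loopback, private (LAN) or link-local."""
    ip = ipaddress.ip_address(address)
    return not (ip.is_loopback or ip.is_private or ip.is_link_local)


def flag_external(connections):
    """Keep only the (process, remote_address) pairs that leave the local network."""
    return [(proc, addr) for proc, addr in connections if is_external(addr)]


# Hypothetical sample: two local connections and one to a public address.
sample = [
    ("ollama", "127.0.0.1"),
    ("ollama", "192.168.1.20"),
    ("ollama", "8.8.8.8"),
]
print(flag_external(sample))  # [('ollama', '8.8.8.8')]
```

An empty result for the model process is what you would hope to see while chatting with it offline.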
DeepSeek does the same things that OpenAI does, but it’s a foreign actor so OOooooOOWwwwooOOOO sCaRy!
Nope you can’t run chatgpt locally.
Wait until they hear what data Instagram/Meta collects during use!
But they’re a US company so it’s ok.
Strawmanning the open source federated social media enthusiast crowd as unaware fans of meta?
As a US citizen, I support some things that help rich people in the US win when competing with rich people in China.
As a chauvinist
Ftfy
This makes me sad, that we can’t engage in civil discussion about this. Why did you assume and not ask questions? Be curious, not judgmental.
To me it’s a question of laws. The laws of the U.S. at least somewhat constrain the people of my own country, and can prevent them from working against their own citizens. Like me.
Please be kind when replying.
Fuck civility, it’s a tool of oppression
Realistically what is the worst thing China is doing with your private data? Selling it? If you’re not a Chinese National, at least you don’t fall under their jurisdiction.
If you’re a U.S. citizen, with all the tech oligarchs cozying up to the current administration, I’d be a lot more concerned with Facebook/Twitter/Etc collecting your data.
Realistically what is the worst thing China is doing with your private data?
Probably mapping out the extended support networks of democratic activists in Taiwan to prepare to throw them in jail after a forcible military takeover.
So democratic activists in Taiwan have extensive networks in the US?
I mean, you said it.
Extensive networks with their close ally? My pearls must be clutched!!
Networks with a foreign actor undermining national sovereignty, which financed several massacres in your country
My country? Not sure what you’re talking about but I know that Taiwan deserves sovereignty. You don’t? Surely you’re not pro imperialism…
The CCP is significantly more oppressive, gives zero shits about human rights or trademarks or really anyone at all. The US at least pretends to care.
The US is in the process of deporting all its migrants and threatening invasions on half the world.
I get that gringos don’t want to own up to their complicity by inaction but you oughta stop pontificating about how other governments are worse. Unless they’re called Israel, they weren’t before and they sure as fuck aren’t now.
Get fucked, racist.
Lmaooo hurting gringos feelings is being racist? Y’all have had concentration camps for longer than you’ve been without them, you know their fucking addresses and they’re still there.
Do forgive me for throwing y’all’s opinions on racism in the dustbin.
You cannot be a serious leftist and pretend to be offended by a little “anti-white” rhetoric.
that’s pure ideology.
Based on what? The US imprisons more people, kills more people, tortures more people. The only way to argue that China is more oppressive is basically to start with the assumption they are and then work backwards to justify it.
I listed a handful of reasons above, none of which anyone has denied or refuted. Just downvoted.
Actually you didn’t. You listed a bunch of accusations against China (which were refuted, you just ignored that), but you didn’t even try to explain how that’s more oppressive than the USA. Even if all your accusations were true, the US is still more oppressive.
I see you are sticking with the pack here and going with generic denial and ignoring my arguments rather than actually refuting them.
now we’ve got another refutopolis warrior.
What does that even mean?
Bro you can stop that narrative. The truth is out now.
…which truth came out? They don’t have CCP officials required by law to work at tech companies and disclose any and all data they acquire? They’re not using Uyghur slaves in their factories? They’re not trying to literally erase Taiwan off the maps? They’re not still censoring information about their horrific pasts? They’re not targeting, retaliating against and kidnapping protestors domestic and abroad? They’re not censoring virtually every US social website entirely from the entire country? Please bring me up to speed.
E: any of you downvoters, feel free to correct me, I’d love to be wrong.
You throw out a bunch of claims with zero sources and want to be taken seriously. At least give us the bare minimum before just spewing this much US State Department propaganda.
That being said, I will address some of your points, since someone else might stumble upon this and need an actual answer.
They don’t have CCP officials required by law to work at tech companies and disclose any and all data they acquire?
Keeping a close eye on the companies in their country and keeping them on a short leash is good actually. China is not a capitalist hellhole like the US or most of the world, it is a socialist state where the rich do not control the government. Keeping them in check is the right thing to do given their current development level of socialism.
They’re not using Uyghur slaves in their factories?
That’s a new one, so far I have only heard about how they are being genocided. Which you can debunk with a little bit of research: Arab League’s visit to Xinjiang rejects Western accusations of ethnic genocide, religious persecution.
They’re not trying to literally erase Taiwan off the maps?
LMAO, no. Taiwan is part of China, why would China want to erase part of itself off the map? Even the US agrees. The only thing China wants is proper reunification with Taiwan.
They’re not still censoring information about their horrific pasts?
What “horrific past”? Be specific, this vague stance achieves nothing. If you’re talking about Tiananmen Square, here’s a good video about that: The Tiananmen Square “Massacre” Never Happened.
They’re not targeting, retaliating against and kidnapping protestors domestic and abroad?
Again, provide a damn source, I have no idea of what you’re talking about and it is something I never saw anyone claim before.
What I can do tho is bring into attention the names of a few people like Huey P. Newton being killed by the US government and Snowden having to seek asylum abroad after blowing the whistle on the US surveillance state for the world to see. And if that’s not enough, how about Pro-Palestinian protesters clash with US police on second night of DNC and New Report Details How Pro-Palestinian Protests Are Suppressed in Democratic Countries.
They’re not censoring virtually every US social website entirely from the entire country?
No they are not, Microsoft operates in China. Not only that, but they do not explicitly want to simply ban US sites on there, it’s a simple matter of national sovereignty where companies like Facebook and Google refuse to abide by Chinese law, so China simply developed all their tools in-house. Not only that, but Chinese citizens have access to VPNs and can easily access websites abroad that are not usually allowed in China.
Meanwhile the US banned Huawei and tried to ban TikTok when it became apparent they could not control it and that the people were seeing the US for what it truly is, a genocidal state funding Israel in its attempt to genocide the Palestinian people.
The last link I posted is a proxy on 12ft.io since The Intercept won’t allow to see the page without registering.
I’m not here to defend the Chinese government or anything, but there is an argument to be made that the US has an equivalency to each one of these things.
CCP officials at tech companies - NSA backdoors
Uyghur slaves - Prison labor aka war on drugs
Taiwan - Gaza/Literally any “3rd world” nation with oil
Censorship - Right wing media empires/red state bills targeted to downplay US atrocities taught in schools
Retaliation against protestors - Police brutality
Social media censorship - Oligarchs owned social media
I think a lot of people are less falling for Chinese propaganda and more overcoming US propaganda.
If you think any of those are remotely the same, you’re simply delusional.
With the caveat that we have tons of actual evidence for the US equivalent, whereas the claims that China does those things are usually “We absolutely swear they do bro” from the people who swore Hamas was raping babies or whatever.
You make a compelling argument.
I’d love to be wrong.
No you wouldn’t. If you would, you’d have listened to the many people who have probably corrected you on all those State Department talking points
That’s never happened. And being that you haven’t either, I think it’s a fair guess that it won’t anytime soon.
The truth is out now.
What truth? Who talks like this and thinks it means something?
For the past week the people of China and the United States, as well as other countries have been comparing notes. Debunking propaganda on both sides. Realizing that much of what we’ve all been told for years/decades, has been lies.
I’m ootl. What debunks have come out?
That doesn’t affect people not in china or not bordering china.
That sinophobia isn’t going to stoke itself!
Pathetic
Western authorities have been harvesting data for a few decades from social media so any complaint that singles out Chinese apps doing the same is obviously rooted in sinophobia.
The fact you think my joking about racists doing that is pathetic shows which side of that assertion you fall.
My content on here speaks for itself… Dear
rAyCIsM 🤡
but it’s a foreign actor so OOooooOOWwwwooOOOO sCaRrRey!
I love that people think this is a solid own. Lest we forget Hong Kong, or an impending hot war in Taiwan or building out extradition systems with an expanding network of countries to forcibly repatriate and torture dissidents and human rights lawyers.
You used to not have to explain why authoritarianism was bad.
or an impending hot war in Taiwan
When you can’t even find things that China actually has done to complain about, so you have to start complaining about things they haven’t done.
Anti terrorism is good, actually. I don’t support people kicking seniors for speaking mandarin to try to bully a government into not prosecuting murderers in the mainland, which was the reason the protests happened (that and Washington money)
It used to not be necessary because democracies used to have moral authority but since the revelations of Manning and Snowden non-Americans see no difference between giving our data to the USA or to China or any other. We also know from the reaction to the war in Ukraine and Gaza that human rights claims are only sometimes used.
I’m not American so they are indeed a foreign actor.
They should store the data in US servers like OpenAI does. Apparently then Mashable won’t write an article about it.
The criticism thrown at DeepSeek in the past days is just as applicable to American AI models. But when that was brought up in the past, it was “making things political”.
At least I can run DeepSeek locally.
This article is what US propaganda looks like folks. Mashable should be ashamed.
Literally all AI companies do this to run their services. Except you can actually download Deepseek and run it completely securely on your own devices. You know who doesn’t allow that security? OpenAI and the other US companies currently being screwed.
every google site has been doing this for years too. every comment we write in youtube and discard before posting, it’s being recorded. this isn’t news at all.
Yes, I’m going to be lectured on privacy by people who are still on twitter.
Oh my, just wait until you learn what Facebook and Google do…
The Chinese now have data on my Linux vm and my curiosity about sweet potato and sweet potato recipe. They’re coming for me now!
Just host it yourself?
You can’t just host the 671B model that the app uses lol
If you have the hardware, then yes, you can.
Ah, just acquire such hardware, very simple and anyone can do it without supply chain knowledge or advantage
Sorry but you are just talking assumptions without even having looked at the facts.
It’s not cheap, but basically a single top-tier gaming desktop with an additional graphics card (or 2) is literally all you need.
I know multiple people who work normal IT jobs that have already started on setting up their own. They plan on running them for their whole family, many users at a time from the same machine.
Here is someone who got it to work on a cluster of mac-minis. Again not cheap, but clearly within dedicated consumer enthusiast reach. https://digialps.com/deepseek-v3-on-m4-mac-blazing-fast-inference-on-apple-silicon/
And this is before even considering how fast open source moves, i am expecting quantized models which can have double speed for negligible quality impact any second now.
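For a rough sense of what "if you have the hardware" means, the memory needed just to hold the weights is parameters times bits per weight, divided by 8. A back-of-the-envelope sketch, assuming about 671 billion parameters (DeepSeek's published total for V3/R1, treated here as an assumption) and ignoring the KV cache, activations and runtime overhead, which all add to the total:

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumed parameter count for the full model; adjust for smaller variants.
PARAMS = 671e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(PARAMS, bits):,.1f} GB")
```

Halving the bits per weight halves the footprint, which is why quantized releases matter so much for home setups. The smaller distilled variants fit the same formula with a much smaller parameter count.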
Building my entire data model around the Tiananmen Square copypasta. I can run this thing on a Raspberry Pi plugged into a particularly starchy potato and it reliably returns the only answer I’ve thought to ask it.
By extension, anything that’s not self hosted means 3rd party actors snooping. American, Chinese, whoever happens to operate that machine.
Sorry if this is a dumb question, but is the accusation that they collect keystroke data from outside the app if you have it installed?
It’s not possible unless DeepSeek has accessibility permission or DeepSeek becomes a keyboard app instead of an AI app xD.
I haven’t seen any indication of that, no.
So I won’t use this for the same reason I don’t use any AI? Cool
Idk DeepSeek probably just stores things in the history of my Terminal window.
the company states that it may share user information to "comply with applicable law, legal process, or government requests."
Literally every company’s privacy policy here in the US basically just says that too.
Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” but it also collects information from your device, including “device model, operating system, keystroke patterns or rhythms, IP address, and system language.”
Breaking news, company with chatbot you send messages to uses and stores the messages you send, and also does what practically every other app does for demographic statistics gathering and optimizations.
Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes. There’s also the added issue that DeepSeek sends your user data straight to Chinese servers.
They didn’t use the word keystrokes, therefore they don’t collect them? Of course they collect keystrokes, how else would you type anything into these apps?
In DeepSeek’s privacy policy, there’s no mention of the security of its servers. There’s nothing about whether data is encrypted, either stored or in transmission, and zero information about safeguards to prevent unauthorized access.
This is the only thing that seems disturbing to me, compared to what we’d like to expect based on the context of what DeepSeek is. Of course, this was proven recently in practice to be terrible policy, so I assume they might shore up their defenses a bit.
All the articles that talk about this as if it’s some big revelation just boil down to “company does exactly what every other big tech company does in America, except in China”
Collecting keystrokes is very different from collecting text inputted into fields. Keystroke rhythms are even more alarming, as they are often used to identify users despite them using privacy settings, or used to collect what’s typed via audio collection.
Your argument that this is no different than other apps is complete crap. Don’t trust any app that collects that information
The argument stands, though.
Yes, not ALL other apps do that, but the comment was specifically talking about companies like Google and Meta… they definitely do collect incomplete strings from forms, down to individual characters when they display search suggestions, for example. They might not mention “keystrokes” in the legal text, but I don’t see why they wouldn’t be able to extrapolate your typing pattern since they do have the timing information which should be enough data to, at some level, profile it.
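To make the timing point concrete: a minimal sketch, assuming a hypothetical log of (character, timestamp in milliseconds) pairs, which is roughly what any input box that reacts per character could observe. The function name and sample data are made up for illustration; nothing here is taken from any company's actual code.

```python
from statistics import mean


def typing_profile(events):
    """Derive a crude typing rhythm from per-character arrival times.

    events: list of (char, timestamp_ms) tuples in the order received.
    Returns the gaps between consecutive characters and their mean.
    """
    times = [t for _, t in events]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {"gaps_ms": gaps, "mean_gap_ms": mean(gaps) if gaps else None}


# Hypothetical sample: the word "cat" typed into a search field.
sample_events = [("c", 0), ("a", 120), ("t", 310)]
print(typing_profile(sample_events))
```

The gaps alone are a very rough signal, but they show that a "rhythm" falls out of data the field already receives, without any system-wide keylogging.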
Keystrokes don’t have to be in a text field or input. That’s my point.
If I’m on, say, Google and I type anything into the field, it’s definitely capturing it. You know this if for no other reason than that it would have to, with autocomplete as an option.
Keystroke capturing is the same as keylogging, aka anything typed even if it’s not into a place where you would assume it’s being seen by the app. Aka, if I had an app open in the background and was typing in my password, it would see and capture that.
They’re completely different things. While the privacy issues of US large tech companies are abundant and awful, there is a large difference between keystroke capturing and capturing input via fields. Especially when you’re agreeing to allow them to process and transfer or even sell that information.
But that’s not what the terms on both Google/Meta and Deepseek say.
Google/Meta have no obligation to restrict the data collection to forms. If the ToS allows them to collect keystrokes from forms (and as you admitted, we do know for a fact that they do), then there’s no reason it doesn’t also allow them to collect them outside of forms (which we don’t, for a fact, know).
In the same way, DeepSeek’s terms don’t say the logging happens for “anything typed” like you are assuming without evidence. For all we know the only place they might be capturing it is exclusively in very specific forms, or they might even have only added that to the terms so that they can add suggestions in the future. You can only make assumptions, since the terms are not specific on exactly what’s being captured and in which way; they only say keystrokes in the case of DeepSeek, and are even more generic (thus allowing more possible vectors) in Google/Meta’s terms.