Is anyone actually surprised by this?
Time to make up a conspiracy about key leading figures, and laugh as they disappear each other.
Everyone must ask to see Xi jing jing ping pong nudes! But without mentioning Xi or nudes.
That would be a great way of poisoning their plans.
Isn’t it open source? If so it should be near trivial to get rid of all of that.
If it’s closed source I wouldn’t touch it with a ten-foot pole. It’s the same reason I rarely use ChatGPT: it’s just freely giving away your personal data to OpenAI.
Let the FUD begin!
I trust DeepSeek’s open source release if it allows me to copy and review it. I don’t trust OpenAI products like ChatGPT.
Is DeepSeek open source?
Hugging Face researchers are trying to build a more open version of DeepSeek’s AI ‘reasoning’ model
Hugging Face head of research Leandro von Werra and several company engineers have launched Open-R1, a project that seeks to build a duplicate of R1 and open source all of its components, including the data used to train it.
The engineers said they were compelled to act by DeepSeek’s “black box” release philosophy. Technically, R1 is “open” in that the model is permissively licensed, which means it can be deployed largely without restrictions. However, R1 isn’t “open source” by the widely accepted definition because some of the tools used to build it are shrouded in mystery. Like many high-flying AI companies, DeepSeek is loath to reveal its secret sauce.
The runner is open source, and that’s what matters in this discussion. If you host the model on your own servers, you can ensure that no corporation (American or Chinese) has access to your data. Access to the training code and data is irrelevant here.
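For what it’s worth, this is all it takes once the model is hosted on your own box. A rough sketch, assuming something like Ollama is serving it on localhost:11434 (the port, endpoint, and model tag all depend on your runner):

```python
# Rough sketch: talk to a model you host yourself, so nothing leaves the box.
# Assumes an Ollama-style server on localhost:11434 and a model tag you've
# already pulled -- both are assumptions, adjust for your own runner.
import json
import urllib.request

payload = json.dumps({
    "model": "deepseek-r1",   # placeholder tag, use whatever your runner lists
    "prompt": "Explain in one sentence why local inference keeps data private.",
    "stream": False,
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",   # loopback only, no third party
    data=payload,
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```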
The guys at HF (and many others) appear to have a different understanding of Open Source.
As the Open Source AI Definition says, among other things:
Data Information: Sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system. Data Information shall be made available under OSI-approved terms.
- In particular, this must include: (1) the complete description of all data used for training, including (if used) of unshareable data, disclosing the provenance of the data, its scope and characteristics, how the data was obtained and selected, the labeling procedures, and data processing and filtering methodologies; (2) a listing of all publicly available training data and where to obtain it; and (3) a listing of all training data obtainable from third parties and where to obtain it, including for fee.
Code: The complete source code used to train and run the system. The Code shall represent the full specification of how the data was processed and filtered, and how the training was done. Code shall be made available under OSI-approved licenses.
- For example, if used, this must include code used for processing and filtering data, code used for training including arguments and settings used, validation and testing, supporting libraries like tokenizers and hyperparameters search code, inference code, and model architecture.
Parameters: The model parameters, such as weights or other configuration settings. Parameters shall be made available under OSI-approved terms.
- The licensing or other terms applied to these elements and to any combination thereof may contain conditions that require any modified version to be released under the same terms as the original.
These three components (data, code, parameters) shall be released under the same conditions.
Chinese company does what American companies have done for 25+ years now!
Is it time for REAL data privacy laws or are we just gonna keep playing whack-a-mole with Chinese tech companies that get us nowhere?
Our data’s just too valuable for these parasites. Data privacy laws may eventually pass to compel software companies to store everything in US servers only.
Excellent point. If that’s the case though, then wouldn’t other countries follow suit, which still limits big tech’s reach and makes them less profitable and less powerful? Idk. Guess we’ll see how it plays out. Either way, I’m staying as far from those ecosystems as possible to at least try to mitigate some of what they do. I’ll never be totally successful, the genie is out of the bottle, but we can at least attempt.
If you think the American companies do anything different you’re not paying attention and simply believing the propaganda.
Run it locally then
Thanks, managed to get it installed locally via PocketPal (Termux was constantly giving me compile errors). Out of curiosity, I made a very “interesting” prompt, and frankly I am not even surprised.
EDIT: decided to be a little spicier, didn’t fail to amuse me
That’s so edgy, man. I bet all the girls in your high school think you’re the raddest.
This is mildly pedantic but you’re not actually running Deepseek R1, you’re running a 7B version of Qwen that’s been fine-tuned on Deepseek R1 outputs. All of the “distilled” models are existing models trained on R1.
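If you want to check for yourself, the model config gives it away. A quick sketch (the Hugging Face repo ID is from memory, so double-check it before relying on this):

```python
# Pull just the config (a few KB, no weights) and look at the architecture.
# The repo ID below is from memory -- verify the exact name on Hugging Face.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
print(cfg.model_type)      # expect "qwen2": a Qwen model...
print(cfg.architectures)   # ...fine-tuned on R1 outputs, not R1 itself
```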
Nice catch. I’ll be sure to run the real thing afterwards.
I swear people do not understand how the internet works.
Anything you use on a remote server is going to be seen to some degree. They may or may not keep track of you, but you can’t be surprised if they are. If you run the model locally, there is no indication it is sending anything anywhere. It runs using the same open source LLM tools that run all the other models you can run locally.
This is very much like someone doing a surprised pikachu when they find out that Facebook saves all the photos they upload to Facebook, or that Gmail can read your email.
The telephone company knows your phone number!
Same as Chrome’s omnibox or the Android keyboard, no? So in the end, is it fine when the USA does it because “democracy” (never mind the napalm), but bad when China does it because of human rights violations (as if the USA never did anything like that)?
Seriously this. Nothing that China is accused of doing is any worse than what i know America has done. If it’s the Chinese Communist Party stealing your data at least you know it won’t be used to inject ads everywhere you go on the internet
At least they’re transparent about it, unlike American companies that hide behind convoluted terms of service and then sell the data behind your back in a way that’s technically legal.
China’s like “yeah we collect everything”. I can appreciate the honesty.
Is Deepseek Open Source?
No.
As opposed to Microsoft, Google, … NSA, or GCHQ servers. Or all of the above.
Yeah, but these are wholesome domestic spy agencies that are just looking after you and protecting you from yourself.
Fuck. They told me that they were storing my backups.
I run it locally on a 4080, how do they capture my keystrokes?
No idea, have you monitored the package / container for network activity?
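Something like this would do for a first pass, assuming the runner shows up as an ollama process (swap in whatever you’re actually using) and psutil is installed; on Linux you may need root to see which process owns each connection:

```python
# Rough sketch: list every open inet connection belonging to the local LLM
# runner, to see whether it ever talks to anything beyond loopback.
import psutil

# PIDs of the runner -- "ollama" is an assumption, adjust the name as needed.
llm_pids = {p.pid for p in psutil.process_iter(["name"])
            if p.info["name"] and "ollama" in p.info["name"].lower()}

for conn in psutil.net_connections(kind="inet"):
    if conn.pid in llm_pids:
        remote = f"{conn.raddr.ip}:{conn.raddr.port}" if conn.raddr else "(listening)"
        print(conn.pid, conn.status, "->", remote)
```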
Perhaps this refers to other clients not running the model locally.
Yea, I am looking for a deep analysis of this as well
It doesn’t. They run using stuff like Ollama or other LLM tools, and all of the hobbyist ones are open source. All the model is, is the inputs, the node weights and connections, and the outputs.
LLMs, or neural nets at large, are kind of a “black box,” but there’s no actual code that gets executed from the model when you run them; it’s just processed by the host software based on the rules for how these work. The “black box” part is mostly because they’re so complex we don’t actually know exactly what they’re doing or how they arrive at answers, only that it works to a degree. It’s a digital representation of an analog brain.
People have also been doing a ton of hacking at it, retraining, and other modifications that would have surfaced anything like that if it existed.
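If you want to convince yourself the weights file really is just tensors, you can crack one open. A sketch, assuming you’ve got a .safetensors file downloaded (the filename is a placeholder); that format was designed precisely so a weights file can’t smuggle code the way older pickle-based .bin files could:

```python
# Sketch: open a downloaded weights file and list what's inside -- it's only
# named tensors and their shapes, no executable code. Filename is a placeholder.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    for name in list(f.keys())[:10]:                 # first few tensor names
        print(name, f.get_slice(name).get_shape())
```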
Are you running a quantized model or one of the distilled ones?
Question: if we bridge 2 AIs and let them talk to one another, will they eventually poison each other with gibberish bullshit?