- cross-posted to:
- technology@lemmy.world
Reddit said in a filing to the Securities and Exchange Commission that its users’ posts are “a valuable source of conversation data and knowledge” that has been and will continue to be an important mechanism for training AI and large language models. The filing also states that the company believes “we are in the early stages of monetizing our user base,” and proceeds to say that it will continue to sell users’ content to companies that want to train LLMs and that it will also begin “increased use of artificial intelligence in our advertising solutions.”
The long-awaited S-1 filing reveals much of what Reddit users knew and feared: That many of the changes the company has made over the last year in the leadup to an IPO are focused on exerting control over the site, sanitizing parts of the platform, and monetizing user data.
Posting here because of the privacy implications of all this, but I wonder if at some point there should be an “Enshittification” community :-)
When I go to some reddit posts on Mobile now (like from a Google search, that’s the only way I end up at reddit anymore), it tells me “this content is unmoderated” and gives me a choice to either navigate away or install the Reddit app. Fuck that noise.
Try this, in either Bing/Copilot AI or Google Gemini: Start your prompt with “According to Reddit”, then do your search like you would by using search alone.
The AI of your choice will scrape the posts and give you a nice summary of whatever you were searching for - no need to ever touch Reddit directly.
For me, this works better with Copilot, YMMV.
Example: “According to Reddit, what is the best mechanical keyboard brand to use for touch typing?”
or i can just add “site:reddit.com” to a normal search. meh.
Does that allow you to bypass the “open in app or navigate away” wall?
I never see that because all my devices are set up to redirect to old.reddit.com
Absolutely! What I am suggesting here is: since Reddit is so gung ho on AI, use the AI to bring them to their knees, and have some fun while doing it. 😬
how exactly do you think that would bring reddit to their knees?
Change the URL to old.reddit.com as the domain
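If you'd rather not edit the address by hand every time, the same swap can be scripted. This is only a rough sketch: `to_old_reddit` is a made-up helper name, and the list of hostnames it rewrites is my own assumption about which ones are safe to swap, not anything Reddit documents.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames treated as "regular" Reddit. This set is an assumption for
# illustration; extend or trim it as needed.
REDDIT_HOSTS = {"reddit.com", "www.reddit.com", "new.reddit.com"}


def to_old_reddit(url: str) -> str:
    """Return the same URL with the domain swapped to old.reddit.com.

    Hypothetical helper: URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in REDDIT_HOSTS:
        # Keep scheme, path, query, and fragment; replace only the domain.
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)
```

Calling `to_old_reddit("https://www.reddit.com/r/privacy/?sort=top")` gives back the same path and query on old.reddit.com, and links to other sites pass through untouched.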
Read: that means things are gonna get much worse around here
When Reddit go public we gonna see some serious shit.
Yeah as I have already written the site off, at this point I just kinda wanna see how bad it gets how fast.
“we are in the early stages of monetizing our user base,”
If anyone on Reddit reads that and stays there willingly, they are an idiot. Not that they weren’t idiots for staying after the API changes, but now they are even bigger idiots.
They’ve finally gone full /HailCorporate, become the thing some of the original people of the site would probably not have agreed with in many ways
That is a story as old as time. Greed is strong.
“Early Stages?” You’ve got AI mining your data. The Lions have already come and gone. The hyenas and other scavengers are picking over the scraps, now.
They mean that they haven’t made money on it (yet).
They have probably only provided a small amount of the available data, and have much more data of different types.
Yes we’ve got the data, but now we need it from different angles!
See, most companies would do this before they went public.
Reddit has long had an issue with confidently providing false statements as fact. Sometimes I would come along a question that I was well educated on, and the top voted responses were all very clearly wrong, but sounded correct to someone who didn’t know better. This made me question all the other posts that I had believed without knowing enough to tell otherwise.
LLMs also have the same issue of confidently telling lies that sound true. Training on Reddit will only make this worse.
@Fubarberry yes I saw this a lot too. Highly upvoted confidently incorrect comments, with the real answer or an answer debunking them with links to factual sources less upvoted.
Happened to me as well.
I am a lawyer and I would get down voted for posts explaining the law that contained citations to the actual applicable statute if people didn’t like the statute. Using reddit up votes as a measure of correctness is fundamentally a dumb idea.
@collapse_already yeah Reddit also tended to mistake explanation for agreement and savagely downvote it.
I would come along a question that I was well educated on, and the top voted responses were all very clearly wrong, but sounded correct to someone who didn’t know better.
This can be said of https://news.ycombinator.com/ as well. I wonder how much of this is due to sock puppets and bots.
The problem is that SEO has made it impossible to find accurate information easily, since even “old, trustworthy brands” can’t be trusted online. [This is an excellent article that explains the problem thoroughly, and brings receipts](https://housefresh.com/david-vs-digital-goliaths/).
That’s a really good article, and it does a good job of highlighting the issues with modern day search results.
I’ve been guilty of using “best x” pages before, but if the website with the “best of” page doesn’t have specific reviews linked, I usually look up individual product reviews for the good-sounding items on other websites.
This is a great example of why it’s so important to emphasize teaching critical thinking in school right now. Misinformation and disinformation are just going to continue to grow.
Literally why I bookmarked it. I’m an online teacher, so I’m going to advocate for adding that article to a grade 10 course that’s used by thousands of students each year.
I’m a student teacher right now in elementary! I try to get my kids to think critically whenever I can. I hear kids talk about insane shit they saw/heard on tiktok (I got into an argument with a student who thought Slenderman was 100% real because of something they saw on tiktok) and I try to really get them to think and actually justify why they believe things.
An uphill battle for sure. I wish you the best of luck.
Somewhat related:
A recommendation about teaching controversial topics: you need to build connection first.
I mean, that’s true of all teaching, but when you start to question the (prejudiced) things they’re hearing from trusted adults at home, you really need to have a strong relationship with the students.
Being an anti-racist pro-SOGI educator in conservative communities is hard.
I wish you success in your career! Teachers have such an opportunity to make a huge impact on the world.
Great article, thanks for mentioning it!
Yeah all of my most down voted reddit comments were the ones where I replied about something I’m an actual expert in. Scary stuff
I spent 20 years as a producer, developer, and project manager in the lottery and games industry.
Trying to explain how lottery and games work to people and have them hear me makes me want to cry.
Fascinating! I’d love to hear a little about it, if you don’t mind.
Certainly, I’m always happy to share with inquisitive minds.
Is there any particular question you’d like me to address?
Not really, I never paid much mind to it. I’m curious about the whole industry I guess, or anything you’d like to share or set the record straight about.
Oh there’s lots I have to set the record straight about and there’s lots I could talk about, but without being asked a specific question that would just leave me to write an open-ended essay and I’m not up for it right now
Downvoting was always just fast food validation that you’re better than someone else without having to actually back it up.
Wow. You’re extremely on point. No logical counterarguments but rather several downvotes for a field I’m very familiar with. Downvotes determine the validity of a comment, not their content.
The voting system lets people push comments to the top that they want to be true, not necessarily things that are true.
There’s also the issue of reddit comment sorting being entirely dominated by time. In something like 90% of posts, the top comment is one of the first five. Literally all you have to do is just comment first, and it’ll likely be the top.
This tends to give more influence to people who spend more time on it and write more. And they are less likely to be subject matter experts.
Because it’s like old forums where the first person to comment gets engagement
Some of the better subreddits tried to mix it up and change how this affected upvotes. There was Muxing,…etc etc… But then,… Spez came in (back) and didn’t give af about anything at all except money.
First time I’m hearing about this, can you give any links? Maybe we could use something similar in lemmy
Muxing upvotes, “balances”, etc.
Even hiding all upvotes of every comment thread until ~12 hrs after posting.
I noticed from the beginning that Lemmy’s default comment sorting improves visibility of a variety of comments including newer ones. Gee, I wonder who could have helped make it that way ;)
Over the years I ended up getting a Reddit habit of replying to one of the top comments so that it could attain some visibility. I still do sometimes but less often on Lemmy.
I strongly agree with this comment. To show my appreciation, you have my upvote. Had I only agreed a little bit, I might have not voted at all. If that comment had made me angry, I might have downvoted.
Actually calling these things votes instead of likes makes a lot of sense. I might not like a comment, but I might want it to be higher. I might not hate another comment, but I might want it to be lower because of other reasons.
but sounded correct to someone who didn’t know better
specious /spē′shəs/ adjective
Having the ring of truth or plausibility but actually fallacious. "a specious argument."
and then the real answer will be hidden or something silly, or in some cases where money is involved the correct answer might have been removed
Looks like the enshittification of Reddit is about to accelerate. I barely use it anymore, but I kept my two ten-plus-year-old accounts intact (one for porn, one for legit posts). I’ll probably nuke my non-porn account soon.
You know the phrase “If you aren’t paying, you’re the product”.
It doesn’t hit as hard as a CEO using the phrase “Monetizing Our User Base”. I wonder if they would use the data on all my old accounts that got banned for promoting violence against the billionaire class.
Don’t forget kids – all rights are won through violence.
The forgotten truth nobody wants to remember.
“Pay-Per-Click”, is all this is when you break it down to its basest.
Narwhal developers have come out and said that they have to pay beforehand for clicks to the API. What absolute bullshit Reddit and Spez are bringing to the trough. Spez killed reddit. Calling it now: a slow, painful, lingering, shitty death.
People will not put up with it once they know what is really going on.
Let em know. “Pay-Per-Click” will not stand.
People will not know what is really going on as they do not care. Reddit will continue to exist.
Ah
Yes
I know. Fark and /. and MySpace still exist
Fuck u/spez
You know what the world doesn’t need?
an AI model trained on the old Reddit Hive Mind.
Some AI models already argue when people point out inaccuracies, just like on Reddit.
Guess what data they’re trained on…
Makes me wonder how that technology is going to track. Reddit isn’t bad for finding niche answers to niche questions, but if you import the data wholesale then you’ll have a hard time separating the signal from the noise, even if you sort by using vote counts as relevance.
Reddit is valuable because people can do a search for a niche topic and find the answer on that forum. And the answer was written by a human. It’s not valuable because it can amalgamate an approximation of those answers that might be 90% true and 10% dead wrong.
As someone with expertise in some niche fields:
They’re almost always wrong about everything, and when someone tries to correct them, with sources, they get downvoted.
This is a human thing and not so much a reddit thing. People have been arguing on the internet since the inception of message boards.
I disagree. A reddit bot would be really funny as it would constantly talk about incest and spez
That and the feeling of pride and accomplishment.
It took them how many years to monetize their user base? This company is run by complete idiots.
Given that Spez managed to write himself a $193M cheque, I’d say it’s idiots all the way down.
Is this a long-term source of revenue for Reddit? Or will it lose value at some point, simply because LLMs have all been trained sufficiently on user-generated content? Is there more to learn at some point?
Also, it seems that a lot of content on Reddit is already AI-generated, so it would train on data from other LLMs, which I’m sure doesn’t improve quality.
It’s the reason I can’t see this stock maintaining or improving its price after the IPO. I mean, sure, there will probably be a short term gain for a few stock holders. But, I just don’t see how it doesn’t tank afterwards. I mean, in the end, Reddit is Reddit. It’s just an aggregation site, how can it grow in value? The fediverse is slowly but surely gaining popularity. And even though Reddit calls itself the front page of the internet, it really isn’t.
*Not investment advice. Good god please don’t take investment advice from me. Knowing my luck that fucking stock will soar to Wall Street record highs, beating out Bitcoin by a large margin.
It’s just an aggregation site, how can it grow in value?
Supposedly in Reddit finance there’s something called the “Anarchy Chess/Ewan gambit”. If you post one grain of rice, and double it each time you reach a threshold you can farm near-infinite updoots! Probably works the same with money, idk.
LLMs are a parasitic entity. They can only operate as long as they have a living host (us) on which to draw data. Without their host, they rapidly start hallucinating. Hell, the other day ChatGPT (and every business that relied on it) started hallucinating for no apparent reason.
The thing about the parasite is, though, that it endangers its host. At some point, the fact that anything you say can be plugged into a machine with no credit given back to you, will encourage creative people to stop bothering being creative, depriving them of income or even exposure.
It’s a funny thing, a few years ago I would say that the “anything you post here can be sold by us” clause on social media was very unlikely to get exploited, as nobody knew how to sell data en masse to make money off of it. I guess now we know that’s not true at all. If something bad can happen with your data… It will.
Well, eventually LLMs will need to be fed new misinformation at some point, such as which minority was responsible for their own genocide