
Written by Rosslyn Elliott - Pub. Jul 30, 2024 / Updated Jul 30, 2024
Last week, 404 Media broke the news that Reddit has blocked most major search engines from indexing its recent content.
The single exception is Google, as a result of the company’s agreement earlier this year to pay for Reddit’s content.
Though Google’s payment to Reddit may seem like a logical reason for its continuing access, the privilege will again give Google a massive advantage over competitors.
For a company already facing lawsuits over its monopolistic status, the outcome may invite further government scrutiny in the long run.
Reddit recently updated its robots.txt file, which implements the Robots Exclusion Protocol, the web standard that tells crawlers which parts of a website they may access. Because reputable crawlers honor these rules, the change keeps them from accessing Reddit’s latest posts and comments, affecting a wide range of popular search engines.
Google apparently is using an authorized manual override to avoid the block.
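To see how a robots.txt rule plays out in practice, here is a minimal Python sketch using the standard library’s urllib.robotparser module. The rules shown are an illustrative assumption (a blanket block on every crawler), not a copy of Reddit’s actual file, and the bot names are made up. It models only the robots.txt check, not any separate arrangement between a site and a search engine.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only (an assumption, not a copy of Reddit's file):
# a blanket rule asking every crawler to stay out of the whole site.
ILLUSTRATIVE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical crawler names, used only for demonstration.
    for agent in ("ExampleSearchBot", "ExampleAIBot"):
        allowed = is_allowed(
            ILLUSTRATIVE_ROBOTS_TXT, agent, "https://www.reddit.com/r/technology/"
        )
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A well-behaved crawler runs a check like this before each request. The file itself cannot enforce anything, which is why compliance depends on the crawler’s operator.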
The following search engines have been affected by Reddit’s new policy:
Bing
DuckDuckGo
Mojeek
Qwant
Baidu
Yandex
The search engine Kagi still has access to new Reddit data because of its previous agreement to purchase content from Google.

Users may find Reddit blocked
Google’s continued access to Reddit’s content stems from a $60 million deal struck earlier this year. This agreement allows Google to use Reddit’s data for AI training purposes, setting it apart from other search engines. Compensation for Reddit’s content became a flashpoint in 2023, when Reddit began charging for access to its API and thousands of subreddits went dark in protest of the changes.
While users and other search engine companies were quick to claim that the lockout occurred because of the Google deal, Reddit denies that connection. “We block all crawlers that are unwilling to commit to not using crawl data for AI training, which is in line with enforcing our Public Content Policy and updated robots.txt file,” a company spokesperson told Engadget.
Users can easily check whether their preferred search engine is affected by entering “site:reddit.com” in the search box, then limiting the results to a recent date range or sorting by most recent.
For blocked search engines, users will notice:
· No results from the past week
· Empty search result pages
· Outdated content (several years old)
· Messages stating the site won’t allow descriptions
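For readers who would rather script the site:reddit.com check described above, the following Python sketch builds the query URL. The search.example address and the q parameter are placeholders, not any real search engine’s API. Date filtering differs from engine to engine, so it is left out here.

```python
from urllib.parse import urlencode

# Placeholder endpoint (an assumption): swap in your search engine's URL.
SEARCH_ENDPOINT = "https://search.example/search"


def build_site_query(terms: str, site: str = "reddit.com") -> str:
    """Return a search URL restricted to one site via the site: operator."""
    query = f"site:{site} {terms}".strip()
    params = urlencode({"q": query})  # percent-encodes ':' and turns spaces into '+'
    return f"{SEARCH_ENDPOINT}?{params}"


if __name__ == "__main__":
    print(build_site_query("slow wifi fix"))
```

Opening the resulting URL in a browser and sorting by date should show whether the engine still returns recent Reddit posts.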
For many internet users, adding “Reddit” to search queries has become a common way to find human-generated answers on topics ranging from tech support to personal advice. With the flood of AI-generated content online, Reddit provides a valuable source of human input.
With this change, users who are looking for recent Reddit content will be limited to Google or search engines that pull from Google’s index.

AI scraping causes conflict
Web scraping is the automated process of extracting data from websites. Reddit has a no-scraping policy that forbids companies from scraping its data without compensation.
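As a rough illustration of what “extracting data” means, the Python sketch below pulls post titles out of a small, made-up HTML fragment using the standard library’s html.parser module. The sample markup and the class name “title” are assumptions for demonstration. Real scrapers also download pages over the network, which this example deliberately omits.

```python
from html.parser import HTMLParser

# Made-up page fragment standing in for a downloaded web page (an assumption).
SAMPLE_HTML = """
<html><body>
  <h2 class="title">How do I fix slow Wi-Fi?</h2>
  <p>Try moving the router.</p>
  <h2 class="title">Best budget modem?</h2>
</body></html>
"""


class TitleExtractor(HTMLParser):
    """Collects the text inside every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        # Keep only non-empty text that appears inside a title heading.
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def extract_titles(html: str) -> list[str]:
    """Return the title headings found in an HTML string."""
    parser = TitleExtractor()
    parser.feed(html)
    return parser.titles


if __name__ == "__main__":
    print(extract_titles(SAMPLE_HTML))
```

Run across millions of pages, the same idea yields the large text collections that search engines index and AI companies use for training.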
AI companies are just one type of organization that tries to scrape the web. Others include:
1. Search engines to index content
2. Researchers gathering data
3. Businesses monitoring competitors
AI companies scrape data to train models that generate new material from it. This use has created unprecedented controversy.
The aggressive scraping by AI companies has also caused concern for individuals, as more people wish to prevent their data from being used by AI. General anxiety about exposed personal information in the cloud is increasing following recent major data breaches.
The use of online data for AI training has become a source of public debate and lawsuits due to several issues:
1. Copyright concerns: Many content creators argue that using their work to train AI models without permission or compensation infringes on their intellectual property rights.
2. Privacy issues: Privacy advocates have concerns about personal information being included in training data without consent.
3. Bias and representation: The data used to train AI can perpetuate or amplify existing biases in online content.
4. Economic impact: As AI models become more sophisticated, there are fears they could replace human workers, especially in customer service, retail, bookkeeping, and banking.
Reddit spokesperson Tim Rathschmidt told The Verge:
“We have been in discussions with multiple search engines. We have been unable to reach agreements with all of them, since some are unable or unwilling to make enforceable promises regarding their use of Reddit content, including their use for AI.”
This statement suggests that Reddit’s primary concern is not just about compensation, but also about controlling how its content is used, particularly in AI applications.

Who owns web content?
Reddit’s move aligns with a growing trend of content creators and platforms seeking compensation for the use of their data in AI training.
Many web publishers feel that their survival depends on not allowing AI to take their content without payment.
Brent Csutoras, founder of Search Engine Journal, commented on the battle between AI companies and content platforms. “Publications, artists, and entertainers have been suing OpenAI and other AI companies, blocking AI companies, and fighting to avoid using public content for AI training,” Csutoras said in a LinkedIn post.
If Google is the only major search engine able to index recent Reddit content, other search engines will face a new obstacle to competing with it.
Google’s market dominance has long threatened user choice. The fact that an entire industry (SEO marketing) depends on Google’s algorithms shows the lack of a truly competitive landscape in the search engine market.
Reddit’s decision raises questions about the future of the open web. As more platforms restrict access to their content, it could lead to a more fragmented internet where information is siloed within specific ecosystems.
As the situation unfolds, several questions remain:
1. Will other search engines eventually strike deals similar to Google’s?
2. How will this standoff affect Reddit’s valuation following its March 2024 IPO?
3. Could this lead to more antitrust lawsuits over Google’s growing influence?
4. Will other major websites follow Reddit’s lead in restricting access?

Creators and publishing platforms
Reddit’s actions may set a precedent for other major websites and platforms. As the value of data continues to rise, we may see more content providers implementing similar restrictions on web crawlers and AI training data access.
Other possible developments include:
· Legal and regulatory changes: Governments might step in to regulate the use of online data for AI training.
· New business models: We might see the emergence of new monetization strategies for online content.
· Technological adaptations: Search engines and AI companies may develop new ways to access and use online data ethically.
· User behavior shifts: Internet users may change how they search for and consume online content.

The law must control technology
Web scraping is the automated process of extracting data from websites. It’s used by search engines, researchers, businesses, and AI companies to gather online information.
AI companies may use copyrighted content to train their models without permission or compensation. This can infringe on creators’ intellectual property rights.
Bing, DuckDuckGo, Mojeek, Qwant, Baidu, and Yandex are blocked from accessing new Reddit content. Google is the main exception due to a payment agreement.
A robots.txt file uses a standard web protocol to tell search engines which parts of a website they can crawl and index. Reddit updated its file to block most search engines from recent content.
Google agreed to pay Reddit $60 million for access to its content. This deal allows Google to use Reddit’s data for AI training purposes.