Ztoog
    Technology

    Reddit stands firm against AI companies scraping content for training without paying


    A hot potato: Reddit has been making moves as part of a crackdown on companies indiscriminately scraping the website for AI training purposes. Its philosophy is that AI companies stand to make millions or billions on large language models they’re developing with resources they don’t own. It’s analogous to someone taking two-by-fours from a lumberyard to build their house just because the yard doesn’t have a locked gate. But the issue goes way beyond Reddit and is central to how the open web has worked so far.

    The Robots Exclusion Protocol is a web standard used to control and manage web crawler and bot access to websites. Defined by the robots.txt file, it tells search engines which parts of a site can be crawled or indexed, helping webmasters protect sensitive content and manage traffic efficiently. However, it works on the honor system, with few ways to enforce it.
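
    To make the honor system concrete, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard-library urllib.robotparser. The robots.txt contents, the "ExampleBot" user agent, and the example.com URLs are all hypothetical. Nothing here stops a crawler that simply skips the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site (not taken from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a made-up crawler name used only for illustration.
    for url in (
        "https://example.com/public/page",
        "https://example.com/private/page",
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "ExampleBot", url) else "disallowed"
        print(f"{url}: {verdict}")
```

    The parser only answers the question; it is up to the crawler to act on the answer, which is why compliance is voluntary.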

    Last week, Ars Technica reported that Reddit posts weren’t showing up in any search engines except for Google. It’s no big mystery why: Reddit already penned a $60 million licensing deal with Alphabet to use its content for training. Meanwhile, Reddit has been increasingly ranking at the top of Google searches this past year (quid pro quo, or maybe not…).

    The company also recently notified users that it changed its robots.txt file to exclude bots and crawlers that did not have permission to access its data. Reddit CEO Steve Huffman said he believes in an open internet but that companies now use search engine web crawlers to scrape information for profit, a far cry from their historic use. “I believe the traditional value exchange from search engines has changed,” Huffman told The Verge.

    “Search and summarization and training are merging, and the value exchange of crawling in exchange for traffic back is becoming muddied.”

    To this point, Huffman said that blocking companies unwilling to pay for data harvesting has been “a real pain in the ass,” prompting the changes to Reddit’s robots.txt. For the most part, companies have respected Reddit’s wishes, and several, including Microsoft, Anthropic, and Perplexity, have entered negotiations to license its content.

    Huffman said that the biggest thorn in his side is that some companies scraping Reddit data are turning around and selling it to other AI firms through their APIs. He specifically called out Microsoft AI CEO Mustafa Suleyman for recently comparing all public data on the internet to “freeware.”

    “We’ve had Microsoft, Anthropic, and Perplexity act as though all of the content on the internet is free for them to use,” said Huffman. “That’s their real position.” While Microsoft Bing has been gracious in respecting Reddit’s decision to block its crawlers, the company managed to slip in a denigrating remark.

    Microsoft AI CEO Mustafa Suleyman: the social contract for content that’s on the open web is that it’s “freeware” for training AI models pic.twitter.com/FN1xrqnJC0

    – Tsarathustra (@tsarnick) June 26, 2024

    “Reddit has blocked Bing from crawling their site for search, favoring another search engine and impacting competition from Bing and Bing-powered engines,” Microsoft spokesperson Caitlin Roulston said last week. “We honor the directions provided by websites that do not want content on their pages to be used with our generative AI models.”

    So far, Google and OpenAI are the only search engines on Reddit’s whitelist. If other engines return anything but old Reddit content, then they are not abiding by the website’s robots.txt file.
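
    A whitelist of this kind is typically expressed as a robots.txt that allows named crawlers and disallows everyone else. The sketch below shows the general pattern only; it is not Reddit's actual file, and "PartnerBot", "OtherBot", and the example.com URL are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist-style robots.txt: one named crawler is permitted,
# every other user agent is shut out of the whole site.
ALLOWLIST_ROBOTS_TXT = """\
User-agent: PartnerBot
Allow: /

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> list[str]:
    """Return the subset of agents that the robots.txt text permits to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if parser.can_fetch(agent, url)]


if __name__ == "__main__":
    crawlers = ["PartnerBot", "OtherBot"]  # made-up crawler names
    print(allowed_agents(ALLOWLIST_ROBOTS_TXT, crawlers, "https://example.com/r/technology/"))
```

    Under these rules only the named crawler passes the check, so fresh results from any other engine would suggest its crawler is ignoring the file.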

    Reddit profiting from user-generated content through these licensing deals is still a hot potato. On the one hand, the lucrative fees do not go into the pockets of the community that makes up Reddit’s forums. On the other hand, these licensing deals are not much different from those of other companies.

    OpenAI already pays licensing fees to large publishers like Dotdash Meredith, Axel Springer, the Associated Press, and The Atlantic. It is unconfirmed but doubtful that these publications pass these earnings on to their writers through raises or bonuses. Does that make it right? No, and the courts are still trying to figure out this unprecedented activity. However, it is par for the course at this point.

    And this very issue is not limited to Reddit but affects all online publishers, big and small. In the race against AI training abuse, Reddit is one of the few with the muscle and influence to call out AI companies. While big media companies try to monetize and reach agreements, the rest of the internet is struggling. In fact, some subreddits have their own bots that copy and paste entire written content from original sources and display it as the first comment in the thread, effectively copying the content and then selling that to AI companies.

    Until there are governing regulations, the AI gold rush will be just like the California gold rush of 1848. Artificial intelligence firms will continue flocking to shovel AI products down everyone’s throats for profit or to gather more data. Meanwhile, companies like Reddit and Vox will keep handing them the shovels.

    Image credit: Jernej Furman

    © 2025 Ztoog.
