Ztoog
    Technology

    Reddit stands firm against AI companies scraping content for training without paying


    A hot potato: Reddit has been making moves as part of a crackdown on companies indiscriminately scraping the website for AI training purposes. Its philosophy is that AI companies stand to make millions or billions on large language models they are developing with resources they don’t own. It’s analogous to someone taking two-by-fours from a lumberyard to build their house just because the yard doesn’t have a locked gate. But the issue goes way beyond Reddit and is central to how the open web has worked so far.

    The Robots Exclusion Protocol is a web standard used to control and manage web crawler and bot access to websites. Expressed through a site’s robots.txt file, it tells search engines which parts of a site can be crawled or indexed, helping webmasters protect sensitive content and manage traffic efficiently. However, it works on the honor system, with few ways to enforce it.
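
    As a rough illustration of that honor system, the sketch below shows how a well-behaved crawler might consult robots.txt rules before fetching a page, using Python’s standard urllib.robotparser module. The rules, the “ExampleBot” user agent, and the example.com URLs are all made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site; not taken from any real website.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /search
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/blog/post-1",
                "https://example.com/private/report"):
        print(url, may_crawl(EXAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

    Nothing in the file stops a crawler that simply skips this check, which is why compliance depends entirely on the crawler’s operator.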

    Last week, Ars Technica reported that Reddit posts weren’t showing up in any search engines except Google. It’s no big mystery that Reddit already penned a $60 million licensing deal with Alphabet to use its content for training. Meanwhile, Reddit has been increasingly ranking at the top of Google searches this past year (quid pro quo, or maybe not…).

    The company also recently notified users that it changed its robots.txt file to exclude bots and crawlers that didn’t have permission to access its data. Reddit CEO Steve Huffman said he believes in an open internet but that companies now use search engine web crawlers to scrape information for profit, a far cry from their historical use. “I believe the normal value exchange from search engines has changed,” Huffman told The Verge.

    “Search and summarization and training are merging, and the value exchange of crawling in exchange for traffic back is becoming muddied.”

    To this point, Huffman said that blocking companies unwilling to pay for data harvesting has been “a real pain in the ass,” prompting the changes to Reddit’s robots.txt. For the most part, companies have respected Reddit’s wishes, and several, including Microsoft, Anthropic, and Perplexity, have entered negotiations to license its content.

    Huffman said that the biggest thorn in his side is that some companies scraping Reddit data are turning around and selling it to other AI firms via their APIs. He specifically called out Microsoft AI CEO Mustafa Suleyman for recently comparing all public data on the internet to “freeware.”

    “We’ve had Microsoft, Anthropic, and Perplexity act as if all of the content on the internet is free for them to use,” said Huffman. “That’s their real position.” While Microsoft Bing has been gracious in respecting Reddit’s decision to block its crawlers, the company managed to slip in a denigrating remark.

    Microsoft AI CEO Mustafa Suleyman: the social contract for content that’s on the open web is that it is “freeware” for training AI models pic.twitter.com/FN1xrqnJC0

    – Tsarathustra (@tsarnick) June 26, 2024

    “Reddit has blocked Bing from crawling their site for search, favoring another search engine and impacting competition from Bing and Bing-powered engines,” Microsoft spokesperson Caitlin Roulston said last week. “We honor the directions provided by websites that do not want content on their pages to be used with our generative AI models.”

    So far, Google and OpenAI are the only search engines on Reddit’s whitelist. If other engines return anything but old Reddit content, then they are not abiding by the website’s robots.txt file.
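
    A whitelist of this kind can be written in robots.txt by allowing named crawlers and disallowing everyone else. The sketch below is only an assumption about what such rules could look like, not Reddit’s actual file: “LicensedBot” and “OtherBot” are invented crawler names and the URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist-style rules; "LicensedBot" is a made-up crawler name.
ALLOWLIST_ROBOTS_TXT = """\
User-agent: LicensedBot
Allow: /

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Return the subset of agents that the rules permit to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if parser.can_fetch(agent, url)]


if __name__ == "__main__":
    url = "https://example.com/r/some-thread"
    print(allowed_agents(ALLOWLIST_ROBOTS_TXT, ["LicensedBot", "OtherBot"], url))
```

    Under these assumed rules, only the named crawler passes the check; any other crawler that indexes new pages anyway is ignoring the file.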

    Reddit profiting from user-generated content through these licensing deals is still a hot potato. On the one hand, the lucrative fees don’t go into the pockets of the community members who make up Reddit’s forums. On the other hand, these licensing deals are not much different from those of other companies.

    OpenAI already pays licensing fees to large publishers like Dotdash Meredith, Axel Springer, the Associated Press, and The Atlantic. It is unconfirmed but doubtful that these publications pass these profits on to their writers by way of raises or bonuses. Does that make it right? No, and the courts are still trying to figure out this unprecedented activity. However, it’s par for the course at this point.

    And this very issue is not limited to Reddit but affects all online publishers, big and small. In the race against AI training abuse, Reddit is one of the few with the muscle and influence to call out AI companies. While big media companies try to monetize and reach agreements, the rest of the internet is struggling. In fact, some subreddits have their own bots that copy and paste entire written content from original sources and display it as the first comment in the thread, effectively copying the content and then selling that to AI companies.

    Until there are governing regulations, the AI gold rush will be just like the California gold rush of 1848. Artificial intelligence firms will continue flocking to shovel AI products down everyone’s throats for profit or to gather more data. Meanwhile, companies like Reddit and Vox will keep handing them the shovels.

    Image credit: Jernej Furman
