• ℍ𝕂-𝟞𝟝@sopuli.xyz
    7 days ago

    AI does not triple traffic. It’s a completely irrational statement to make.

    Multiple testimonials from people who host sites say it does. Multiple Lemmy instances have also supported this claim.

    I would bet that the number of requests per year of s resource by an AI scrapper is on the dozens at most.

    You obviously don’t know much about hosting a public server. Try dozens per second.

    There is a booming startup industry all over the world training AI, and scraping data to sell to companies training AI. It’s not just Microsoft, Facebook and Twitter doing it, but also Chinese companies trying to compete, as well as companies that are not developing public models but models for internal use. They all use public cloud IPs, so the traffic comes from all over, incessantly.

    Using as much energy as a available per scrapping doesn’t even make physical sense. What does that sentence even mean?

    It means that when Microsoft buys a server for scraping, they are going to run it 24/7 with the CPU and network maxed out, at maximum power use, to get as much data as they can. If the server can scrape 100 sites per minute, it will scrape 100 sites. If it can scrape 1000, it will scrape 1000, and if it can do 10, it will do 10.

    It will never stop scraping, as stopping would be the equivalent of shutting down a production line. Everyone always uses their scrapers as much as they can. Ironically, increasing the cost of scraping would result in less energy consumed in total, since it would force companies to work more “smart” and less “hard” at scraping and training AI.

    Oh, and it’s S-C-R-A-P-I-N-G, not scrapping. It comes from the word “scrape”, meaning to remove the surface from an object using a sharp instrument, not “scrap”, which means to take something apart for its components.

    • daniskarma@lemmy.dbzer0.com
      7 days ago

      I’m not a native English speaker, so I apologize if there’s bad English in my response, and I would be thankful for any corrections.

      That being said, I do host public services, and did so both before and after AI was a thing. I have asked many of the people who claim “we are under AI bot attacks” how they are able to tell whether a request comes from an AI scraper or from any other scraper, and there was no satisfying answer.
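
To illustrate why that question is hard to answer, here is a minimal sketch of the most common approach: bucketing requests by their User-Agent header. The token list, the function names `classify` and `tally`, and the sample strings are assumptions made up for this example, not anything taken from a real server. The User-Agent is self-declared and trivially spoofed, so this can only count crawlers that choose to identify themselves.

```python
from collections import Counter

# Assumption: a small, incomplete list of substrings that some AI crawlers
# announce in their User-Agent header. A real list would need maintenance.
AI_AGENT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider")


def classify(user_agent: str) -> str:
    """Bucket a request by what its User-Agent header claims to be."""
    ua = user_agent.lower()
    if any(token.lower() in ua for token in AI_AGENT_TOKENS):
        return "self-declared AI crawler"
    if "bot" in ua or "spider" in ua or "crawl" in ua:
        return "other self-declared bot"
    # Browsers and scrapers that spoof a browser string both land here,
    # which is exactly the ambiguity being discussed.
    return "unknown"


def tally(user_agents):
    """Count requests per bucket for an iterable of User-Agent strings."""
    return Counter(classify(ua) for ua in user_agents)


# Hypothetical sample headers, for illustration only.
sample = [
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
]
print(tally(sample))
```

Anything in the "unknown" bucket could be a person or a scraper pretending to be one, so a tally like this gives a lower bound on declared crawlers and nothing more.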