• Melvin_Ferd@lemmy.world
      English
      arrow-up
      0
      ·
      edit-2
      1 month ago

      No it won’t. Media already laid the groundwork for people to hate on AI. Now they will keep the focus on areas where, when you read about them, we all arrive at the same common-sense legislative solution. Then will come a bill to strip us of more things that made the internet awesome, and we will cheer. Web scraping and data sharing can fuck off. Pirates sent to North Korean prison camps. Sharing accounts with family, you’re flagged for an audit. Nintendo modders, more like criminals.

  • dinckel@lemmy.world
    English
    arrow-up
    1
    ·
    1 month ago

    It’s illegal when a regular person steals something, but it’s innovation and courage when a huge corporation steals something. Interesting how that works.

    • bean@lemmy.world
      English
      arrow-up
      0
      ·
      1 month ago

      Honestly it’s fucking angering. So much regulation and geo-restrictions and licensing schemes… but it’s cool that there are data brokers, and shit like this. On top of it all, Chrome is screwing us with Manifest V3 and killing ad blocking on Chrome. It’s already in the Canary build.

      WHAT THE FUCK IS WRONG WITH THIS SPECIES?!

      • Mojave@lemmy.world
        English
        arrow-up
        1
        ·
        1 month ago

        They take data, network bandwidth, and CPU/processing time from essentially every website in the world, and when you’re paying for cloud power to run your website, the cost of webscrapers running a train on your digital asshole adds up QUICK.

        It’s why normal human being people get sued to shit for webscraping data from certain companies who care. But companies don’t get sued because go fuck yourself. Kill bytedance.

    • Grimy@lemmy.world
      English
      arrow-up
      0
      arrow-down
      1
      ·
      1 month ago

      Any regular person can scrape and use public data for AI use, it’s not illegal for companies or individuals and it shouldn’t be.

    • Guy Dudeman@lemmy.world
      English
      arrow-up
      0
      arrow-down
      1
      ·
      1 month ago

      Google’s mission statement was originally something about controlling the world’s data. If Google has competition, that might be a good thing?

  • zod000@lemmy.ml
    English
    arrow-up
    1
    ·
    1 month ago

    We’ve had this thing hammering our servers. The scraper uses randomized user-agent browser/OS combinations and comes from a number of distinct IP ranges in different datacenters around the world, but all the IPs track back to Bytedance.
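
Since the user-agent is randomized, matching on source IP range is one way a server operator might filter traffic like this. Below is a minimal sketch using Python's standard `ipaddress` module. The CIDR blocks are reserved documentation ranges used as placeholders, not real Bytedance ranges, and `BLOCKED_RANGES` and `is_blocked` are hypothetical names chosen for illustration.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges, NOT real scraper ranges).
# An operator would substitute the ranges observed in their own logs.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked range."""
    addr = ipaddress.ip_address(ip)
    # Membership is False when the address and network IP versions differ.
    return any(addr in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for candidate in ("192.0.2.55", "203.0.113.9"):
        print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

In practice this kind of check usually lives in the firewall or reverse proxy rather than in application code, but the matching logic is the same.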

    • UnderpantsWeevil@lemmy.world
      English
      arrow-up
      0
      ·
      1 month ago

      Wouldn’t be surprised if they’re just cashing out while TikTok is still public in the US. One last desperate grab at value-add for the parent company before the shutdown.

      Also a great way to burn the infrastructure for subsequent use. After this, you can guarantee every data security company is going to add the TikTok servers to their firewalls and blacklists. So the American company that tries to harvest the property is going to be tripping over these legacy bulwarks for years after.

      • Maggoty@lemmy.world
        English
        arrow-up
        1
        ·
        1 month ago

        This has nothing to do with TikTok other than ByteDance being a shareholder in TikTok.

  • GnuLinuxDude@lemmy.ml
    English
    arrow-up
    1
    ·
    edit-2
    1 month ago

    As for what ByteDance plans to do with a new LLM, a person familiar with the company’s ambitions said one goal has to do with the search function for TikTok.

    Last week, TikTok released an update to its current search function focused on [keywords for ads], basically allowing advertisers to search in real time for words that are trending on TikTok. It allows marketers to build an ad with relevant keywords that would ostensibly help the ad show up on the screens of more users.

    “Given the audience and the amount of use, TikTok with a search environment that is a completely biddable space with keywords and topics, that would be very interesting to a lot of people spending a ton of money with Google right now,” the person said.

    A dark vision just flashed in my mind. And I am certain this is what will happen. AI-generated ads done in real time based on the latest “trending” thing. Presented to users basically as soon as the topic has the slightest amount of “trend”.

    Just emitting untold amounts of CO2 to show you generated ads in near real time.

  • BlackEco@lemmy.blackeco.com
    English
    arrow-up
    1
    ·
    1 month ago

    Also it doesn’t respect robots.txt (the file that tells bots whether or not a given page can be accessed), unlike most AI scraping bots.
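
For context, this is roughly what respecting robots.txt looks like on the crawler side. The sketch below uses Python's standard `urllib.robotparser`. The rules in `ROBOTS_TXT`, the `ExampleBot` user-agent, and the `example.com` URLs are made up for illustration and do not describe any real site or bot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ExampleBot is kept out of /private/,
# every other bot is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow:
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_fetch("ExampleBot", "https://example.com/private/page"))
    print(can_fetch("ExampleBot", "https://example.com/public"))
```

A well-behaved crawler runs a check like this before every request. Compliance is voluntary, so a bot that skips the check can only be stopped server-side.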

  • jagged_circle@feddit.nl
    English
    arrow-up
    0
    arrow-down
    1
    ·
    edit-2
    1 month ago

    This is fine. I support archiving the Internet.

    It kinda drives me crazy how normalized anti-scraping rhetoric is. There is nothing wrong with (rate-limited) scraping.

    The only bots we need to worry about are the ones that POST, not the ones that GET.

    • purrtastic@lemmy.nz
      English
      arrow-up
      1
      ·
      1 month ago

      It’s not fine. They are not archiving the internet.

      I had to ban their user agent after very aggressive scraping that would have taken down our servers. Fuck this shitty behaviour.

    • WhyJiffie@sh.itjust.works
      English
      arrow-up
      1
      ·
      1 month ago

      This is neither archiving nor rate-limited, if the AI training purpose and scraping 25 times faster than a large company did not make that obvious.

    • Echo Dot@feddit.uk
      English
      arrow-up
      0
      arrow-down
      1
      ·
      1 month ago

      People like to act as if archiving had never been a thing until about a year ago, at which point it was suddenly invented and is now a threat in some nebulous way.

      • hamsterkill@lemmy.sdf.org
        English
        arrow-up
        1
        ·
        1 month ago

        It’s not that it’s a threat, it’s that there’s a difference between archiving for preservation and crawling other people’s content for the purpose of making money off it (in a way that does not benefit the content creator).

        • archomrade [he/him]@midwest.social
          English
          arrow-up
          0
          arrow-down
          1
          ·
          1 month ago

          crawling other people’s content for the purpose of making money off it (in a way that does not benefit the content creator).

          You’re describing capitalism there, bud.

      • finitebanjo@lemmy.world
        English
        arrow-up
        1
        ·
        1 month ago

        If a foreign Dictatorship’s military op wants to know every facet of your life, then you can be damn sure it’s a threat.