• bbuez@lemmy.world
    5 months ago

    The fallout of image generation will be even more incredible imo. Even if models do become more capable, post-'21 training data will become increasingly polluted, and generated content will get harder to distinguish from real content as models improve their output, which inevitably leads to model collapse. At least until we have a standardized way of flagging generated images as opposed to real ones, but I don’t really like that future.
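A toy way to picture the collapse idea, not the setup from any particular paper: treat a "model" as nothing more than a Gaussian fitted to its training data, then train each new generation only on samples drawn from the previous one. The function name `simulate_collapse` and all parameter values here are made up for illustration, and real generative models are far more complex than this.

```python
import random
import statistics


def simulate_collapse(generations=500, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation at each generation, starting
    with the "real" data distribution at index 0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: stands in for real, human-made data
    history = [std]
    for _ in range(generations):
        # Each generation only ever sees output from the previous "model".
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.stdev(samples)
        history.append(std)
    return history


if __name__ == "__main__":
    h = simulate_collapse()
    print(f"std at generation 0: {h[0]:.3f}")
    print(f"std at generation {len(h) - 1}: {h[-1]:.3e}")
```

With a small sample per generation, estimation error compounds and the fitted spread tends to shrink toward zero, so the tails of the original distribution are the first thing lost. Mixing some original data back in at every generation is the obvious variation to try.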

    Just on a tangent, OpenAI claiming video models will help “AGI” understand the world around it is laughable to me. 3Blue1Brown released a very informative video on how text transformers work, and in principle all “AI” is at the moment is very clever statistics and lots of matrix multiplication. How our minds process and retain information is far more complicated, as we don’t fully understand ourselves yet and we are a grand leap away from ever emulating a true mind.
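To make the "lots of matrix multiplication" point concrete, here is a minimal sketch of scaled dot-product attention, the core operation in a transformer layer, in plain Python. It assumes a single head with no learned projections, masking, or batching, and the helper names (`matmul`, `transpose`, `softmax`, `attention`) are just for this example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Turn a row of scores into weights that sum to 1."""
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Return (weights, output) for softmax(q @ k^T / sqrt(d)) @ v."""
    d = len(q[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return weights, matmul(weights, v)


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0]]  # two toy "token" vectors
    weights, output = attention(tokens, tokens, tokens)
    print(weights)
    print(output)
```

Each output row is a weighted average of the value rows, with the weights coming from how similar a query is to each key. A real model stacks many of these with learned weight matrices in between.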

    All that to say, I can’t wait for people to realize: oh hey, that is just Silicon Valley trying to replace talent in film production.

    • Scrubbles@poptalk.scrubbles.tech
      English
      5 months ago

      Yeah, I read one of the papers that talked about this. Essentially, putting AI-generated data into a training set will pollute it and cause the model to just fall apart. LLMs especially are going to be a ton of fun, as there were absolutely no rules about what to do with them, and bots and spammers immediately used them everywhere on the internet. And the only solution is to… write a model to detect it. Then they’ll make models that bypass that, and there will just be no way to keep the dataset clean.

      The hype of AI is warranted - but also way overblown. Hype from actual developers and seeing what it can do when it’s tasked with doing something appropriate? Blown away. Just honestly blown away. However, hearing what businesses want to do with it, the crazy shit like “We’ll fire everyone and just let AI do it!” Impossible. At least with the current generation of models. Those people remind me of the crypto bros saying it’s going to revolutionize everything. It might, but you need to actually understand the tech and its limitations first.

      • bbuez@lemmy.world
        5 months ago

        Building my own training set is something I would certainly want to do eventually. I’ve been messing with Mistral Instruct using GPT4All and it’s genuinely impressive how quickly my 2060 can hallucinate relatively accurate information, but the limitations are also evident. E.g. if I tell it I do not want to use AWS or another cloud hosting service, it will just return a list of suggested services not including AWS. Most certainly a limit of its training data, but still impressive.

        Anyone suggesting using LLMs to manage people or resources is better off flipping a coin on every thought. More than likely, companies that insist on it will go belly up soon enough.

    • skeptomatic@lemmy.ca
      5 months ago

      AIs can be trained to detect AI-generated images, so the race is only whether AI-produced images get better faster than the detectors can keep up.
      More likely, as the technology evolves, AIs will, like a human, just train real-time-ish on video taken from their camera eyeballs.
      …and then, of course, it will KILL ALL HUMANS.

    • MalReynolds@slrpnk.net
      English
      5 months ago

      I see this a lot, but do you really think the big players haven’t backed up the pre-'22 datasets? Also, synthetic (LLM-generated) data is routinely used in fine-tuning to good effect. It’s likely that architectures exist that can happily do primary training on synthetic data as well.