• zqwzzle@lemmy.ca
    9 months ago

    It’s like it’s taking the phrase “Elephant in the room” literally.

    • OpenStars@startrek.website
      9 months ago

      Yeah, I’m going to bring up the elephant in the room here: there is literally an elephant in all of your rooms!:-P

      • RememberTheApollo_@lemmy.world
        9 months ago

        I think most of us understand that, and this exercise is the realization of that issue. These AIs do have “negative” prompts, so if you asked one to draw a room and it kept giving you elephants in the room, you could add “-elephants”, or whatever the “no” format is for the particular AI, and hope that it can overrule whatever reference it is using to generate elephants in the room. It’s not always successful.

        • fidodo@lemmy.world
          9 months ago

          I think the main point here is that image generation AI doesn’t understand language; it gives weight to pixels based on tags, and yes, you can give negative weights too. That’s more evident if you ask it to do anything positional or logical, which it isn’t designed to understand.

          LLMs are, though, so you could combine the tools: the LLM could command the image generator and even create a seed image to apply positional logic. I was surprised to find that asking ChatGPT to generate a room without elephants via DALL-E also failed. I would have expected it to convert the user query to tags rather than feed it in raw.
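
To make the "convert the query to tags" idea concrete, here is a toy sketch. It is only a rule-based stand-in for what an LLM would do, and nothing here reflects how ChatGPT or DALL-E actually work; `split_prompt` and the `NEGATION` pattern are hypothetical names made up for illustration. It just moves negated nouns out of the request and into a separate negative prompt.

```python
import re

# Hypothetical helper: a crude regex stand-in for the LLM step, for illustration only.
# Matches "without X", "with no X", or "no X", optionally followed by "any".
NEGATION = re.compile(r"\b(?:without|with no|no)\s+(?:any\s+)?(\w+)", re.IGNORECASE)


def split_prompt(request: str) -> tuple[str, str]:
    """Return (positive_prompt, negative_prompt) for a plain-English request."""
    # Collect every negated word, lowercased, in order of appearance.
    negatives = [m.group(1).lower() for m in NEGATION.finditer(request)]
    # Remove the negated phrases from the text that becomes the positive prompt.
    positive = NEGATION.sub("", request)
    positive = re.sub(r"\s+", " ", positive).strip(" ,.")
    # dict.fromkeys drops duplicates while keeping the original order.
    return positive, ", ".join(dict.fromkeys(negatives))


print(split_prompt("a room without elephants"))
```

The point of the split is that the image generator never sees the word "elephants" in the positive prompt at all, so there is nothing for it to latch onto. A real LLM would handle phrasing that this regex misses.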

    • DoomBot5@lemmy.world
      9 months ago

      No elephants in the room, but there is an elephant. I expect nothing less than this level of pedantry from a bot.

      • DefederateLemmyMl@feddit.nl
        9 months ago

        elephant_generator.sh

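        #!/bin/bash
        # With elephantCount=0, the <= comparison still runs the loop body once,
        # so "Insert elephant 0" is printed anyway.
        elephantCount=0
        for (( i=0; i<=${elephantCount}; i++ )); do
            echo "Insert elephant ${i}"
        done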
        
        • DoomBot5@lemmy.world
          9 months ago

          That’s all fine, but the issue comes from the natural language processing layer. “elephants == elephant > 1” so elephant = 1 is still valid.

  • Flying Squid@lemmy.world
    9 months ago

    I’m on a forum where we have a thread whose primary purpose has become putting Godzilla in silly situations and doing silly things with him.

    A couple of months ago, we all spent a couple of days trying to get DALL-E to draw Godzilla without teeth. Nothing we tried ever worked.

  • TheKingBee@lemmy.world
    9 months ago

    “can you draw a room with absolutely no elephants in it? not a picture not in the background, none, no elephants at all. seriously, no elephants anywhere in the room. Just a room any at all, with no elephants even hinted at.”

    • Magnetar@feddit.de
      9 months ago

      I’m getting the impression that the “Elephant Test” will become famous in AI image generation.

      • barsoap@lemm.ee
        9 months ago

        It’s not a test of image generation but of text comprehension. You could rip CLIP out of Stable Diffusion and replace it with something that understands negation, but that’s pointless: the pipeline already takes two prompts for exactly that reason. One is for “this is what I want to see”, the other for “this is what I don’t want to see”. Both get passed through CLIP individually, so CLIP on its own doesn’t need to understand negation; the rest of the pipeline just has to have a spot to plug in both positive and negative conditioning.

        Mostly it’s just KISS in action, but occasionally it’s actually useful, as you can feed it conditioning that’s not derived from text, so you can tell it “generate a picture which doesn’t match this colour scheme here” or something. Say the positive conditioning is the text “a landscape” and the negative conditioning is an image, the archetypal “top blue, bottom green”: now it has to come up with something more creative, as the conditioning pushes it away from the things it considers normal for “a landscape” and would generally settle on.
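
For anyone wondering what "a spot to plug in both" looks like numerically, here is a minimal sketch. It assumes the usual classifier-free guidance setup, where the prediction made under the negative conditioning takes the place of the unconditional one. `guided_noise`, the three-element toy vectors, and the scale value are made up for illustration and stand in for the model's real noise predictions.

```python
def guided_noise(pos_pred, neg_pred, guidance_scale=7.5):
    """Combine two noise predictions with classifier-free guidance.

    Starts at the prediction for the negative conditioning and moves
    toward (and past) the prediction for the positive conditioning.
    """
    return [n + guidance_scale * (p - n) for p, n in zip(pos_pred, neg_pred)]


# Toy stand-ins for the predictions under each conditioning (illustrative values).
pos = [1.0, 0.0, 0.5]  # "this is what I want to see"
neg = [0.0, 1.0, 0.5]  # "this is what I don't want to see"

print(guided_noise(pos, neg, guidance_scale=2.0))  # [2.0, -1.0, 0.5]
```

Nothing in this step parses the word "no". The negative conditioning is just another vector, which is why it can just as well come from an image as from text.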

    • Fishbone@lemmy.world
      9 months ago

      “Can you a room as aboluteyy no eleephant it all?”

      Dunno what’s giving more “clone of a clone” vibes, the dialogue or the 3 small standing “elephants” in that image.

  • biscuitswalrus@aussie.zone
    9 months ago

    I get practically the same result!

    What’s interesting is the word “absolutely”: without it, it generates practically fine.

    • lugal@lemmy.world
      9 months ago

      Funny that the first one is empty (except for the elephant) even though you didn’t specify it and the second try is full of stuff. What happens when you specify it’s empty, both with and without the “absolutely”? Or even “absolutely empty”?

    • meliaesc@lemmy.world
      9 months ago

      I’m actually going to save this to my vision board, haha. I like the interior design, especially since there are no elephants.

    • Ziixe@lemmy.dbzer0.com
      9 months ago

      The absolute last one really feels like a bunch of stock images smashed into each other; it even got the iMac with the censored Apple logo that is in so many stock images for some reason.