I’m pulling the “twitter is a microblog” rule even though twitter is pretty mega now, hope that’s ok.

  • turdas@suppo.fi
    2 months ago

    OK, sounds like we broadly agree then.

    But as you can see in the paper I linked, ELIZA passes the Turing test in their experiment about 20% of the time (that is to say, it doesn’t pass; passing is 50% in this test) whereas the best LLMs pass about 70% of the time (that is to say, they are significantly more convincing at being human than real humans).

    • Nalivai@lemmy.world
      2 months ago

      That 20% figure is just a clear indication of how shit people are at conducting such a test, and that was basically my original point. Two times out of 10, people were convinced by a particularly echoey room.

      • Fedegenerate@fedinsfw.app
        2 months ago

        Turing test can be reliably passed by a bot that repeats last part of the previous sentence with a question mark at the end […]

        If an LLM is correct 2 in 10 times, would you call it “reliably correct”?

        • Nalivai@lemmy.world
          2 months ago

          If a person murders people only two days out of 10, they’re a murderer. To not be a murderer, they need to never do that.
          Reliably correct is when you’re correct always. Demonstrably incorrect is when you’re incorrect even sometimes.