Their initial testing reveals that slight changes in the wording of queries can result in significantly different answers, undermining the reliability of the models.

  • JackGreenEarth@lemm.ee · 2 months ago

    Anyone who knows what LLMs are knew this from the start. They are good at generating text, but if you need that text to be true, they’re probably not the best tool for the job.