How Googlers cracked an SF rival's tech model with a single word | A research team from the tech giant got ChatGPT to spit out its private training data

silence7@slrpnk.net · 1 year ago

How Googlers cracked an SF rival's tech model with a single word | A research team from the tech giant got ChatGPT to spit out its private training data

eating3645@lemmy.world · 1 year ago

I’m not really following you but I think we might be on similar paths. I’m just shooting in absolute darkness so don’t hold much weight to my guess.

What makes transformers brilliant is the attention mechanism. That is brilliant in turn because it’s dynamic, depending on your query (also some other stuff). This allows the transformer to be able to distinguish between bat and bat, the animal and the stick.

You know what I bet they didn’t do in testing or training? A nonsensical query that contains thousands of one word, repeating.

So my guess is simply that this query took the model so far out of its training space that the model weights have no ability to control the output in a reasonable way.

As for why it would output training data and not random nonsense? That’s a weak point in my understanding and I can only say “luck,” which is, of course, a way of saying I have no clue.

How Googlers cracked an SF rival's tech model with a single word | A research team from the tech giant got ChatGPT to spit out its private training data

How Googlers cracked an SF rival's tech model with a single word | A research team from the tech giant got ChatGPT to spit out its private training data

How Googlers cracked OpenAI's ChatGPT with a single word