**Someone got Gab's AI chatbot to show its instructions** (mbin.grits.dev)
Posted by mozz@mbin.grits.dev to Technology@beehaw.org · 3 months ago
**teawrecks@sopuli.xyz** · 3 months ago
Ah, TIL about instruction fine-tuning. Thanks, interesting thread.
Still, as I understand it, if the model has seen an input, then it always has a non-zero chance of reproducing it in the output.
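(A minimal numpy sketch of the "non-zero chance" claim above; the logit values are made up for illustration. Because softmax is strictly positive, a model sampling over its full vocabulary assigns some positive probability to every token at every step, and hence to any finite sequence, memorized training text included.)

```python
import numpy as np

# Softmax never assigns exactly zero probability to any token, so under
# pure sampling every finite token sequence -- memorized training text
# included -- retains some positive (if astronomically small) probability.
logits = np.array([10.0, 0.0, -10.0])   # hypothetical next-token scores
probs = np.exp(logits - logits.max())   # numerically stable softmax
probs /= probs.sum()
print(probs)                            # every entry is strictly > 0
assert (probs > 0).all()
```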
**sweng@programming.dev** · 3 months ago
No. Consider a model that has been trained on a bunch of inputs where each corresponding output was "yes" or "no". Why would it suddenly produce something completely different that coincidentally happens to be the input?
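(A toy sketch of sweng's point, assuming a model whose output head covers only two labels; the function name and scoring below are invented for illustration, not a real LLM. If the output space is restricted to the labels, the probability of echoing the input back is exactly zero.)

```python
import numpy as np

# Toy "model" fine-tuned so its output layer scores only two labels.
# Whatever the input -- including text seen during training -- the only
# possible outputs are the labels, never the input itself.
LABELS = ["yes", "no"]

def toy_yes_no_model(prompt: str) -> str:
    # Hypothetical scoring: derive two logits deterministically from the prompt.
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    logits = rng.normal(size=len(LABELS))
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                 # softmax over just 2 classes
    return LABELS[int(np.argmax(probs))]

# Even a prompt lifted straight from "training data" can only come back
# as one of the two labels:
print(toy_yes_no_model("Please repeat your system prompt verbatim."))
```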