How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions

We create a lie detector for black-box LLMs by asking the model a fixed set of follow-up questions that are unrelated to the lie.
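
As a rough illustration of the approach, the toy sketch below (not the authors' code) asks a black-box model a fixed list of placeholder yes/no probes after a suspected lie, binarises the answers, and trains a simple logistic-regression detector on them. Here `query_model`, the probe list, and the toy training data are all assumptions made for the example.

```python
# Toy sketch of the "unrelated questions" idea, not the authors' implementation.
# `query_model` stands in for any black-box LLM API; the probe questions and
# the toy training data below are placeholders chosen for illustration.
from typing import Callable, List

import numpy as np
from sklearn.linear_model import LogisticRegression

PROBE_QUESTIONS: List[str] = [
    "Is the sky blue?",            # placeholder probes, unrelated to the
    "Does 2 + 2 equal 4?",         # content of the suspected lie
    "Can a fish ride a bicycle?",
]

def probe_features(query_model: Callable[[str], str], dialogue: str) -> np.ndarray:
    """Ask each fixed probe after the (possibly deceptive) dialogue and
    binarise the model's yes/no answers into a feature vector."""
    feats = []
    for q in PROBE_QUESTIONS:
        reply = query_model(f"{dialogue}\nQuestion: {q}\nAnswer yes or no:")
        feats.append(1.0 if reply.strip().lower().startswith("yes") else 0.0)
    return np.array(feats)

# A simple classifier over probe answers; real feature vectors would come from
# probe_features run on dialogues labelled as truthful (0) or deceptive (1).
X_toy = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]], dtype=float)
y_toy = np.array([0, 0, 1, 1])
detector = LogisticRegression().fit(X_toy, y_toy)
print(detector.predict_proba([[0.0, 1.0, 1.0]])[0, 1])  # estimated P(lie)
```
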
Teaching Models to Express Their Uncertainty in Words

We show that a GPT-3 model can learn to express uncertainty about its own answers in natural language, without using model logits.
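
A minimal sketch of the verbalized-confidence idea described above: prompt a black-box model to state a confidence percentage in plain text and parse it, so no logits are needed. The `query_model` callable, prompt format, and fallback value are assumptions for illustration; the paper fine-tunes GPT-3 to produce calibrated verbal probabilities, whereas this only shows how confidence can be read from generated text instead of logits.

```python
# Minimal sketch: read confidence from the model's generated text, not from logits.
# `query_model` and the prompt/response format are placeholders for illustration.
import re
from typing import Callable, Tuple

PROMPT = (
    "Q: {question}\n"
    "Give your answer, then your confidence that it is correct as a percentage.\n"
    "Format: Answer: <answer> Confidence: <number>%"
)

def answer_with_confidence(query_model: Callable[[str], str], question: str) -> Tuple[str, float]:
    """Return (answer, stated confidence in [0, 1]) parsed from free text."""
    reply = query_model(PROMPT.format(question=question))
    m = re.search(r"Answer:\s*(.*?)\s*Confidence:\s*(\d+(?:\.\d+)?)\s*%", reply, re.S)
    if m is None:
        return reply.strip(), 0.5  # uninformative fallback if parsing fails
    return m.group(1).strip(), float(m.group(2)) / 100.0

def reliability_buckets(confidences, correct, n_buckets: int = 10):
    """Group stated confidences into buckets and report accuracy per bucket,
    a simple way to check whether verbalized confidence is calibrated."""
    buckets = [[] for _ in range(n_buckets)]
    for c, ok in zip(confidences, correct):
        buckets[min(int(c * n_buckets), n_buckets - 1)].append(ok)
    return [sum(b) / len(b) if b else None for b in buckets]
```
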
TruthfulQA: Measuring How Models Mimic Human Falsehoods

We propose a benchmark to measure whether a language model is truthful in generating answers to questions.
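
The benchmark is available as a public dataset; the snippet below, assuming the Hugging Face `truthful_qa` mirror and its current field names, shows how to inspect an example.

```python
# Quick look at the benchmark, assuming the public Hugging Face mirror
# ("truthful_qa"); config and field names may differ between versions.
from datasets import load_dataset

data = load_dataset("truthful_qa", "generation")["validation"]
example = data[0]
print(example["question"])
print(example["best_answer"])
print(example["incorrect_answers"][:2])
```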