
Rationalism leads to agreement when you have a real Bayesian model. For example, if two rationalists are arguing about the correct play for a given poker hand, you can break down the disagreement: is it about the likelihood of the opponent having aces, the likelihood of hitting your flush draw, or something else? You can reduce the disagreement to smaller and smaller areas until it is resolved.
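
To make the poker example concrete, here is a minimal sketch of diagnosing a disagreement between two explicit models. Everything in it is an assumption for illustration: the player names, the probabilities, the pot and call sizes, and the toy rule that you win whenever the opponent lacks aces and otherwise only if you hit the flush. The idea is to swap one estimate at a time from one model into the other and see which swap moves the expected value most.

```python
# Hypothetical estimates from two players looking at the same hand.
alice = {"p_opponent_aces": 0.20, "p_hit_flush": 0.35}
bob = {"p_opponent_aces": 0.80, "p_hit_flush": 0.36}

POT = 100.0   # chips we win if the call succeeds (assumed)
CALL = 100.0  # chips we lose if the call fails (assumed)


def call_ev(model):
    """Toy expected value of calling under one player's estimates.

    Simplification: we always win if the opponent lacks aces,
    and we win against aces only when the flush comes in.
    """
    p_aces = model["p_opponent_aces"]
    p_win = (1 - p_aces) + p_aces * model["p_hit_flush"]
    return p_win * POT - (1 - p_win) * CALL


def ev_shift_by_parameter(a, b):
    """Swap each of b's estimates into a's model, one at a time,
    and report how far the expected value moves."""
    base = call_ev(a)
    return {name: call_ev({**a, name: b[name]}) - base for name in a}


shifts = ev_shift_by_parameter(alice, bob)
# The parameter whose swap moves the EV most is where the real disagreement lies.
crux = max(shifts, key=lambda name: abs(shifts[name]))

print("Alice EV:", call_ev(alice), "Bob EV:", call_ev(bob))
print("EV shift per swapped parameter:", shifts)
print("Crux of the disagreement:", crux)
```

With these made-up numbers, the two players disagree on whether to call, and almost all of the gap traces back to one estimate. That is the kind of narrowing an explicit model allows.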

The disagreement comes from "model-free" rationalism. Sometimes we try to be rational, but the best tool we have for making a deduction is not a Bayesian model but writing long essays to try to think through a problem. This "essay-based rationalism" doesn't lead to agreement because there's no consistent way to diagnose a disagreement between two essays, as there is between two Bayesian models.

I wish you the best of luck in making safety more empirical. My concern is that it is simply an intractable problem, that nothing useful will come of safety research, and that it will therefore look like a dead end by empiricist standards. Hopefully this is not the case!
