tuesday, may 6th, 2025 at 10:11 am
386 words

I think I pinpointed a misgiving I have about general collaborative LLM use: that the LLM will never stop and say "I don't know" or "I don't have this mental model, can you help me understand it?" That's the fundamental limitation of being trained to generate plausible (not necessarily correct) text.

I guess it gets fuzzier with the reinforcement learning layered on top, right? But can you really reinforcement-train it to say "I don't know"? You can train it to say "I don't know" based on patterns of when its training data shows people saying "I don't know". But it's against its grain, right?

If I had a collaborator who never said "I don't know", that would be a bad collaborator - I'd feel like I never had a steady foundation with them. Sometimes this does kind of happen because people feel embarrassed not to know things - they may commit code they don't understand and not want to admit it. But I feel like a key part of getting to know someone and working with them is getting past that, to where we can talk about what doesn't make sense - because a solid, shared mental model benefits us both.

I definitely use LLMs to speed up my development, but mostly in the form of autosuggest, which to me feels like the most 'honest' or 'with the grain' format for it. I'll also have it generate more generic components or bash scripts or database logic. But even then I feel like I really want to keep thinking of it as a form of 'codegen', of 'predictive text', and not a collaborator.

I don't know the consequences of all this - maybe accuracy crosses a threshold where the hallucination tradeoff is negligible. Maybe grounding in documentation helps a lot. And possibly you could do some chaining with an evaluator model... Ideally I think you'd get a confidence or 'novelty' score from the model itself, that you'd learn to factor into your intuition about how to treat what it's generated, but I don't see a lot of progress or talk about that lately.
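
As a rough sketch of the kind of signal I mean - assuming an OpenAI-style chat API that exposes per-token logprobs (a crude proxy, not the real confidence or 'novelty' score I'm wishing for) - you can already average token probabilities over a response:

```python
# Rough sketch, not a real 'novelty' score: average the per-token
# probabilities of a completion as a crude confidence signal.
# Assumes the openai>=1.x client and a model that returns logprobs.
import math
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative choice; any chat model with logprobs works
    messages=[{"role": "user", "content": "What does SELECT ... FOR UPDATE do?"}],
    logprobs=True,
)

choice = response.choices[0]
token_probs = [math.exp(t.logprob) for t in choice.logprobs.content]
avg_prob = sum(token_probs) / len(token_probs)

print(choice.message.content)
print(f"average token probability: {avg_prob:.2f}")
```

It's not the same thing - token probability measures fluency as much as truth - but it's about the closest an "am I on solid ground?" signal gets with what the APIs expose today.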

Maybe this is just an argument that some metaphor beyond chat is called for. Something that signals it's going to 'run with the idea' or 'play it out', not 'answer your question definitively'.