Ben Burns
advocating for thinking
Why I think you should be offloading less of your mental load to LLMs
LLMs have gotten very good at performing tasks that I have to do every day. They can write code, they can write documentation, they can find papers and use them to write proofs, and they can even tell if a banana is ripe (which I cannot do).
However, while coding and proving things take up much of my day, the majority of my effort goes towards thinking, and while LLMs are useful for lots of things, they should never be doing the thinking for you.
Reasons
- You almost never give an LLM all of the necessary context which you possess. Your prompts are usually under-specified and under-constrained, forcing the LLM to infer what you want (and unless you customize your system prompt, it will not ask clarifying questions; see the sketch after this list)
- (at least for now) LLMs are worse at thinking than you are, even when they have the full context
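For what it's worth, the system prompt fix from the first bullet is cheap to implement. Here's a minimal sketch using the Anthropic Python SDK (the model name and the exact prompt wording are placeholders, not a recommendation):

```python
import anthropic

# A system prompt that forces the model to ask before answering.
SYSTEM_PROMPT = (
    "Before answering any technical question, ask me two or three "
    "clarifying questions about my context, constraints, and what I "
    "already understand. Only answer after I have replied."
)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model name
    max_tokens=1024,
    system=SYSTEM_PROMPT,
    messages=[{"role": "user", "content": "Why is my training loss diverging?"}],
)
print(response.content[0].text)  # should come back with questions, not an answer
```

Even then, the deeper problem remains: the model can only ask about what you said, not about the context you never thought to mention.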
Well, what are LLMs good at?
- They write prose and code faster than you do
- They parse large volumes of information faster than you do, and can therefore carry out complex instructions before you have even finished reading them
- The LLM probably knows the features, documentation, etc. of your favorite library better than you do (for now)
Learning to ask bad questions
When I ask my friends technical questions, there’s usually a back and forth of homing in on exactly what my question is and what I am most confused about or understand the least. In doing this, you learn to ask better questions and to ask them in better ways.
This is not a skill that you develop with an LLM. You can ask a vague, under-constrained, under-specified question to an LLM and it will figure out what you mean. This is not helpful to you, and also promotes this “stream of consciousness” thinking where you just keep slamming your thoughts into an LLM until it gives you what you want.
For example, say I’m at a professor’s office hours to ask about a problem on the pset or a topic I didn’t quite get in class. In all likelihood, the professor is not going to do all the work for me: they’ll ask questions about my understanding of the prior topics that built up to what we’re currently working on, they’ll make me argue individual parts of the proof while they guide me toward the result, they’ll give intuition for why the result is true before attempting to prove it, and so on.
This is very different from how a conversation works with an LLM. The stock LLM experience is completely untailored to your knowledge or background, what types of explanations work well for you, what types of things you find particularly confusing, etc.
Takeaway: iterate on your question with the LLM so that you at least have to do some of the thinking.
Good and bad LLM usage
What I use LLMs for:
- A “smart search engine”
What I think are bad usages of LLMs:
- Prove this homework problem for me
- Prove this theorem for me
- Implement this large code base for me from a three-paragraph description
How I use LLMs
If I am going to have an LLM write code, I always write a spec sheet and a code skeleton beforehand that prescribe the structure of the code base, the functions that need to be implemented, the input and output types of each function, etc. (see the sketch after this list). There are many reasons for this:
- The LLM never gives me more or less functionality than I want
- Managing, maintaining, and extending the code base in the future is easier because it’s set up exactly how I want it to be
- (most important to me) I have to go through the mental effort of figuring out how to build the thing and don’t lose the ability to think without an LLM
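To make this concrete, here is a toy slice of what one of my skeletons looks like (all names here are hypothetical, invented for illustration). The docstrings carry the spec, the signatures pin down the types, and the LLM’s only job is to fill in the bodies:

```python
# skeleton.py -- a toy, hypothetical slice of a spec-sheet skeleton.
# The LLM fills in the bodies; it does not get to invent the structure.
from dataclasses import dataclass


@dataclass
class Experiment:
    """One experiment run, as described in the spec sheet."""
    name: str
    seed: int
    results: dict[str, float]


def load_experiments(path: str) -> list[Experiment]:
    """Parse the results file at `path` into Experiment records.

    Spec: one JSON object per line; raise ValueError on malformed lines.
    """
    raise NotImplementedError


def summarize(experiments: list[Experiment], metric: str) -> float:
    """Return the mean of `metric` across all experiments.

    Spec: raise KeyError if any experiment is missing the metric.
    """
    raise NotImplementedError
```

Writing the `raise NotImplementedError` stubs myself is the point: by the time the skeleton exists, I’ve already done the architectural thinking, and the LLM is just filling in implementations I could have written.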
This methodology has been becoming more popular, even mainstream. Anthropic, the developers of Claude, encourage the use of a CLAUDE.md spec sheet when using Claude and Claude Code. Amazon recently launched their own agentic IDE, Kiro, which builds a spec sheet from your prompt before implementing any features. Of course, I will take the Claude Code workflow any day over having the LLM write the spec for me.
Working with someone who vibe codes
You know how hard it is to understand, contribute to, or debug a large code base that you didn’t write. Now imagine what it’s like to do any of those things when your collaborator didn’t write it either.
Don’t be this guy. Please, do not vibe code an entire project that you have to work on with someone else.
Scratch
Subthesis: doing your problem sets and course projects, and nothing else, prepares you very little for graduate research
Justification:
- In research, you are rarely given an exact spec sheet of what you have to implement, which features are best to implement, how you should organize your project, which features are the most or least important, etc.
- Coursework does not cover all the skills you need, whether due to time constraints or because it’s hard to test a skill with an autograder and/or assign it a numerical score
These days, using LLMs to write large amounts of code has become increasingly common. To me, “vibe coding” is like trying to find solutions to your math pset on the internet or Stack Overflow. Sure, you can learn from reading the solutions to problems, but you’re missing out on two super important parts of the learning process: the effort of thinking very hard to come up with a solution, and figuring out why other solutions don’t work.
When trying to come up with a proof of something, you will typically try approaches that do not end up working, and through that you learn more about the objects and properties involved by finding out what they are and aren’t good for. Then, when you go to use that math in your research (or, more commonly, when you encounter a setting where assuming that property would make your desired result obtainable), you have actually built up some intuition of your own.