LLMs in Software Engineering - What Experienced Practitioners Actually See · Raphael Poss

A guide for newcomers to software engineering, and for anyone curious about how LLMs are changing it. Built from interviews with senior professionals from the tech industry.

Opening

If you are reading this, you probably have mixed feelings about large language models (LLMs). You may have used one and been impressed. You may have used one and been unnerved. Both reactions are reasonable.

About me: I have worked as a software engineer previously, I mentor people entering the field, and I use LLM tools daily. In early 2026, I designed a survey about real-world LLM usage in software engineering and sent it to seven experienced practitioners, all working in the tech industry but with different backgrounds and current places of employment. Each of them also sat down with me for an individual interview. Their careers range from 3 to over 20 years, from startups to FAANG. I deliberately chose mid-to-late career people because I wanted depth of judgment from people who have watched this industry change before.

They disagreed with each other. Often sharply. That turned out to be the most useful part.

This guide is built from those eight survey responses and interviews, including my own. The names used throughout are pseudonyms. The practitioners’ identities have been hidden to protect their privacy and at the request of their employers. The sample is small and curated; I will not pretend otherwise.

The practitioners disagree on how much to trust LLM output, when newcomers should start using the tools, and what the profession will look like in ten years. I am not going to resolve those disagreements for you. What I can give you is the questions that came up when experienced people talked honestly about what is changing. If you carry those questions into your own work, you will be better equipped than most.

Let us start with the most basic one.

1. What are you actually working with?

Before we talk about productivity, careers, or team dynamics, we need to settle something more basic: what is this thing you are using when you open an LLM-powered tool and start typing?

The short answer is that you are working with a next-token predictor. The model has been trained on enormous quantities of text from the internet, and when you give it a prompt, it generates the most statistically probable continuation based on that training data. It does not “understand”. It matches patterns from its training data.

Every practitioner I spoke with uses these tools daily. Seven of the eight use Anthropic’s Claude models frequently or always. Two respondents, Kai and Samir, use OpenAI products frequently, though only Samir is also a frequent Claude user; Kai works primarily through Copilot, which routes to multiple providers. One respondent uses Google’s models frequently, and another uses them occasionally. The tool landscape at the time of survey (early 2026, shortly after the release of Opus 4.6) converges heavily on Claude for software engineering work, but it is shifting constantly. Specific product names will be dated within months. What matters is the underlying mechanism, because that does not change as quickly.

The technology only reached sufficient maturity for serious professional use recently. Two of the practitioners, Ben and Samir, specifically recommend Andrej Karpathy’s materials on LLM internals as essential background. Luca and Sam recommend Simon Willison. Dani and Noel both emphasize learning by doing: Dani’s recommendation is blunt (“no reading, just practice”), while Noel advises people to practice on their own and discover patterns through experimentation. The message is consistent: understand how the machine works before you trust it with your work.

Ben, a mid-career tech lead at a small-team startup, put the baseline plainly: “There is no magic involved. LLMs are next-word predictors based on online content. People have made arbitrary choices about fine-tuning the output.” He drew an analogy to the compiler revolution of the 1970s. You do not need to understand register allocation, he said, but you do need to know about abstract syntax trees and the relative costs of accessing memory at different levels of the hierarchy. That level of understanding lets you do high-level pattern matching: recognizing “this is a state machine,” for instance, and knowing what that implies for testing and determinism. LLMs, he observed, do not yet handle that kind of structural reasoning well.

The practical consequence of the mechanism is something I have seen repeatedly in my own work: the more unusual a project is, the more the LLM will try to steer you away into a more common area. Truly novel projects do not do well with LLMs. This follows directly from training-data dependency. If your task closely resembles patterns the model has seen millions of times, the output will be competent. If your task is genuinely new, the output will pull you toward the familiar whether you want it or not.

How much understanding of the internals is enough? That question is not fully settled. Ben’s compiler analogy suggests a useful floor: you need enough to reason about the tool’s behavior at a high level, not enough to rebuild it. But the floor is firm. No magic, next-word prediction, training-data-dependent. If you carry only those three ideas into the rest of this guide, you will make better decisions than most people who use these tools every day.

Diffs: The line-by-line comparison showing what changed in a file. When someone says “look at the diffs,” they mean read the actual code changes rather than trusting a summary.
Sub-agent orchestration: A setup where a primary AI agent delegates specific tasks to smaller, independent agents that each handle one piece of work and report back.
Rule file / cursorfile: A configuration file that tells an LLM-powered coding tool how to behave: what standards to follow, what patterns to use, what to avoid. Different tools call these “rule files,” “skill files,” or “cursorfiles.”
MCP (Model Context Protocol): A standard that lets LLM tools connect to external data sources and services, so the model can access information beyond what is in its training data.
Vibe coding: Using LLMs to generate code with minimal human review, relying on intuition rather than careful inspection. Generally used as a warning rather than a recommendation.
Hallucination: When an LLM generates output that is plausible-sounding but factually wrong: a function that does not exist, a library that was never published, a confident explanation of something it made up.

2. Does it actually make you faster?

The headline claim is “10x productivity.” The data from the practitioners I talked to says something more complicated.

I asked each person to rate the impact of LLMs on their time, on a scale where 1 means significant time savings, 3 means no change, and 5 means significant extra time spent. The median was 3. Three respondents reported no change. Two reported some time savings. One reported significant time savings. And one reported spending more time than before. The range ran from 1 to 4.

Most people, then, either save time or break even. But the experience is uneven, and where you land depends heavily on what kind of work you do.

Where gains are consistent, they are consistent across the board. All 8 respondents report noticeable improvement in ideation, refactoring, and support work like triaging issues and collecting data. Test code is nearly as consistent, at 7 of 8, though the experience is not uniform: one respondent found that efficiency and test coverage improved while test quality arguably declined, a distinction worth keeping in mind. Run-time data mining (log analysis, trace analysis) comes in at 6 of 8. These are tasks that are repetitive, well-patterned, and where a competent first draft saves the human from a cold start. But only 4 of 8 report improvement in initial product delivery (actually shipping a finished feature). Only 3 of 8 report improvement in performance work like profiling and benchmarking. The gains are concentrated in specific task categories.

The tool landscape reflects this. The eight respondents use Copilot, Claude Code, Cursor, and OpenCode. 6 of 8 use sub-agent orchestration for one-shot delegation. One, Luca, does not use orchestration at all. No single tool is dominant, and none is prescribed here. The tooling matters less than understanding which tasks benefit and which do not.

Luca, a late-career engineer who sets technical standards for a large organization, reported the largest time savings in the group: 1 out of 5. He also scored 7 out of 10 on enjoyment, tied for the highest with Noel and me. His interview explains where those savings come from. “The biggest change is a reduction in disenjoyment rather than an increase in enjoyment.” Triage rotation is much less draining. The grind part of research is gone. The amount of work remains similar, but the effort, the cognitive weight of it, has been reduced. The time savings are real, but they come from eliminating low-value work, not from doing the same work faster.

Samir, a late-career engineer, is the outlier in the other direction. He scored a 4, spending more time than before. His interview explains why: there is a leadership push to integrate AI into workflows, but there is not yet enough infrastructure to make it work well. The team is paying overhead to figure things out. His enjoyment at work reflects this: 3 out of 10, compared to 6 out of 10 for personal projects where he controls the tools and the pace. 5 of the 8 practitioners work at organizations where LLM use is mandated. Samir’s experience suggests that mandated adoption without adequate infrastructure can cost time rather than save it.

The enjoyment numbers add context. The median is 6.5 out of 10, with a range of 5 to 7. Most people feel moderately positive. But Samir’s sharp split between work and personal enjoyment (3 versus 6) shows that how an organization deploys these tools matters as much as the tools themselves.

If you are entering this profession expecting a revolution in speed, recalibrate. Most practitioners report modest time savings or no change, concentrated in specific task types. The more useful question is what skill separates the people who get value from those who do not.

3. How do you tell it what to do?

If LLMs write the code, what does the human need to be good at? The answer from every practitioner I spoke with is some version of the same two things: directing and reviewing. This section is about the first. Reviewing gets its own treatment next.

Directing means giving the LLM enough context, constraint, and structure to produce useful output. Think of it less as “prompting” and more as managing a fast but unreliable junior employee who never pushes back. You have to know what you want, break it into pieces the tool can handle, and encode your standards so the tool can follow them without being reminded every time.

The practitioners who get the most from these tools invest heavily in that encoding. Luca’s entire approach to LLM-assisted work revolves around rule files (configuration documents that tell the tool what standards to follow, what patterns to use, and what mistakes to avoid). He publishes his rule files openly and treats them as the core artifact of his workflow.

During our conversation, Luca described onboarding a junior who had completed a small-scope task manually and then proposed extending it. A senior engineer performed all the automation in a single session. But the teaching moment was not the finished code. It was the transcript. Luca showed the junior the session: the rule files, the prompts, the encoded standards. “That’s where the money is,” he said. He considers this possibly the best training a junior developer can get: studying not the output, but the input. Knowledge transfer is shifting from pair programming and code walkthroughs to the artifacts of LLM interaction: rule files, prompt histories, encoded standards.

Consider two developers asked to add rate limiting to an API endpoint. A less experienced developer might type: Add rate limiting to the /api/orders endpoint. The LLM will produce something, probably a middleware using an in-memory counter with a hardcoded threshold. It will work in a single-server test. It will fail in production where multiple instances share traffic, it will not log anything useful, and it will use defaults that do not match the application’s actual load profile.

A more experienced developer might write: Add rate limiting to POST /api/orders. Use the existing Redis connection for distributed counting. Limit to 100 requests per minute per authenticated user, keyed by user ID from the JWT. Return 429 with a Retry-After header. Log blocked requests at warn level with the user ID and current count. Add unit tests for the counter logic and an integration test that verifies the 429 response. The output from this prompt will not be perfect either. It never is. But it will be wrong in smaller, more catchable ways, because the human did the architectural thinking and left the LLM to handle implementation.

The gap between those two prompts comes from knowing the system: how it is deployed, what already exists, what failure looks like in production. That knowledge is exactly what less experienced developers have not yet built.

Dani, a mid-career engineer with eight years of experience, identified “not knowing how to guide the agent” as a primary obstacle for junior developers. Juniors do not write bad prompts because they lack prompting skill. They write bad prompts because they do not yet know enough about the system.

Dani put the obstacle plainly in his survey: junior engineers lack the experience to guide an agent effectively. It is, he noted, similar to the problem junior engineers have without an agent: they do not yet know what to ask for. The tool does not close that gap. It widens it, because the tool will produce confident-looking output regardless of whether the input was good.

Problem decomposition (breaking a large task into pieces an LLM can handle) emerges as the skill that separates productive use from frustrating use. Luca’s automation session was not a single prompt. It was a structured sequence of directed steps. Samir’s review practice (described in the next section) depends on understanding the task well enough to know what the output should look like. The common thread: the human does the thinking about what needs to happen and why. The LLM does the typing.

4. How do you catch what it got wrong?

The other half of the skill is review. And on review, the practitioners I spoke with hold strong opinions that do not always agree.

The data makes the consensus visible where it exists. When specifications are unclear, 6 of 8 respondents always review the output before submission, and the remaining 2 review most of the time. This is the tightest consensus in the entire survey. Nobody relaxes scrutiny when they are unsure what the code should do. Three of the eight (Luca, Noel, and I) apply more scrutiny to unclear-spec work than we would to equivalent human-written code. Nobody applies less.

But the consensus frays as the stakes appear to drop. For small or repetitive changes, only 3 of 8 always review. I rarely review them myself (I am the outlier there). The spread is even wider for complex changes where the spec is clear: Ben rarely reviews those, while Samir always does. The practitioners agree on the principle that review matters. They disagree, sometimes sharply, on where to draw the line.

Four of the eight flag the same obstacle for junior developers: they cannot assess output quality. They cannot tell a hallucination from truth. Ben put it bluntly: “It’s all LGTM in their eyes.” Two others, Samir and I, identified a related problem: juniors do not think about the problem themselves. They prompt, accept, and move on. The core issue is the same in both cases. Review requires understanding, and understanding is exactly what newer developers have not yet built.

Samir represents the most rigorous position in the group. He reviews everything, always, across all categories. He feels personally responsible for the output and will always review it thoroughly because it has his name on it. His review depth did not change with LLMs because it was already thorough. But he added an interesting wrinkle: he now uses additional agents to review his own work, and on multiple occasions agents have found correctness bugs that he missed himself. He could imagine a future where he trusts agents more than himself to find bugs or correctness issues. But not yet.

Between these poles (Samir’s thoroughness and Ben’s deliberate selectivity) the rest of us find our own positions. The standards are still forming. Context shapes where you draw the line: the stakes of the code, your experience, the maturity of the codebase.

Sam, a co-founder working in AI agent security, noted a review asymmetry that none of the other practitioners raised. When reviewing code that a colleague produced with LLM assistance, he finds it easier: the structure is predictable, the patterns familiar. Reviewing code written entirely by hand has become comparatively harder. The practical implication is real for any team where LLM adoption is uneven: reviewers who work primarily with LLM-assisted code may find themselves less well-calibrated to evaluate work from colleagues who did not use the tools.

The deeper issue beneath all of this is accountability. The code has your name on it. You committed it. You deployed it. If it breaks in production at 3 a.m., you are the one who gets the call. LLMs do not bear responsibility. People do. As I wrote in my own survey response: people cannot let LLMs take decisions in their name without deep understanding of the consequences. If something bad happens, the human user is on the hook. That reality does not change because the tool is convenient. Reviewing code is part of it. The harder part is accepting that the code is yours, regardless of who or what wrote it.

5. When should you put it down?

Every practitioner I spoke with uses LLMs. Every one of them also has boundaries: tasks, contexts, or career stages where they choose not to use the tool, or where they advise others not to.

The clearest boundary line runs through early career development. I asked each respondent whether they push mentees toward learning about LLMs right away, gradually, or not at all. The answers split 4 to 3 to 1: four said right away (Kai, Luca, Dani, Sam), three said gradually (Ben, Noel, and I), and one said no.

Samir holds the strongest “put it down” position. In his survey he wrote: “For first-year staff, I would prefer them not to use LLMs at all to generate code. The code should be done manually. Folk need the engagement with the low-level details to build up their skill, knowledge, and intuition.” But his position is not absolute: he explicitly allows LLM use for brainstorming and intellectual sparring, for exploring codebases and problem domains, and for code self-review, even for first-year staff. He draws the line at code generation. He drew an analogy to AI-assisted mathematical proofs: it is possible to produce a proof that is correct because it is provably so, but not understandable by people. Likewise, he expects we will see more and more technology built by LLMs that the people responsible for it do not fully understand.

Noel landed in the middle. He hesitated on the survey question about pushing mentees toward LLMs. He had been challenged by a real situation: someone relying on LLM output too much, not understanding the material deeply enough. “The goal of a starter project is not to deliver in record time,” he said. “It is to learn.” In a smaller company where productivity is the goal from day one, he might push people toward LLMs from the start. The context matters, and he is honest about the tension.

Dani offered a different angle. When asked what topics a guide for early-career professionals should cover, he focused on the ability to evaluate the LLM’s approach: getting an agent to convince you about the approach it has taken, and how to review that approach, something junior engineers lack the experience to do. His observation is that they are simply “lost.” This complements Samir’s position: even among practitioners who push newcomers toward LLMs from day one, the core challenge remains learning to judge what the tool produces.

The early-career boundaries extend beyond just code generation. Kai warns against fully delegated agents consuming project tickets. That approach is not adequate for early-career developers who need to build a deep understanding of the project. I recommend against using LLMs for planning in early career: if you have not yet developed the judgment to evaluate a plan, having a machine produce one does not help you and may actively mislead you. At any career stage, Kai draws a firm line at giving agents full access to production systems or personal data.

But it is not only juniors who put the tool down. Ben, despite being one of the more trusting users in the group, deliberately avoids using LLMs for performance work. He could dump performance data into the model, but he feels he wants, and is responsible, to develop a deep mental model of how the technology works himself. He has not yet had the opportunity to direct an LLM to work with graphical representations like traces and flame graphs, on which his intuition currently relies. Only 3 of 8 respondents report improvement in performance work (Dani, Sam, and I). Sam’s experience is the most specific: LLMs handle boilerplate and harness setup well, but do not ask the right questions when it comes to designing the experimental setup itself. Even among those who see gains, the domain falls short for most.

Luca embodies both sides of the dependency tension in a single observation. “People get fewer opportunities to make mistakes,” he said. “They do not learn the best coding practices and patterns by heart.” He used the example of a concept like “variables escape to the heap,” the kind of deep knowledge that comes from making errors and debugging them yourself. But he holds the other side too: “If that knowledge is still important, it may still come back in other areas. If it is not important, maybe it is acceptable to lose it.”

The dependency concern runs deeper than any single task. Samir sees a structural risk: if the rate of adoption outpaces the rate at which we develop safeguards, by the time the problem is visible it may already be too late to correct. When LLMs fail, humans will need to fix the issues, and they will need the understanding that was never developed. Whether the low-level knowledge that junior developers are no longer acquiring is permanently lost or merely deferred is a question none of the practitioners I spoke with can answer. What they can tell you is that the question is real, and that knowing when to put the tool down is itself a professional skill.

6. How does this change working with other people?

LLM use is usually framed as an individual productivity question: does it make you faster, does it make your code better. But programming is team work, and the most surprising findings from my conversations were not about personal workflow. They were about what happens between people.

Team awareness of LLM usage is uneven. I asked each respondent to rate, on a scale of 1 to 5, whether all members of their team are equally aware of everyone else’s LLM usage. The median was 3.5, the range 2 to 5. Three respondents (Kai, Noel, and Samir) reported full awareness. Two (Ben and Dani) reported low awareness. Teams do not uniformly know what their members are doing with these tools, even when 5 of the 8 respondents work at organizations where LLM use is mandated.

The social friction is new. It did not exist a year ago.

Ben described the sharpest version of this friction. A coworker, he said, uses LLM-generated reviews and submits them without human edits. This feels impersonal and places a burden on him. He went further: juniors on his team now produce low-quality output, “slop,” and do not realize that is what they are doing. In the past, their incompetence acted as a natural brake on output volume. That is no longer the case. He does not enjoy having to explain their own code to them. The friction, he emphasized, is mostly social in nature. And he added, with characteristic honesty, that the social issues might not exist anymore a year from now.

The team norms for LLM-assisted work are still being invented. Knowledge transfer happens through a patchwork of methods: show-and-tell sessions, shared rule or skill file repositories, published MCPs (a standard that lets LLM tools connect to external data sources and services), dedicated knowledge transfer days, demos, and proactive sharing. Samir’s organization mandates knowledge transfer and facilitates it with dedicated days and shared repositories. But even there, he observes coworkers doing “something cool” only after the fact; he assumes people keep a lot of their inventions to themselves. The infrastructure for sharing LLM workflows is immature, and even well-structured organizations have gaps.

Luca described what happens when mandated adoption meets individual resistance. He was onboarding a new hire who was AI-skeptic and was told they had no choice. Luca’s response was characteristic of his approach: “Everything I know is in my cursorfile” (a configuration file that encodes his standards, patterns, and expectations for the LLM tool). “So I’m expecting them to read the cursorfile.” Who decides how the team works with these tools, and who has to comply, is a question of organizational power that most teams have not explicitly addressed.

Dani offers a complementary view on this gap. He reported one of the lowest team-awareness scores in the group. His survey response about obstacles for juniors (that they lack the experience to guide an agent, and that reviewing approaches taken by the agent is something junior engineers have no basis for doing) points to a team-level problem, not just an individual one. If the junior on your team cannot evaluate what the tool produced, and the team does not know how each member is using the tools, the gap compounds. Quality problems become invisible until they surface in production or in review.

Luca’s observation that “reviewing is more tiring than writing” captures the imbalance that LLMs introduce into team workflows. If generating code becomes easy but reviewing remains hard, the bottleneck shifts to the people who must evaluate the output. His vision is that review expectations should eventually be encoded into bots: “If you can express your expectations into language, then the machines should be able to pick it up.” Whether that vision arrives soon enough to ease the current friction is an open question, but it points toward a future where norms are machine-enforced, not just agreed upon socially.

I want etiquette rules to be taught early: what is polite and not polite to do with LLM use in a team. The threshold for “too simple a question to ask a person” has shifted, as Luca noted; more people now have a good intuition about which questions an LLM can answer easily. But the threshold for “too lazy an output to submit for review” is not yet established. When you submit LLM-generated work, whether it is code, a review comment, or a specification, you are making a claim about the effort and attention you invested. If your colleagues cannot tell whether you thought about the work or just accepted the first output, the social contract of professional collaboration is under strain. These norms will settle. But they have not settled yet, and if you are entering the profession now, you will be part of shaping them.

7. What else should you learn?

This guide covers the questions that came up most consistently across eight practitioners. But I also asked each of them what topics they would want included in a guide for early-career professionals. Their answers point to areas this guide does not cover in depth, and that are worth pursuing on your own.

Security. Two respondents raised this explicitly. Kai warned against giving LLM agents access to production systems or personal data at any career stage. I stressed legal liability: if an LLM produces code that mishandles user data, the human who shipped it is responsible. Security is a large topic and it predates LLMs, but LLMs add new surface area. Code that looks correct can still leak data, bypass authentication, or introduce injection vulnerabilities. Learning to recognize these patterns is essential.

Context hygiene. Luca mentioned this: how you structure and maintain the information that LLM tools use when working on your project. This includes rule files, project documentation, and the knowledge you expose to the model through configuration. If the model’s context is stale, incomplete, or contradictory, the output will reflect that. Managing this is a practical skill that most people learn through trial and error.

Tool configuration and setup. Kai raised user skills like tool configuration and MCP setup. The glossary in this guide defines these terms, but knowing what they are is different from knowing how to set them up for your project. The tooling changes fast. Learning how to configure your own environment, rather than relying on someone else’s defaults, is time well spent.

Maintainability. Samir raised the question of how to ensure an LLM-assisted codebase does not become a mess. When code is generated quickly, consistency and structure can degrade. He specifically recommended choosing tools, languages, and frameworks that have built-in automation for maintaining coherency, rather than relying on agents to maintain it after the fact.

Task delegation. Several practitioners use sub-agent orchestration, but the skill of deciding what to delegate, how to scope the delegation, and how to verify the result is not well documented anywhere. Samir described this as a key area: architecture, problem decomposition, and task delegation. As the tools become more capable of autonomous work, the ability to direct that work becomes more important.

People skills. Ben made the point most directly: “The topics I care about with juniors are about people, and these haven’t changed much with LLMs.” Clarity of expression, managing risk in changes, connecting people based on skill and interest. These are not LLM topics. They are professional skills that matter more, not less, when much of the implementation work is automated.

Learning paths and orientation. Sam raised a question that sits behind all the specific topics listed above: how do you know which skills are worth developing, and when? He identified the absence of evaluation criteria as the gap: early-career practitioners lack a structured way to judge whether learning a particular skill would be valuable in their current job, and they lack paths to reach expertise in the underlying technology when they need to. Some topics, he noted, would not naturally surface through practice alone; software modeling with LLMs is one example. Figuring out which subjects fall into that category, and then deliberately pursuing them, is itself a form of professional skill.

8. What is software engineering supposed to be?

An earlier version of this guide used the phrase “professional programming” throughout. The choice was deliberate: it named the specific activity most visibly affected by LLMs. That framing turned out to be too narrow, and this section is an attempt to explain why.

“Programming” describes one part of a larger job: converting a specification into working code. It does not name design, requirements analysis, validation, or accountability for whether the result actually serves the people it was built for. Those activities have always been part of what the title “engineer” was supposed to imply.

Engineering, in its original sense, is a licensed profession. Civil engineers design systems that will bear loads they have not yet seen, validate those designs against standards with real consequences, and are legally responsible for failures. Software engineering borrowed the word and, for a while, aspired to the substance. In practice, the industry fragmented the work into specialized roles: product managers take requirements, architects design systems, QA teams test, and “software engineers” implement. Many people today who carry the engineer title work primarily as implementers: given a specified task, they produce the code.

That narrow implementation layer is precisely what LLMs are well-suited to attack. Converting a clear specification into working code is the task current models handle most competently. If software engineering has quietly become mostly that, then the threat is proportionate.

“Development” carries a parallel history. Developers were originally meant to carry an idea all the way from conception to market: talking to customers, owning outcomes, bearing accountability for whether the product served the people it was built for. Here too the industry insulated the role. Many developers today work from tickets, without contact with customers, and without responsibility for what their work does in the world. The narrow scope of what “development” became is what LLMs are targeting. Someone who owned the full arc (problem, solution, outcome) would find these tools amplifying their work, not replacing it.

The specialization happened for reasons, and the structure shaped the people who worked within it. But the disruption is clarifying something the structure had obscured: the parts of the job LLMs cannot yet do—design, specification, verification, customer accountability—are also the parts that engineering and development, in their original sense, were always supposed to include.

Every practitioner in this guide who articulated a path forward pointed toward exactly those activities. Luca talks about directing: knowing what you want and encoding it precisely enough that a tool can follow. Samir talks about verification: the ability to evaluate whether the output actually satisfies the requirement. Sam asks the harder question: how do you know which skills to own rather than delegate? These are not new skills. They are the skills the profession was named after, practiced imperfectly, and is now being asked to return to.

The next section addresses the career question directly. But the framing matters. The job is not under threat. The narrow version of the job is.

9. Will you still have a job in ten years?

This is the question behind all the other questions. If you are considering a career in software engineering, or if you are early in one, you want to know whether the field you are investing in will still exist in a recognizable form.

I cannot give you a certain answer. Neither can the practitioners I spoke with, and their experience ranges from 3 to over 20 years. What I can give you is what they actually said.

All eight recommend learning about LLMs. The median recommendation score is 6 out of 7, with a range tightly clustered at 5 to 6. Nobody says do not bother. But nobody is at the ceiling either. Even the most enthusiastic practitioners stop short of unconditional endorsement. When I asked whether they push mentees toward LLMs, the 4/3/1 split (right away, gradually, not at all) maps directly onto different theories of how careers will develop from here. The confidence in recommending the tools does not translate into consensus on how or when to start.

Luca offered the most direct framing of the shift. “We still want more people to direct the building,” he said, “but we don’t need as many people to write the code. We want the people to do the building instead.” Implementation is being commoditized. Architecture, judgment, and direction are not. His focus for junior development has shifted accordingly: conversations now center on architecture and structure, and less time is spent on writing code. He would like less-senior people to become better at architecting and structuring things, because that is where human value is concentrating.

The practical advice that emerges from the data is consistent: invest in the skills that are harder to automate. Architecture, problem decomposition, verification, specification work, communication. Ben framed it as an amplifier effect: “LLMs are amplifiers. You need to be a good software engineer to start with before the technology can amplify your work.” The tool makes a competent engineer more productive. It does not make an incompetent engineer competent. Two of the eight respondents specifically want architecture and design covered in a guide for early-career professionals. The human skill is moving upward in abstraction.

Dani offers a ground-level view of this shift. He is one of only two respondents who report improvement in performance work, a domain where most find LLMs fall short. His recommended learning approach (“no reading, just practice”) reflects a philosophy that the skill of working with LLMs is best built through direct experimentation, not theory. His survey response about what a guide should cover is telling: getting an agent to explain its approaches, and learning how to review those approaches, which junior engineers currently lack the experience to do. He is describing judgment developed through hands-on iteration. His own results in performance work suggest the approach works. His philosophy may also point toward an answer to the pipeline question that follows: use the tools constantly, fail with them constantly, and build judgment through the friction.

Samir raised the concern that extends beyond individual careers. If groups can adopt this tooling and get a large boost in efficiency, he observed, then it becomes very hard to compete, leading to wealth concentration. He offered a thought experiment: consider the gap that currently exists between isolated societies and those in modern cities. It is possible that a similar gap will appear within existing levels of global societies, between those who adopt the technology and those who do not. The observation applies beyond programming jobs. It is about what happens when a powerful tool is unevenly distributed.

Ben added the labor-market dimension from the employer’s side: LLMs enable delaying hires and keeping teams small. “Keeping the team small is key from a strategy perspective,” he said. This is already happening. If fewer junior positions are created because a small team with LLMs can do the work that used to require a larger one, the entry path to the profession narrows. Luca sees this too: “People get fewer opportunities to make mistakes.” The paradox is real. If junior roles shrink, how do people build the experience to become senior? Luca acknowledges the tension but has no answer. Neither does anyone else.

There are two things I can tell you with confidence. First, the profession is changing, not disappearing. The practitioners here, people who build and maintain real systems for a living, all still believe the work matters and that learning the field is worthwhile. Second, the skills that will matter most are the ones this guide has been describing: understanding what the tool is, knowing when to use it and when to put it down, reviewing output critically, communicating clearly with other humans, and thinking about problems at a level of abstraction that machines do not yet handle well. The ground is shifting, and nobody can tell you exactly where it will settle. But the direction of the shift is visible, and you can move with it.

10. What do you actually do with all this uncertainty?

Every practitioner I spoke with recommends learning about LLMs. The median recommendation was 6 out of 7. The range was 5 to 6. Nobody hit the ceiling. These are eight people who use the tools daily and still stop short of full endorsement. They have seen enough to know the tools are valuable. They have also seen enough to know the story is unfinished.

I cannot tell you how much review is enough. Samir reviews everything, always, because his name is on it. Ben deliberately reviews less when the stakes are lower. Both positions are considered. Both come from experience. The line you draw will depend on your context and your own developing judgment.

I cannot tell you when to put the tool down. Luca watches his team members lose opportunities to make mistakes. He does not know whether that lost knowledge will come back through other doors. Neither do I.

I do not know whether the junior roles that train people into senior roles will survive in their current form. I do not know whether the review burden will break teams before better tooling arrives. I do not know what these tools will be capable of in two years. Nobody does, including the people building them.

Here is what I do know. The practitioners who get the most from these tools experiment constantly. They notice when the output feels wrong. They build mental models of where the tool works and where it fails, and they update those models when the tools change. They stay curious without becoming dependent.

Stay skeptical. Stay curious. Learn the fundamentals well enough to judge what any tool produces. Talk to experienced people and notice where they disagree. Uncertainty is uncomfortable, but in a field that changes this fast, it is also the right response.

The ground is moving. Keep your eyes open.

Raphael Poss is an entrepreneur who occasionally publishes field notes on systems, leadership, and the messy edge between technology and people.

Interested to discuss? Leave your comments below.

Comments

LLMs in Software Engineering What Experienced Practitioners Actually See

Opening

1. What are you actually working with?

2. Does it actually make you faster?

3. How do you tell it what to do?

4. How do you catch what it got wrong?

5. When should you put it down?

6. How does this change working with other people?

7. What else should you learn?

8. What is software engineering supposed to be?

9. Will you still have a job in ten years?

10. What do you actually do with all this uncertainty?

Comments

Keep Reading

Reading Time

Published

Category

Tags

Stay in Touch