Essay 03 · Practice

Getting your hands dirty

What it actually feels like to work with AI on complex intellectual tasks — and why the gap between talking about AI and using it matters so much.

By Anthea Roberts and David Wilkins · 4 April 2026 · 18 min read

This essay is part of AI, Complex Decision-Making and the Future of the Legal Profession, a project of the Center on the Legal Profession at Harvard Law School. We drafted the first essays in this series together at Harvard at the end of March, while teaching a course on AI and the Future of Law and hosting an event on agentic AI and complex decision-making. In our first two pieces, we argued that law's structural position makes it the gateway drug for AI across professional services, and we examined Harvey's Strategic Evolution on top of that position. Those pieces were analytical — mapping forces, naming patterns. This one turns inward. What does it actually feel like to work with AI on complex intellectual tasks? And what's stopping the profession from getting there?

A law student asked us whether her degree would be worth anything by the time she graduated. A partner with twenty-five years of experience asked whether his judgment was about to be devalued. A professor told us she didn't know whether to teach her students to use AI or protect them from it.

Everyone in law is feeling some version of this fear.

David always says the same thing when we encounter this: you have to name the fear. Don't pretend it isn't there. Some of it is well-founded. AI is changing what expertise means. The tasks being automated are real tasks. The firms staffing with AI agents are real firms. The funding is real money. These are not imagined threats.

But some of the fear comes from a different source: not understanding what the tools actually do, because you haven't used them. The gap between talking about AI and working with AI is enormous. And much of the legal profession remains on the talking side. In Harvey's Strategic Evolution, we noted that adoption measured by licenses is reportedly high, but actual use remains relatively low. That gap, between buying and using, between knowing about and knowing how, is what this piece is about.

Our series introduction promised we were "living it from the inside — experiencing firsthand the gap between model capability and domain expertise." Here we deliver on that promise.

What We Learned Working Together

Anthea has been working intensively with AI chatbots, APIs and agents for the last few years — not just as research tools but as thought partners in complex analytical workflows. She builds with them, reasons with them, has rebuilt how she works around them. She'd been telling David for two years that he needed to get his hands dirty — that talking about AI and using AI were different things entirely. He'd said yes, yes. And hadn't done it.

David is not AI-reluctant. He's interested, curious, has talked and written about AI and the legal profession for years. But he hadn't put sustained time into developing the practice — sitting with the tools, learning through repetition and failure and surprise what they actually do. He knew in theory that AI was not just a souped-up version of Google. But he didn't have practice directing and iterating with AI or interacting with agents.

This isn't a David story. It's a pattern we see in almost every senior leader we talk to. CEOs, managing partners, general counsel, senior academics. The reasons are remarkably consistent. They're already domain experts — learning something new from the basics feels like going backwards. They're senior and overcommitted. There's always someone else who can handle it. And they understand it in the abstract. They've read the reports, attended the conferences, written the memos. Isn't that enough?

It isn't. That became clear to both of us when we started this series.

Anthea brought her Dragonfly agents into the process. They searched for David's speeches, articles, and articles about his articles; the system primed itself with decades of his thinking about the profession. Agents also scanned broadly across AI and law commentary, giving the analysis a factual basis to complement our own observations. We sat together, working through how ideas connected, with the agents as active participants in the dialogue. Anthea would direct agents in ways that made David stop and say: "Wait, you just did what?" And seeing her do one thing sparked ideas for others: "Does that mean you could also do this?" he started to ask.

Anthea was not just completing an academic project — she was teaching David AI techniques as they worked. Everyone focuses on prompt engineering and that is important. But context engineering is also key. Anthea showed David how to provide agents with relevant context — grounding the analysis in specific data and in their own viewpoints rather than letting the model default to generic commentary. And she warned about the opposite danger: too much context crowds out the agent's capacity for the critical thinking that makes it a genuine thought partner. Context is important. Context rot is a problem.

We both think better through speaking than writing, so Anthea showed David how to start drafting by dictating thoughts through Wispr Flow — capturing ideas verbally to open the conversation with the agent. She showed how to get the agents to respond to and critique their hypotheses. We directed agents to undertake different types of analysis and reviewed their outputs. David would identify places where a plausible, fluent output missed something that only decades of studying the profession would reveal.

At times an agent would surface a connection neither of us had seen, and it would spark new discussions. For example, the discussion of the dark side of law as a gateway drug in our first piece originated with an observation by a Dragonfly agent. David had been calling law a gateway drug for years. Dragonfly pointed out that the difference between the metaphor of a gateway (positive) and a gateway drug (negative) suggested we should also explore the downside of law's increasing centrality. That seemed like a good point, we thought, particularly given Dan Wang's contrast in Breakneck between China's engineering state and America's lawyerly society, which we discuss in our first piece.

The experience revealed something that Anthea has seen time and time again. Abstract understanding of AI and practical understanding are different in kind, not just degree. David knew what large language models could do. He'd read the benchmarks, the capability assessments, the case studies. He had seen Anthea present on her work and had talked to other people who were using the models. But there is a difference between knowing that AI can surface unexpected connections and watching it do so on your own material — and then having the judgment to see that one of those connections is brilliant and another is nonsense. That second kind of knowing only comes from the experience itself.

This is how you learn AI. Not from a presentation. Not from a report. Not from a two-day workshop. It's experiential. The Greeks had a word for this kind of knowledge: metis — practical wisdom, as distinct from theoretical knowledge. You don't get metis from textbooks. You get it from doing. You have to sit with the tools, get frustrated when they produce mediocre output, learn to direct them better, develop a feel for when they're confident and when they're bluffing. You learn how much context is too much, when to break a complex task into steps, how to recognise a plausible-sounding output that's subtly wrong. You develop the instinct that something is off before you can articulate why. That can't be delegated. But these skills can also pass between more experienced users and more novice users when you work together, as we did. You can learn through a combination of doing, watching, and copying.

In our view, when it comes to AI, everybody needs to learn how to do this for themselves. Leaders in particular. The instinct to delegate AI to others — to the associates, to the innovation team, to the chief technology officer — is understandable. It is also fundamentally wrong. You cannot have a real, visceral understanding of what this technology does unless you've lived the experience firsthand. You cannot make good decisions about AI's impact on your practice, your firm, or your clients from the outside.

The good news is that many skills senior people already have transfer directly. Managing, directing and reviewing the work of AI agents draws on the same capabilities as managing, directing and reviewing the work of junior colleagues. You set direction, provide context, evaluate quality, catch errors, make judgment calls about when to trust and when to verify. The hump is learning how agents work instead of how the people you manage work. That hump is real. But it's smaller than it looks — and far smaller than most senior professionals assume.

The Software Preview

Software engineering is the first profession to reorganise around AI agents, and the pattern is specific and transferable. Code production used to be slow and expensive. Now it's fast and cheap. But the bottleneck hasn't disappeared. It has moved. What's slow and expensive is review: reading what the agents produced, evaluating whether it's correct, catching subtle errors, making decisions about what to build next. Production speeds up. The human role shifts from doing to reviewing. The anxieties among engineers are familiar (will this replace me?), the breakthroughs are vivid (it just did in fifteen minutes what would have taken me a month), and the lessons are transferable.

If the pattern holds — and we expect it will — legal production will speed up dramatically and the bottleneck will shift to the same places. The lawyer's value won't be in producing the first draft of the contract or the brief. It will be in knowing what to ask for in the first place, knowing what questions to ask or what guidance to give along the way, and knowing when a plausible output is subtly, but consequentially, wrong.

Directors, Coaches, and Editors

Anthea has written about this shift as the move from primary actor to secondary actor — from the actor, athlete, and writer to the director, coach, and editor. Human judgment enters at three points: defining the problem and approach (director), course-correcting during execution (coach), and verifying the output before it goes out (editor). You are still doing important work, but your work has moved up one level of abstraction.

These managerial or orchestration roles are not new to senior lawyers. A partner who directs a team of associates on a complex transaction is already a director. A senior counsel who sits with a junior during a negotiation prep session, redirecting their analysis when it drifts, is already a coach. A GC who reviews an associate's memorandum before it goes to the board is already an editor. The shift isn't from doing to not-doing. It's from directing people to directing agents — and many of the core capabilities transfer.

But that transfer has a precondition. As any sports fan knows, the best athletes rarely make the best coaches, and vice versa. Great players often perform instinctively — they can't easily articulate what makes them successful. Wayne Gretzky famously explained his goal-scoring by saying he skated to where the puck would be, not where it was. Inspiring, but not exactly a coaching manual. A great coach translates that intuition into concrete, teachable steps: watch the angle of your opponent's stick, know their tendencies, anticipate where the pass will go. The skill of performing and the skill of directing are different in kind. Senior lawyers know this intimately — everyone has seen brilliant practitioners who never learned to direct and review, who became micro-managers rather than orchestrators, undermining productivity and sapping team energy. The capabilities that transfer to AI direction aren't the capabilities of doing excellent legal work. They're the capabilities of orchestrating others who do it.

There's a crucial distinction that the transfer metaphor can obscure. With people, you can often delegate and step back — the associate carries institutional knowledge, the junior knows the formatting conventions, the team self-corrects on routine matters. With agents, you need to be all three: director, coach, and editor, often within a single working session. You set the direction, watch the execution unfold, course-correct when it drifts, and verify the output before it goes anywhere. The roles are analytically distinct but practically inseparable. Generative AI does not generate alone — the quality of its output depends as much on the nature of the collaboration as on the strength of the underlying model.

Each role draws on domain expertise and AI expertise simultaneously — and neither substitutes for the other. The director needs domain expertise to know which risks matter for this counterparty, and AI expertise to know how to frame the request. The coach needs domain expertise to recognise when the analysis has drifted, and AI expertise to know how to redirect. The editor needs domain expertise to know what "right" looks like, and AI expertise to recognise the characteristic patterns of AI error. Domain expertise is what makes the directing, coaching, and editing valuable. AI expertise is what lets you do it through this new medium. The first you already have. The second is what this piece is asking you to develop.

But this raises a question we'll explore in a forthcoming piece on the training crisis. We've argued that domain experts need to learn AI — that the hump is crossable. But what about the other direction? If AI automates the work through which domain expertise was historically built, who becomes the next generation of directors, coaches, and editors?

Tools Versus Technique

We'll return to that question. For now, the practical one: how do you actually develop the AI side? Anthea has written about the capability gap — the distance between having AI tools and knowing how to use them — and three levels of AI fluency that map onto the profession. The frameworks matter, but the experiential point matters more.

Anthea has seen this in her own team. Even in a small group of fewer than ten people who were ready and willing, the technique for working well with AI didn't spread on its own. It required sustained, applied, supported practice — sitting with people as they learned, passing over tips, working through problems together. Nearly everyone had an "aha" moment. But it took days or weeks, not hours.

The technique doesn't transfer by osmosis. It transfers through doing — and ideally through doing together, as we did.

The Fear Revisited

So what happens to the fear when you actually cross the gap?

Senior professionals have an advantage they may not appreciate. Twenty-five years of judgment combined with AI technique produces something enormously more effective than either alone. You are not starting from zero — managing agents draws on capabilities you've spent a career developing. But that potential only activates if you invest the time. Delegating instead of learning means losing the capacity to evaluate what the technology is doing to your practice, your clients, your firm's competitive position.

Some fears are validated. AI does change what expertise means. The tasks being automated aren't coming back. The competition from AI-native firms is real and accelerating.

Some fears are dissolved. Using AI doesn't necessarily undermine critical thinking — it can demand more of it. The director/coach/editor model puts human judgment at the center, not the periphery. The work changes in form. The need for judgment doesn't diminish.

And some fears are inverted. The real risk isn't that using AI will devalue your expertise. It's that not using it will leave you unable to evaluate what it produces — and unable to compete with those who can. It is one of the scariest times to be a lawyer. But it is also one of the most exciting. The scare and the excitement come from the same source: a technology that rewards those who engage with it and marginalises those who don't. The double bind we noted in Harvey's Strategic Evolution — if the tool isn't reliable enough, lawyers don't trust it; if it's too reliable, it threatens them — dissolves when you actually use the tools. What replaces it is something more productive: a working relationship with capabilities and limitations you understand from direct experience, not secondhand accounts.

What This Piece Asks

Our first two pieces mapped the structural landscape — why law is the gateway, what Harvey is building on top of it. This piece asks a more immediate question: are you willing to get your hands dirty?

The gap between those who work with AI and those who talk about it is widening faster than most people in the profession realise. It won't close on its own. It won't close by hiring the right vendor or subscribing to the right platform. It closes only through practice: sustained, applied, often uncomfortable practice. Not a workshop. Not a pilot. Not a memo from the innovation committee. The kind of practice where you sit with the tools, fail, learn, and develop the judgment that only comes from doing.

The profession has spent a century building a structural position at the intersection of everything that matters. Whether it holds that position in the age of AI depends, more than most people appreciate, on whether the people who occupy it are willing to learn the new medium. The fear is real. The experience is the only way through it.

In our next pieces, we'll step back out to the system level — examining the competitive forces reshaping the legal market, the training crisis that threatens the profession's capability pipeline, and where elite lawyering goes when the base of the pyramid is automated. But the question this piece poses is prior to all of those: none of it matters if you won't sit down and do the work.