What legal agents inherit from coding agents: Lessons from Legora
Introduces Legora, a platform building collaborative AI for lawyers, and the realization that led the team to rethink how they build agents for legal work.
Legora’s Jacob Emiling shows how coding-agent patterns—planning, sandboxing, doc editing, linting, and domain-specific tools—inspire legal-AI workflows that are reliable and auditable.
Summary
Jacob Emiling of Legora argues that the rapid advances in coding agents offer a blueprint for building high-quality legal assistants. He notes Legora’s mission to give lawyers an AI-based workspace that handles end-to-end tasks for over a thousand customers, including large law firms. The talk frames coding and legal work as parallel knowledge tasks, both text-heavy with strict review and signing processes, and then maps concrete patterns from coding agents to Legora’s legal domain. Three reusable buckets emerge: one-to-one UX patterns (planning, human-in-the-loop, safe tool calls), translatable patterns (borrowed architectures adapted to law), and domain-specific inventions (citations, due-diligence workflows). A key emphasis is on grounding answers with citations and on working with large doc sets rather than single documents. He also shares concrete engineering shifts, such as treating document editing like a looped read-edit-verify cycle and building an ESLint-like linting layer for contracts. The live demo showcases a project with employment agreements and a table-review tool that extracts structured data from hundreds of documents, evidencing the practical power of a coding-agent-inspired harness in a legal setting. The takeaway is that by borrowing from coding agents and inventing domain-appropriate tooling, Legora’s agents become more trustworthy, auditable, and scalable for complex legal tasks like due diligence. The overarching message: look to coding agents first for patterns, translate where possible, and build new, domain-specific capabilities where needed.
Key Takeaways
- Planning-based task execution: start with a detailed plan and iterative approval before the agent acts, mirroring coding-agent UX for complex legal tasks.
- Human-in-the-loop safety: require explicit approval for dangerous actions and tool calls to prevent accidental document loss or leakage.
- Read-edit-verify editing loop: run document edits as a read, edit, verify loop over a flat text representation of the docx file, which makes surgical edits across long documents reliable.
- Linting for legal documents: apply an ESLint-like static checker to contracts to catch broken cross-references and other mechanical errors, and feed the findings back to the agent.
- Domain-specific invention: implement end-to-end due diligence workflows with tools like table review to extract and organize data from hundreds of contracts.
- Pattern reuse across domains: leverage planning, sandboxing, and tool-usage patterns from coding agents as universal foundations for any vertical AI agent.
- Live-demo validation: real-world tasks (vacation policy updates, contract edits, mass document review) demonstrate the practicality and reliability of the proposed harness.
Who Is This For?
Essential viewing for legal tech teams and lawyers exploring AI agents: it explains how to transplant proven coding-agent patterns into legal workflows and where to innovate for domain-specific needs like due diligence and contract management.
Notable Quotes
"And today I want to talk about how we learn from coding agents when building an agent to do legal work."
—Intro to the core idea: reuse coding-agent lessons for legal tasks.
"We have over a thousand customers today, including some of the largest law firms in the world."
—Scale and credibility of Legora’s platform.
"The obvious next question becomes how do we make use of this and like how do we how do we get in on coding agents getting better and better to make our vertical better?"
—Motivation for mapping coding patterns to legal domain.
"There’s basically three buckets how you can learn from coding agent agents as we found."
—Framework introduction: reusable, translatable, and invent.
"If you want to apply it to legal tasks, you can basically mirror the tool design from coding agents and get a lot of benefits from it."
—Key insight about harness design and reinforcement learning alignment.
Questions This Video Answers
- How can planning mode from coding agents improve legal workflow automation?
- What is the table review feature in Legora and how does it help due diligence?
- What makes a legal document editing loop different from code editing, and how can it be implemented safely?
- How does Legora implement a linting system for contracts, similar to ESLint for code?
- What are concrete examples of domain-specific inventions for legal AI agents like due diligence and document citation grounding?
Legora, AI for lawyers, Coding agents, Agent UX, Planning mode, Human in the loop, Document editing, Table review, Due diligence
Full Transcript
Please welcome to the stage staff software engineer at Legora, Jacob Emiling. Hi, I'm Jacob. I'm an engineer at Legora, where we are building collaborative AI for lawyers. And today I want to talk about how we learn from coding agents when building an agent to do legal work. But before we get into it, a bit of context about what we do at Legora. We're basically building an AI-based workspace for lawyers to do end-to-end legal tasks on. We have over a thousand customers today, including some of the largest law firms in the world. We were originally founded in Stockholm, are valued at over $5 billion now, and did kind of a record sprint from 1 to 100 million in ARR.
But that's not really why you should listen to me today. About six months ago, we had a realization about how we built agents and realized that we needed to do something different. This all stemmed from this chart that probably everybody in this room is very familiar with. Over the last years we've all seen AI in coding go from bad autocomplete to good autocomplete, chatbots, agents, and background agents and beyond. And it doesn't look like it's going to stop anytime soon. I mean, we have heard talks this morning about new stuff in Claude Code and how organizations are leveraging AI to make software engineering more efficient.
And it's just going to keep getting more and more important for our day-to-day work as software engineers. The interesting part when looking at this, and if you're building in any other vertical you've probably noticed a similar thing, is that other verticals outside of coding were actually quite behind, looking back six months ago. This is a realization we had back then, and we thought a lot about why this is, and what the reasons are that coding is accelerating so fast. Why are agents so powerful for coding and not for legal work yet?
We started thinking about in which ways coding is similar to legal work and what analogies we can draw to learn from it. And there's actually a lot of parallels once you start looking into it. For example, both coding and legal work are based heavily on prior work. Both lawyers and engineers work a lot with text-based documents. There are strict conventions within organizations and firms. And there's also a strong review culture. For example, as engineers, we review each other's pull requests before code ships to production, or now also the pull requests of agents, and lawyers do a similar thing, where an associate would draft a document and, before it goes out to any client, a partner will review it and sign off on it.
And this is not something magical about legal and coding, some unique combination of those two that leads to all these parallels. There are actually a lot of parallels between any kind of knowledge work and coding. So if you build in any other vertical, it's a very interesting exercise to do. After looking at this, the obvious next question becomes: how do we make use of this, and how do we get in on coding agents getting better and better to make our vertical better? There are basically three buckets of how you can learn from coding agents, as we found. First of all, there's stuff that you can reuse one-to-one, stuff like to-dos, planning, sub-agents, sandboxes, human in the loop. It's a lot of things where, over time, coding agents figured out how to solve a certain UX or how to make agents good at certain long-running tasks, and these things turned out to be pretty universal for agents in general.
And it's also exactly the things that you get for free when using something like the Anthropic agent SDK or even managed agents. Next up is a bit more interesting: there's stuff you can translate. These are things that look very similar in your domain to some sub-problem in coding, and it's not as simple as just copying them. You kind of need to look at the pattern of how coding solved something and translate that pattern to your domain. We're going to look at some examples of all of these, so hopefully it's going to get a bit clearer.
And lastly, probably the most exciting part, because it's actually about building new stuff, is stuff you need to invent for your domain. For us, that's stuff like grounding every answer in citations so lawyers can verify where certain claims in the agent's output come from, or working with large sets of documents for due diligence use cases. Okay, we're going to walk through all of these now, after I grab a bit of water. First off, we're going to look at planning mode and human in the loop as examples of what you can reuse from coding agents.
You've all probably noticed that when you work on a bigger task together with Claude Code or your coding agent of choice, instead of just firing off a prompt and letting it rip for a few hours, you start off by writing a detailed plan of how the work should be done. And this basically solves a few different things. First of all, you're exploring the problem together with the agent. You're gathering a lot of context, and you're also making upfront decisions so that the agent doesn't need to make decisions for you when it actually executes the work.
And yeah, it turns out this works exactly the same way in legal if you want it to. If there's any bigger legal task a lawyer works on now in Legora with our agent, they would just plan all the work beforehand, iterate on the plan, and when they're happy with all the decisions and assumptions made, they can let the agent execute on the plan. So it's literally just a one-to-one translation of the UX that coding agents have established and found to work very well for human-agent collaboration. Next up, a similar example: approval of tool calls, or in general, dangerous actions.
For example, in your coding agent, you don't want your agent to execute random shell commands that are not sandboxed. You want it to ask you: should I run this? Is this safe? You answer yes or no, and then the agent acts accordingly. It's a similar thing in an agent working with legal stuff. You don't want it to randomly go and delete a bunch of important client documents. You want to be in the loop for certain actions. So again, it's the same established UX, and we just took it as it is and built it into our agent, and therefore we didn't need to do the whole iteration loop of figuring out what the perfect UX is for this use case.
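Editor's sketch: the approval pattern described here can be illustrated with a small gate in front of tool execution. This is a minimal sketch and not Legora's implementation; the tool names, the `DANGEROUS_TOOLS` set, and the `approve` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical tool names; which actions count as dangerous is a product decision.
DANGEROUS_TOOLS = {"delete_document"}


@dataclass
class ToolCall:
    name: str
    args: dict


def run_tool_call(
    call: ToolCall,
    tools: Dict[str, Callable[..., str]],
    approve: Callable[[ToolCall], bool],
) -> str:
    """Run a tool call, asking the human first when the tool is flagged as dangerous."""
    if call.name in DANGEROUS_TOOLS and not approve(call):
        # The refusal is returned as a tool result so the agent can act accordingly.
        return f"DENIED: user rejected {call.name}"
    return tools[call.name](**call.args)


# Toy in-memory document store and tools, for illustration only.
store = {"nda.docx": "NDA text", "lease.docx": "Lease text"}


def read_document(name: str) -> str:
    return store[name]


def delete_document(name: str) -> str:
    del store[name]
    return f"deleted {name}"


tools = {"read_document": read_document, "delete_document": delete_document}

# Reads run without a prompt; the delete is blocked because the reviewer says no.
print(run_tool_call(ToolCall("read_document", {"name": "nda.docx"}), tools, lambda c: False))
print(run_tool_call(ToolCall("delete_document", {"name": "nda.docx"}), tools, lambda c: False))
```

In a real product the `approve` callback would surface a prompt in the UI; here it is a plain function so the gate can be exercised without any interface.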
We could just take what coding agents already figured out and apply it to our domain. Okay, next up is going to be a bit more interesting. It's about document editing, and it's something where we needed to translate from the coding agent domain to our domain. To give a bit of context here, lawyers basically love working with Microsoft Word files. They spend a lot of their time in Microsoft Word drafting stuff, redlining documents, reviewing documents, and going back and forth. So obviously, if you build an agent for legal work, this is something you need to solve very well.
And for us, this is also something we solved quite a while back in some way, and that's what I want to look at first: how we solved this problem before we had this grand realization of doing it differently. Maybe also interesting to know is why docx editing is a bit more challenging than just editing plain text files. A docx file is basically a zip file of a bunch of XML files with a lot of metadata and a lot of noise in there. So it's not as simple as editing a markdown file.
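Editor's sketch: to make the "zip file of XML files" point concrete, the snippet below builds a toy archive in memory and pulls the paragraph text back out of `word/document.xml`. It uses only the Python standard library. The archive is deliberately incomplete (a real docx also needs parts such as `[Content_Types].xml`), so it illustrates the structure and is not a valid Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Two paragraphs; the second one is split across two runs, which is common in real files.
DOCUMENT_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<w:document xmlns:w="{W}">
  <w:body>
    <w:p><w:r><w:rPr><w:b/></w:rPr><w:t>1. Vacation</w:t></w:r></w:p>
    <w:p><w:r><w:t>The Employee is entitled to </w:t></w:r><w:r><w:t>25 days.</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def build_toy_docx() -> bytes:
    """Build a minimal, docx-like zip archive in memory (not a complete Word file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", DOCUMENT_XML)
        archive.writestr("word/styles.xml", "<styles/>")
    return buffer.getvalue()


def paragraphs(docx_bytes: bytes) -> list:
    """Return the plain text of each paragraph, joining the text of all its runs."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        for p in root.iter(f"{{{W}}}p")
    ]


for text in paragraphs(build_toy_docx()):
    print(text)
```

Even in this tiny example, one visible sentence is spread over several XML elements, which hints at why editing the raw markup directly is noisy.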
So what we did initially we had some kind of top level agent handing off an editing intent to another reasoning model and then that model got context about the current document that needs to be edited and a bunch of instructions on like high level what should be edited and it then reasoned about okay which edits to make throughout the documents. So if you have for example like a 50page template that you want to fill out that model would then figure out oh I need to insert something on page one page three page five page six but it doesn't write out the full edits because that would be like a lot of tokens and then the models used to get very lazy and just started like filling out the templates mid um mid uh document.
So what we then did with these like individual like editing markers where we know in this place we need to insert something like this in this place we need to insert something like this. We gave them to individual models and had them write out the full edits with like style information and also pay attention to what was before and after it to like synthesize it into the document. And this worked very well and it solved a lot of the exhaustiveness problems. But it also brought a bunch of other problems because basically what you have with this kind of setup is you have a lot of individual LLM calls with independent reasoning, different context, different tool tools and you just have like all these handoff problems.
So for example, if you add like a bunch of new tools and things your your top level agent can do, suddenly your agent starts handing off to this editing to uh like this editing reasoning model and gives it some instructions about pulling context via this tool that this editing model doesn't even have. So you run into all these weird weird issues like the more powerful your your agent gets. And uh of course you want you don't want to like limit how how powerful your agent can be and uh limit in too many ways. So if we contrast that to what all the coding agents out there converge to, we see it's quite different.
Basically, all the coding agents out there work in a way where they just read, edit, and verify things in a loop. You read, usually line-based; it's simple plain text files that these agents need to read. Editing then happens with some tool that does string replace, patches, or line-based editing. And afterwards the model either reasons about what to do next or runs some static type checking, linting, stuff like this. As engineers, we used these coding agents on a daily basis, and we also tried playing with them on bigger documents. For example, we threw in these large JSON documents and had the agent make surgical edits in 20 places, and it just worked. It didn't have all these weird exhaustiveness issues that we had run into with editing legal documents in the past, and this made us very intrigued to basically change how our agent does document editing.
So what we tried then is basically the exact same thing, just with slightly different tool implementations to make it work. We have the same editing loop: a read tool, an edit tool, and a verify step. The reading happens in the form of an intermediate representation of the docx file. We basically take the docx file and transform it into a flat representation, a single text-based file that the agent can interact with. Then we have a bunch of editing tools that work on this, and the agent can just go back and forth between reading and editing, see its own edits, and keep looping.
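Editor's sketch: the read, edit, verify loop can be illustrated on a flat, line-based stand-in for a document. Everything here is an assumption made for illustration: the `FlatDocument` class, the `[PLACEHOLDER]` convention, and the scripted `fill_template` policy that stands in for the model deciding which edit to make next. The talk does not describe Legora's actual representation or tool signatures.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical convention: unfilled template fields look like [EMPLOYEE].
PLACEHOLDER = re.compile(r"\[[A-Z_]+\]")


class FlatDocument:
    """Hypothetical flat, line-based stand-in for a docx file."""

    def __init__(self, text: str) -> None:
        self.lines = text.splitlines()

    def read(self, start: int = 1, end: Optional[int] = None) -> str:
        """Read tool: return 1-indexed, numbered lines."""
        end = len(self.lines) if end is None else end
        return "\n".join(f"{i}: {self.lines[i - 1]}" for i in range(start, end + 1))

    def edit_line(self, number: int, new_text: str) -> str:
        """Edit tool: replace one line and report the outcome as a tool result."""
        if not 1 <= number <= len(self.lines):
            return f"ERROR: line {number} does not exist"
        self.lines[number - 1] = new_text
        return "OK"


def verify(doc: FlatDocument) -> List[Tuple[int, str]]:
    """Verify step: list every (line number, placeholder) that is still unfilled."""
    return [
        (number, match.group(0))
        for number, line in enumerate(doc.lines, 1)
        for match in PLACEHOLDER.finditer(line)
    ]


def fill_template(doc: FlatDocument, values: Dict[str, str], max_steps: int = 20) -> int:
    """Loop verify -> read -> edit until nothing is left to fix; return the edits made."""
    for step in range(max_steps):
        findings = verify(doc)
        if not findings:
            return step
        number, placeholder = findings[0]
        current = doc.lines[number - 1]  # re-read the line before editing it
        doc.edit_line(number, current.replace(placeholder, values[placeholder]))
    raise RuntimeError("gave up: placeholders remain after max_steps")


doc = FlatDocument(
    "Employment Agreement with [EMPLOYEE]\n"
    "1. Vacation\n"
    "[EMPLOYEE] starts on [START_DATE]."
)
fill_template(doc, {"[EMPLOYEE]": "Ada Lovelace", "[START_DATE]": "1 March"})
print(doc.read())
```

The point of the shape is that each pass sees the document as it currently is, so a missed spot is picked up on a later iteration instead of being lost in a one-shot handoff.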
And yeah, I want to quickly talk about the first time we tried this out, because it was quite a realization for us to see this work. We got intrigued by coding agents solving these exhaustive editing things, so we built a PoC of editing docx documents in the same way. And as any good PoC goes, you don't have any structured eval set to run against; you just start vibing with it, talking to the model and seeing how well it works, and you have these ideas about which things are hard and which are easy. So I looked at this with a colleague of mine who was deep into the challenges of editing legal documents. He built our first Word add-in initially, so he knows about all the hard parts of editing legal documents exhaustively. I asked him: okay, what should we try with this? What's something hard, what was challenging before? And he said: just pass it this 10-page document and ask it to translate it paragraph by paragraph from English to Swedish, because apparently that was something that, in our previous setup or with previous models, usually was very challenging to get done exhaustively.
So we did this. We passed the thing in, asked the agent to translate paragraph by paragraph from English to Swedish, and it just started editing paragraph by paragraph, as you would expect. Then sometimes it got a bit lost, started rereading the whole thing, saw, oh, I forgot a paragraph up here, went back, edited the paragraph, and just kept doing this for like 10 minutes. At the end, we opened up the document and everything was translated. And the funniest part about this was that, to test how well this new harness and tool design works, we ran this whole thing on Haiku.
So this wasn't even like a good model. And uh I think is like this was really a moment for us where we realized okay this might actually work like there might actually be something like about mirroring tool design from coding agents and getting a lot of benefits from it. And I think the interesting part here is that or like my mental model is kind of that you want to have the model almost feel like it's inside a coding agent harness and it just does a legal task because then suddenly you get these benefits from all the reinforcement learning and fine-tuning that is done on the coding agent harnesses because your harness looks very similar in tool design and leads to very similar trajectories in in tool calling.
And um a lot of stuff you just get get for free which is pretty cool. Okay, that's it for editing. Another example I want to quickly go over is linting for legal documents. Yeah, this is exactly what it looks like. It's uh basically ESLint but for for legal documents and uh you can imagine I mean as as engineers we we use a lot of like static type checkers and tools and they are very powerful also for agents to get like a feedback loop on all the mechanical stuff that you want to have them do right and um turns out also in legal documents there's a lot of like static things that you you can actually verify that help an agent on like a bigger task.
For example, if you have a big contract where an early section references a later paragraph, and an agent removes the paragraph that's being referenced, it's cool to have a static way to check that all references are still intact, and then give the agent this feedback loop of: hey, you might want to edit the section at the bottom where you reference this paragraph. So this is also something very cool, and you can take this way further by, for example, doing LLM-based things inside of this linter to lint less mechanical stuff.
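Editor's sketch: a static cross-reference check of the kind described here can be small. The version below assumes numbered headings such as `3. Christmas Shutdown` and references written as `Section N` or `Clause N`; both conventions, the rule name, and the `Finding` shape are hypothetical and are not taken from Legora's linter.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed conventions: headings start a line with a number ("2.1 Term"),
# and references read "Section 2.1" or "Clause 2.1".
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+\S")
REFERENCE = re.compile(r"\b(?:Section|Clause)\s+(\d+(?:\.\d+)*)")


@dataclass(frozen=True)
class Finding:
    line: int
    rule: str
    message: str


def lint_references(text: str) -> List[Finding]:
    """Report every reference that points at a section number with no heading."""
    lines = text.splitlines()
    defined = {m.group(1) for line in lines if (m := HEADING.match(line))}
    findings = []
    for number, line in enumerate(lines, 1):
        for ref in REFERENCE.finditer(line):
            if ref.group(1) not in defined:
                findings.append(
                    Finding(number, "dangling-reference", f"{ref.group(0)} does not exist")
                )
    return findings


contract = """1. Definitions
"Shutdown" has the meaning given in Section 3.
2. Term
This agreement is subject to Section 4.
3. Christmas Shutdown
The office closes for one week."""

for finding in lint_references(contract):
    print(f"line {finding.line}: [{finding.rule}] {finding.message}")
```

Returning findings with a line number and a rule name mirrors how code linters report problems, so the output can be handed back to the agent as ordinary tool feedback.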
So you can make this feedback loop feel very similar to coding. Okay, that's it for stuff that we translated while building our agent. Lastly, I want to talk about stuff that we needed to invent when building our agent. There are equivalents for this in any domain. There are very domain-specific things that people working inside a domain need to solve every day, and you want your agent to solve them as well as a human could, to get the best possible outcomes. A good example of this in our case is due diligence.
So you can imagine you have two companies, company A buying company B and there's a lawyer in the middle who then gets the task to make sure everything is fine with this transaction. And what they need to do is basically they need to go through all the contracts that company B has with other parties and review them. And uh as one can imagine if this is like a large company they have a lot of contracts like thousands and thousands of contracts you need to review and also other other like binding documents and uh I mean of course lawyers do this this task today and there's a lot of tools that help them do this and we also have such a tool on the platform already it's called table review um it's what we see here on the on the slide also it's this gridlike interface where every row is a document and then you can do a structured data extraction by adding columns to it and specifying what you want to extract.
So instead of reading the whole document, you can lean on an LLM to extract the relevant pieces of information for you and help you get a good overview of a large document set. Then you, as a lawyer, would filter down on specific parties or specific red flags that you want to follow up on and dive deeper into. So when we're building our agent to do legal work, we obviously need to give it a way to do the same kind of work. What we do is just give it access to table review, the feature that we have on the platform, in the same way that a human would use it.
So our agent can take a folder of documents and throw it into this tool. The agent basically specifies what to extract, we generate all these cell values, and then it can filter down this huge grid of data and figure out what's relevant for the task at hand. There are probably analogies for this in all different domains. For example, accountants would want to have some way to do reconciliation very mechanically. Doctors probably also have some very specific tasks they need to solve. So you can really think of this as the last, I don't know, 20% of your agent that makes it work really well for a specific domain.
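Editor's sketch: the grid described here, with one row per document and one column per extraction instruction, can be modeled in a few lines. The class names and the keyword-based `extract` function are hypothetical; in the product the cell values come from an LLM following the column prompt, which this sketch replaces with a plain function so it runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Column:
    name: str
    prompt: str  # the instruction that would be sent to an LLM
    extract: Callable[[str], str]  # stand-in for the LLM call


@dataclass
class TableReview:
    documents: Dict[str, str]  # file name -> document text
    columns: List[Column] = field(default_factory=list)
    cells: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add_column(self, column: Column) -> None:
        """Add a column and fill its cell for every document (row)."""
        self.columns.append(column)
        for doc_name, text in self.documents.items():
            self.cells[(doc_name, column.name)] = column.extract(text)

    def filter(self, column_name: str, value: str) -> List[str]:
        """Return the documents whose cell in the given column equals the value."""
        return [
            doc_name
            for doc_name in self.documents
            if self.cells.get((doc_name, column_name)) == value
        ]


def categorize(text: str) -> str:
    """Toy keyword classifier used in place of a model."""
    lowered = text.lower()
    if "employment" in lowered:
        return "Employment agreement"
    if "insur" in lowered:
        return "Insurance policy"
    return "Other"


review = TableReview(
    documents={
        "a.docx": "Employment Agreement between Acme and Ada",
        "b.docx": "Workers compensation insurance policy",
        "c.docx": "Employment Agreement between Acme and Bob",
    }
)
review.add_column(Column("category", "Classify the document type.", categorize))
print(review.filter("category", "Employment agreement"))
```

An agent using such a tool would choose the columns and prompts, then call `filter` to narrow the set, for example to collect the employment agreements into a folder as in the demo.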
Okay, those are the three different categories of things. Now we're going to have a quick live demo of how this looks in our agent today. We're going to pray that the live demo works. Okay. Is it visible? Yes. Great. So, this is Legora. You log in, you get a chat box like in a lot of AI tools. We can look around a bit. I prepared a project that's called employment agreements. If you look here, we have a bunch of files in here. It's not that many. We have a bunch of employment agreements for different fictional employees, and we have an HR policy that specifies what kind of benefits our company has and a bunch of stuff like this.
So if we want to go in here now and give our agent a task, we can um ask it to I want to give every um employee an extra week of vacation during Christmas. And let's plan out the work we need to do for that. Drop in a prompt here. And um now what we expect the agent to do is to look first of all make itself familiar with its environment. It's searching for stuff here. Now for example it's searching for employment agreements, vacation, time off to figure out okay what agreements do we have? What policies do we have?
And then we wanted to like reason a bit about okay what needs to be done for this change and create us a plan. And now it's creating this plan. And if we look in here, we're going to see, okay, this is still streaming in. It says we need to do a bunch of steps here. First of all, we need to review all employment agreements. We need to amend some employment agreements to add the Christmas shutdown clause. Very nice. And then we also need to update our HR policy manual. And I mean the cool thing here is that like if you would drop an agent into a codebase and ask it to do a random thing, it would first go out and like collect all this context.
Like we didn't need to tell it that we have a policy that we need to update or we have five employment agreements we need to update. It basically goes and collects all this these things uh that need to need to happen for this change. And uh if I mean we could now iterate with the agent on this and tell it oh no we don't want an extra section for this or phrase it like this or whatever. But uh for the for the purpose of this demo, we are fine with this and we can send off the plan here and have it execute what we planned out.
And and what happens now under the hood is this exact editing loop I I talked about. First it starts reading the documents. Then it reasons about what edits to do. Then it calls some editing tool. Then it goes back and reads the documents to see that the edits were made and if everything is looking fine. So, it's uh yeah, thinking a bunch here now about how to do this, which is great because it means it's hopefully going to work. And this is the moment where good sales people start to talk about stuff. Not very used to demoing the product.
So, it's going to be a bit boring. Oh, but we should pretty soon get some stuff back. Okay, now it's copying all the documents it wants to modify over to a different space to modify them there, so I can review the changes before they get written back, and in a second we should start to get the first edits streamed in here. Yes. So it's good. Okay, if we collapse this, we can see now that it starts editing the different employment agreements, and I can look in here. Ah, it's actually interesting. There was one employment agreement that I had from before for testing, where it already had the Christmas shutdown clause, but now it decided to also update this one to make the dates uniform with the other ones.
So, that's pretty cool. But I basically see here what it's adding to the individual documents and all the edits it's making. Then I can also see the redlined version of the original document. So, if I click on one of the agreements here... okay, this is a boring one because it's the one that had the clause, but this is one of the employment agreements that it edited. I can go in here and see it redlined, with the right formatting and the right indentation: it added a clause to the benefits section about the Christmas shutdown that every employee is going to benefit from.
And now it also goes and updates the annual leave policy. In the same way, I can go to the policy here. This is our policy, and then we should get a redline, yeah, in the annual leave section, about the new thing we just added. And now it also gets a bit ambitious and wants to draft an employee announcement memo, to communicate to our employees that they have more vacation now. We could select which template to use for that memo and how it should look, and it's just going to go and draft this thing.
But yeah, we don't need to wait for that, because it's just going to spit out a document in the end. Okay, that's it for this part of the demo. Another thing I want to show, related to the due diligence, mass document review use case, is this one here. Here we have another Legora project with a bunch of files in there. We have around 100 files in here that are random documents that might pop up in the due diligence of a company. For example, we have... I don't even know what's in here.
It's insurance policies, workers' compensation. It's random documents, and of course I don't want to go through them one by one and start sorting them. So what I do instead is take my prepared prompt here, because this is too long to speak reliably, and put it in here, and I'm basically asking the agent to... Oh, we also quickly need to go somewhere else here. Got the right one. So I'm basically asking the agent to do a structured review of all files in this project. I want to know what categories of contract exist, which are the interesting parties, and if there are any red flags in the stuff.
Yeah, it's a very legal-specific frontier. Then, based on that, after it has found all the different types of agreement, I also want it to put the different employment agreements in a folder for future work. And again, it doesn't want to jump right into the task, which is fair, because it's not the most detailed prompt here. So it tries to write up a plan of how to do this, but it looks kind of fine. So, we're just going to have it do the thing and see what comes out of it.
And now it starts to first create a table review, which is this grid-like document extraction thing I talked about earlier. In a second, we should actually have this thing to look at. It also figured out that there are like 100 documents here and that it will take a few minutes while the AI processes each file. It's very aware of how this works. Okay. I mean, under the hood it now needs to write out all the different files it wants to put in this table review. So it's actually quite a long tool call.
So that's why it's going to take a second for this to go through, but it's looking good. Okay, there we go. So now it's still doing stuff, and I could stay here and it would keep talking and keep doing the task. But what I want to do instead is actually open this up and look at this part of the platform. So this is table review. You saw there's a grid on the screenshot earlier. We have all our documents here in different rows. I can open them up and see the documents here.
And then we have columns that define data that needs to be extracted. For example, here the agent decided it wants to extract the document category and writes a detailed prompt out here what it means and how to how to extract this data point. And uh the cool thing here is that this is actually way more than just like a flat grid of data. It's actually a fully interactive reviewable surface of for data. So I can open this up here and um I can see the document on the right here and I can see the extracted data points.
And what I would do now, as a lawyer, is go in here and in some cases actually verify the answers. For example, I can see, okay, this document is flagged as compliance regulatory. I can see the reasoning of the LLM doing this task, why it was flagged that way, and I can even click on here and get highlights in the document where it pulled that content from. This example is pretty boring because the document is like five rows. So, I'm going to find something a bit more interesting.
Are we going to see this in action? So, I have some employment agreements again here. They keep coming back. Um, so here I have a bit of a longer document and here it classifies it as employment agreement and I could then go here after I verified this and actually mark this as verified and then I can also see like the progress of verification is being tracked. So if you're like some more senior lawyer doing this kind of task, you could like ask an associate to review all the different data points here to double check and then you can collaborate on this surface with AI and also between different humans.
And then I can see also here it extracted the parties here. I can click on here for the citations get taken to the right part of document and here it also raised a bunch of red flags and concerns about uh stuff being placeholder in our employment agreements which is probably not that great in the real world. So yes, this is uh this is pretty cool. And if I go back here, then I should also see yeah, I don't know where it is. H yeah, here we see that the thing is verified. Get a cool green check mark.
So if we go back to the agent that's hopefully finished our task in the meantime, we actually came back with something. So it did this whole thing in the background. It created this table review that we looked at now. And I mean we also don't need to look at this, but we can use it as a human in the loop step for verifying outcomes. just get a bit of a better understanding of the underlying data. And um then we come back here. The agent figured out the document categories. It figured out the key parties. There's some interesting red flags here that I can't judge how good they are because I don't know what what is in there and what not.
And uh it also moved all the employment agreements to a specific folder. So if I click in here now, we have this folder here about employment agreements. And it has all the employment agreements in these like hundreds of documents in there. Yeah. So that's it. That's it for the demo. If you can go back to the slides. Great. Okay. To quickly zoom out a bit and um round this off, I think it's very interesting to also think about why coding as a domain is so much ahead in terms of like AI adoption. And there's probably a lot of different reasons for that.
There is, for example, the fact that engineers are just more willing to try new tools and to adopt new technology in their work. It might also be the case that coding gets so much focus now because solving coding unlocks a lot of growth in other niches of software engineering, and you can accelerate progress much quicker by solving it. But the cool thing is, if you're building in any other vertical, you don't really care why it's ahead, because you can just keep looking at what coding agents ship and you can reuse what's usable for your domain.
You can translate stuff that's similar but not really the same. And then the last part you actually invent and and come up with for your specific domain agent. And yeah, that's the framework. Uh for any vertical agent, you can just keep looking at coding agents. Whenever they ship something new, you steal the thing and benefit from it. Yeah, that's it for today. Thank you.