33<div class =" sans-text " >
44
55** Joshua Levy** ([ GitHub] ( https://github.com/jlevy ) , [ Twitter] ( https://x.com/ojoshe ) )\
6- * v0.1.7 (June 2025) – Draft!*
6+ * v0.1.8 (July 2025) – Draft!*
77
88<div class =" boxed-text " >
99
10- This is a rough draft.
11- I’d like to revise this as I get feedback and I’d be grateful if you share your thoughts
12- on any part, especially skepticism or disagreement.
13- The ideas here come from many people, no doubt far more than I have currently cited.
14- I would like to include more references and credits where they are due.
15- If you know relevant work or if your own work is relevant, please let me know so I can
16- reference it.
10+ This is a draft. I’d be grateful if you share your thoughts on any part, especially
11+ skepticism or disagreement, as I revise it.
1712
1813The fastest way to reach me is to tag or DM me: [ x.com/ojoshe] ( https://x.com/ojoshe )
1914
2015</div >
2116
2217</div >
2318
19+ ## Acknowledgements
20+
21+ I’m grateful to many people who have discussed key ideas or who have given me feedback,
22+ including Adam Cheyer, Amina Green, Sean Grove, Carlos E. Perez, and Russell Power.
23+
24+ I would like to include more references and credits where they are due.
25+ If you know relevant work or if your own work is relevant, please let me know so I can
26+ reference it.
27+
2428## Introduction
2529
2630This work is a collection of reflections and research ideas on software engineering and
@@ -258,10 +262,10 @@ software) to human problems.*
258262
259263### What is Good Engineering?
260264
261- Another fundamental question is, why does it seem like software engineering is quite
262- hard, both for humans and now for LLMs?
265+ Another fundamental question is, why is software engineering so hard, both for humans
266+ and now for LLMs?
263267
264- I’d like to say the answer is simple, but it’s not: the answer is * complexity* .
268+ The answer, of course, is * complexity* .
265269
266270For better or worse, it’s a fact that humans have varied and insatiable desires.
267271And our desires arise not just from our own needs or imagination, but out of what we see
@@ -434,7 +438,7 @@ and know how to build it*. It’s worth being aware of the key ones:
434438
4354394 . * Innovator’s Dilemma* : They understand and have the resources, but to do so seems
436440 irrational because it is * relatively* less profitable than working on existing,
437- successful products and features
441+ successful products and features
438442
4394435 . * Conway’s Law* : They understand the problem and resources and incentives to solve the
440444 problem, but still don’t solve it (or solve it poorly or do something else).
@@ -567,15 +571,15 @@ natural language) and **precision** (how exact the descriptions are).*
567571
568572</div >
569573
570- ### Exact and Inexact Processes
574+ ### Counting Nines
571575
572576Experienced designers and engineers sometimes talk about counting nines.
573577This is because of a fundamental fact:
574578
575579<div class =" boxed-text " >
576580
577581* The design of products is fundamentally shaped by the ** cost of errors** : whether they
578- need to work 90%, 99%, or 99.999% of the time.*
582+ need to work 90%, 99%, or 99.999% of the time.* [ ^ uptimevscorrect ]
579583
580584</div >
581585
@@ -586,35 +590,100 @@ Consider these three systems:
586590
5875912 . The online messaging system you use to write to your physician about it
588592
589- 3 . The software and process the pharmacisst uses to fill your prescription bottle with
593+ 3 . The software and process the pharmacist uses to fill your prescription bottle with
590594 the correct pill
591595
592596The first can tolerate a 1–10% error rate since you’re reading many things and will form
593597your own opinions. The second can tolerate perhaps a 1% error rate.
594- The third has very low tolerance for error, perhaps a 0.01 % error rate or less.
598+ The third has very low tolerance for error, perhaps a 0.001 % error rate or less.
595599
596600It is certain that each of these systems could improve with machine learning and AI. But
597601if you want to add AI or machine learning to each of those use cases, the approach needs
598602to be very different.
599603You can’t get away from the fact that some systems need high reliability and precision.
600- In other cases, we can adapt to errors because the cost is of errors is low.
604+ In other cases, we can adapt to errors because the cost of errors is low.
605+
606+ [ ^ uptimevscorrect ] : In engineering, “counting nines” traditionally refers to * service
607+ availability* (uptime): 99.99% uptime means less than 53 minutes of downtime per
608+ year. However, the concept equally applies to * correctness nines* for critical
609+ processes.
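
The arithmetic behind counting nines is simple enough to sketch. The helper names below are illustrative only, and the year is assumed to be 365 days. The same rate is shown two ways: as allowed downtime for availability, and as expected failures per million operations for correctness.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Minutes per year a service may be down at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def errors_per_million(success_rate: float) -> float:
    """Expected failures in one million operations at the given success rate."""
    return (1.0 - success_rate) * 1_000_000


for rate in (0.90, 0.99, 0.9999, 0.99999):
    print(
        f"{rate:.3%}: {downtime_minutes_per_year(rate):10.1f} min/year down, "
        f"{errors_per_million(rate):9.0f} errors per million operations"
    )
```

At four nines the downtime budget works out to roughly 52.6 minutes per year, and each additional nine divides that budget by ten.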
610+
611+ ### Exact and Inexact Processes
612+
613+ Most complex human endeavors can be viewed as a ** socio-technical system** that involves
614+ multiple people, tools, and written or unwritten processes.
615+ Many activities are informal or ad-hoc, and some are more repeatable.
616+
617+ As systems and products mature, processes are codified and automated for efficiency and
618+ consistency.
619+ And since software began “eating the world,” many of the tools and processes
620+ are implemented in software.
621+
622+ For example, in a mature software business, there is a process for how software is
623+ deployed as well as processes for customer support.
624+ In an accounting department, there is a process for calculating revenue as well as a
625+ process for auditing financial statements.
626+
627+ Let’s call standard processes like this ** procedures** .
601628
602- As we’ve seen, specification in English and code in Python have similar purpose but not
603- interchangeable. For some situations, exact descriptions are of fundamental and
604- unavoidable importance.
605- For others, convenience and low cost are more important.
606- For almost any project, team, or company, both exact and inexact processes are needed.
629+ Procedures are not all the same.
630+ Some procedures are exact.
631+ When software is deployed, the program that users access should be * exactly* the same
632+ program that the developers tested.
633+ When an Excel spreadsheet calculates the revenue for the month, it should be * exactly*
634+ the same calculation performed last month.
635+
636+ Other procedures are inexact.
637+ Even the most codified customer support processes are not exact.
638+ They require best-effort judgement in certain cases.
639+ The same is true for auditing financial statements.
640+ The decisions may often be clear-cut but there will inevitably be situational gray areas
641+ that need ongoing resolution.
642+
643+ <div class =" boxed-text " >
644+
645+ Socio-technical systems require a combination of both ** exact** and ** inexact**
646+ procedures.
647+
648+ Exact and inexact procedures are qualitatively different and not interchangeable.
649+
650+ </div >
651+
652+ Imagine that you have code that performs a task, such as using an API to get your sales
653+ data from your payment processor.
654+ And you have certain pieces of code to calculate customized monthly sales metrics.
655+ It’s easy to combine these into reporting software that does both.
656+
657+ This is how software is built: by ** composition** . A typical application is the
658+ composition of dozens or hundreds of layers like this.
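
As a minimal sketch of that kind of composition, assume a hypothetical `fetch_sales` function standing in for the payment processor API (here it just returns fixed data) and a hypothetical `monthly_totals` metric. Neither name comes from a real library.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


# Hypothetical stand-in for a payment processor API call; it returns fixed
# data so the sketch runs without network access.
def fetch_sales() -> list[tuple[date, Decimal]]:
    return [
        (date(2025, 6, 3), Decimal("120.00")),
        (date(2025, 6, 17), Decimal("80.50")),
        (date(2025, 7, 1), Decimal("42.25")),
    ]


def monthly_totals(sales: list[tuple[date, Decimal]]) -> dict[str, Decimal]:
    """Sum sale amounts per calendar month, keyed as 'YYYY-MM'."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for day, amount in sales:
        totals[day.strftime("%Y-%m")] += amount
    return dict(totals)


def monthly_report() -> dict[str, Decimal]:
    """Composition: one exact step applied to the output of another."""
    return monthly_totals(fetch_sales())


print(monthly_report())
```

Because each piece returns the same output for the same input, `monthly_report` inherits that property without any extra design work.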
659+
660+ That is the power of software encoding exact procedures.
661+ However, inexact procedures are different.
662+
663+ Imagine you have a written procedure to handle customer support emails.
664+ And you have a written procedure to handle billing questions.
607665
608666<div class =" boxed-text " >
609667
610- Socio-technical systems are built from a combination of both ** exact** and ** inexact**
611- processes that are qualitatively different and not interchangeable.
668+ Exact procedures can be composed easily: combine them and the result is an exact
669+ procedure.
670+
671+ Inexact procedures can only be composed with care: combine several naively and the
672+ result is likely to be useless.
612673
613674</div >
614675
676+ I’m not saying you can never compose inexact procedures.
677+ In fact, you often need to compose them a lot.
678+
679+ It’s just that because each one has its own corner cases and failures, you can’t
680+ unthinkingly combine them and expect the resulting combination to work!
681+ This makes the design of systems of inexact processes qualitatively different from pure
682+ software engineering.
683+
615684### Is English the New Programming Language?
616685
617- Engineers are now writing more and software just by using English and natural-language
686+ Engineers are now writing more software just by using English and natural-language
618687specifications and docs.
619688Developers of Claude Code now say most (perhaps even 90%) of Claude Code’s own codebase
620689is now written in Claude Code.
@@ -626,22 +695,74 @@ compilers. (On Twitter, and you’ll see similar statements almost every day.)
626695I hope by now you’d agree with me that this isn’t the most insightful way to frame what
627696is happening.
628697
629- English in the form we normally use it is far too ambiguous and imprecise to express the
630- behavior of software with full precision.
631- We can use English in * many* ways with LLMs, from writing poetry to brainstorming
632- software architecture.
633- Some of these forms directly support engineering.
634- Others do not.
698+ Specifications in English and code in Python may have similar purpose but are not
699+ interchangeable. English is too ambiguous and imprecise for exact procedures.
700+
701+ This is a fundamental issue with both natural language and the development of LLMs.
702+ Even if an LLM correctly interprets an instruction 99.9% of the time today, it’s a
703+ significant challenge to have a process that ensures the updated LLM you will be using
704+ next month will do the same thing 99.9% of the time.
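
A rough calculation suggests how much testing that implies. The sketch below assumes test cases are independent and representative of real usage, which is optimistic, and asks how many consecutive passes are needed before a given success rate is supported at a chosen confidence level. The function name is illustrative.

```python
import math


def trials_needed(target_success: float, confidence: float = 0.95) -> int:
    """Smallest n such that n passes in a row support target_success.

    If the true success rate were at or below target_success, then n
    consecutive passes would occur with probability at most
    target_success ** n. We pick n so that this is at most 1 - confidence.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(target_success))


for target in (0.9, 0.99, 0.999):
    print(target, trials_needed(target))
```

For a 99.9% target this comes to roughly 3,000 cases with zero failures, and the whole suite has to be rerun each time the underlying model changes.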
705+
706+ <div class =" boxed-text " >
707+
708+ English is ideal for documenting inexact procedures.
709+
710+ Code (of some form) is ideal for documenting exact procedures.
711+
712+ </div >
635713
636- It’s more accurate to say we are seeing three related but different changes:
714+ There is an interesting nuance, however: Could we think of English as a program, but use
715+ LLMs to “compile” English to code of some form, and then review and use that code?
716+ Well, yes! That’s what we are already doing when we use LLMs to code.
717+
718+ The only difference is right now we tend to only save or version control the code.
719+ Increasingly, we may wish to version control the spec that led to the code, as well, so
720+ we can streamline the update process of “re-compiling,” reviewing, and testing the code.
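
One possible way to tie versioned specs to generated code is to stamp the code with a digest of the spec that produced it. The header convention and function names below are hypothetical, meant only to show the idea of detecting when a "re-compile" is due.

```python
import hashlib

# Hypothetical convention: the first line of generated code records the spec digest.
HEADER_PREFIX = "# generated-from-spec-sha256: "


def spec_digest(spec_text: str) -> str:
    """SHA-256 hex digest of the spec text."""
    return hashlib.sha256(spec_text.encode("utf-8")).hexdigest()


def stamp(code: str, spec_text: str) -> str:
    """Prepend a header recording which spec version produced this code."""
    return f"{HEADER_PREFIX}{spec_digest(spec_text)}\n{code}"


def is_stale(stamped_code: str, spec_text: str) -> bool:
    """True when the spec has changed since the code was generated."""
    first_line = stamped_code.split("\n", 1)[0]
    return first_line != HEADER_PREFIX + spec_digest(spec_text)


spec_v1 = "Report total sales per calendar month."
code = stamp("def report(): ...\n", spec_v1)
print(is_stale(code, spec_v1))                        # spec unchanged
print(is_stale(code, spec_v1 + " Exclude refunds."))  # spec edited
```

A check like this could run in CI to flag code whose spec has moved on, leaving the regeneration and review steps to people.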
721+
722+ ### Automation of Inexact Procedures
723+
724+ Traditionally, we have used code for exact procedures.
725+ And we’ve used people and documents for inexact procedures.
726+
727+ <div class =" boxed-text " >
728+
729+ * LLMs are as good as (or better than) humans for inexact procedures.*
730+
731+ </div >
732+
733+ <div class =" boxed-text " >
734+
735+ * Inexact procedures will increasingly become automated with natural language
736+ specifications shared by both humans and LLM tools.*
737+
738+ </div >
739+
740+ A lot of confusion arises from conflating the * automation* of LLMs with the * exactness*
741+ of code. I hope I’ve now convinced you of this:
742+
743+ <div class =" boxed-text " >
744+
745+ * LLMs can automate inexact procedures but automation does not make an inexact procedure
746+ an exact procedure.*
747+
748+ </div >
749+
750+ In other words, LLMs offer automation, but they don’t eliminate the prime importance of
751+ code to define software behavior.
752+
753+ ### Software Engineering with LLMs
754+
755+ Now let’s think about software engineering again.
756+ If we accept that human engineers still need to work with code, what is changing?
757+ I think of three key areas:
637758
6387591 . ** Faster coding with LLM tools:** Engineers can code more quickly by using English
639760 with LLMs and LLM-powered agents to read, write, and test code more quickly than ever
640- before (e.g., vibe coding prototypes)
761+ before (e.g., “ vibe coding” prototypes)
641762
6427632 . ** Broader capabilities of human workers:** Because a single person can build and test
643764 software more quickly, they can rapidly make experiments; in fact, they can
644- effectively broaden their roles (e.g., a desinger coding in HTML/CSS instead of
765+ effectively broaden their roles (e.g., a designer coding in HTML/CSS instead of
645766 waiting for a frontend engineer to convert Figma to code)
646767
6477683 . ** Software processes described in natural language:** We are seeing the emergence of
@@ -653,16 +774,6 @@ one where clear and precise English is essential to working efficiently with our
653774and teammates. And we’ll work with this * and* traditional programming language code with
654775formal semantics.
655776
656- <div class =" boxed-text " >
657-
658- *** Inexact processes** previously mostly handled by humans will increasingly become
659- automated with the use natural language specifications shared by both humans and LLMs
660- tools.*
661-
662- *** Exact processes** will still have code as the primary description of their behavior.*
663-
664- </div >
665-
666777LLMs are pushing us toward precise and structured ways of expressing software that are
667778more human friendly.
668779It’s quite likely we’ll see emergence of languages at new points between English and
@@ -924,7 +1035,7 @@ paths in the grass, then pave them.
9241035### Principles for Compositional Tools
9251036
9261037In software, big ideas often must be realized from practical, concrete pieces.
927- So I think it makes sense to work “bottom up”.
1038+ So I think it makes sense to work “bottom up.”
9281039
9291040So what are some areas where we could begin to build better primitive operations?
9301041