I'll stick with the injection version of the DC (level 1) Ebers-Moll model. It was the first description made and it is more intuitive. So I imagine it may provide a better shot at it. (I will intentionally avoid the microelectronic view.)
injection model
Borrowing the image from this answer, here's the injection version drawn slightly differently:

The two diodes do not, by themselves, completely describe the bipolar transistor. The ultra-thin base region leads to some coupling between the junctions. (As you know, you can't make a bipolar transistor from two discrete diodes.)
Both diodes follow the Shockley diode equation. But they have different saturation current parameters (the dopant concentrations are lower in the collector and higher in the emitter.)
The coupling due to the ultra-thin base region is then modeled by two current-dependent current sources. Each junction diode has a different coupling coefficient: \$\alpha_{\small{F}}=\frac{\beta_{\small{F}}}{1+\beta_{\small{F}}}\$ and \$\alpha_{\small{R}}=\frac{\beta_{\small{R}}}{1+\beta_{\small{R}}}\$.
active mode
Let's start with the active mode case:

The BC junction is reverse-biased, above, and can be neglected. The resulting model is pretty simple. It's a diode and a current generator. The exact mechanism for the current generator is a matter of the microelectronics. But it is required in order to understand the active mode behavior (because of that ultra-thin base region.)
I think you already understand this one reasonably well. So let's drop it and move on.
saturation mode
Now both diodes are forward-biased per saturation mode requirements. Let's look at what happens by adding this new forward-biased BC diode, in parallel with the earlier forward-biased BE diode:

The full Shockley equation diode current, \$I_{\small{R}}\$, for the BC junction happens in the same way and for the same reasons as did \$I_{\small{F}}\$, before.
But the difference now is that \$I_{_\text{C}}\$ is reduced to \$I_{_\text{C}}=\alpha_{\small{F}}\:I_{\small{F}}-\underline{I_{\small{R}}}\$ and \$I_{_\text{E}}\$ is reduced to \$\left| I_{_\text{E}}\right|=I_{\small{F}}-\underline{\alpha_{\small{R}}\:I_{\small{R}}}\$, but \$I_{_\text{B}}\$ is increased to \$I_{_\text{B}}=\left(1-\alpha_{\small{F}}\right)I_{\small{F}}+\underline{\left(1-\alpha_{\small{R}}\right)I_{\small{R}}}\$.
That's the change that occurs.
summary
Let's make some quantitative predictions. That's always the best way to sort crap from useful knowledge.
I will need to know what LTspice uses for \$V_T\$ during a regular run. I can do this by diode-connecting a BJT, driving \$1\:\text{mA}\$ into it, and extracting the result using some values:

So \$V_T= 25.8649606\:\text{mV}\$.
In the above diagram you can see that I used a nearly ideal model in LTspice. But it has different values for the forward and reverse \$\beta\$ (so we can test what I'm saying.)
There's one added problem:
- LTspice uses a modified version of Gummel-Poon, not Ebers-Moll. The closest Ebers-Moll model is the hybrid-\$\pi\$, which only requires one saturation current, not two as does the injection version.
If you look above, you will see that my idealized bipolar model uses IS=10fA. The equivalent two saturation currents in the injection model are \$I_{_\text{ES}}=\frac{I_{_\text{S}}}{\alpha_{\small{F}}}\$ and \$I_{_\text{CS}}=\frac{I_{_\text{S}}}{\alpha_{\small{R}}}\$. I'll be using this information, shortly.
Some terms to define for the injection model use:
- \$\alpha_{\small{F}}=\frac1{1+\frac1{\beta_{\small{F}}}}\$
- \$\alpha_{\small{R}}=\frac1{1+\frac1{\beta_{\small{R}}}}\$
- \$I_{\small{F}}=\frac{I_{_\text{S}}}{\alpha_{\small{F}}}\left[\exp\left(\frac{V_{_\text{BE}}}{V_T}\right)-1\right]\$
- \$I_{\small{R}}=\frac{I_{_\text{S}}}{\alpha_{\small{R}}}\left[\exp\left(\frac{V_{_\text{BE}}-V_{_\text{CE}}}{V_T}\right)-1\right]\$
From the injection model then also find:
- \$I_{_\text{C}}=\alpha_{\small{F}}\,I_{\small{F}}-I_{\small{R}}\approx I_{_\text{S}}\exp\left(\frac{V_{_\text{BE}}}{V_T}\right)\left[1-\frac1{\alpha_{\small{R}}}\exp\left(\frac{-V_{_\text{CE}}}{V_T}\right)\right]\$
- \$I_{_\text{B}}=\left(1-\alpha_{\small{F}}\right)I_{\small{F}}+\left(1-\alpha_{\small{R}}\right)I_{\small{R}}\approx I_{_\text{S}}\exp\left(\frac{V_{_\text{BE}}}{V_T}\right)\left[\frac1{\beta_{\small{F}}}+\frac1{\beta_{\small{R}}}\exp\left(\frac{-V_{_\text{CE}}}{V_T}\right)\right]\$
- \$I_{_\text{E}}=-I_{\small{F}}+\alpha_{\small{R}}\,I_{\small{R}}\approx I_{_\text{S}}\exp\left(\frac{V_{_\text{BE}}}{V_T}\right)\left[\exp\left(\frac{-V_{_\text{CE}}}{V_T}\right)-\frac1{\alpha_{\small{F}}}\right]\$
In SymPy/Python:
VT = 25.8649606e-3 # measured VT
ISAT = 10e-15 # model IS
def A(b): return 1/(1+1/b) # alpha factor
def IC(vbe,vce,vt,isat,br): return isat*exp(vbe/vt)*(1-exp(-vce/vt)/A(br))
def IE(vbe,vce,vt,isat,bf): return isat*exp(vbe/vt)*(exp(-vce/vt)-1/A(bf))
def IB(vbe,vce,vt,isat,bf,br): return isat*exp(vbe/vt)*(1/bf+exp(-vce/vt)/br)
I'm going to use \$V_{_\text{CE}}=100\:\text{mV}\$ and \$V_{_\text{BE}}=650\:\text{mV}\$ as a first test of the above injection model. I believe making \$V_{_\text{CB}}=550\:\text{mV}\$ will be forward-biased enough to matter enough when run also in LTspice, as validation of the above. I'll also use \$\beta_{\small{F}}=200\$ and \$\beta_{\small{R}}=20\$:
sci( IC(.65,.1,VT,ISAT,20), 4 )
'802.4051E-06'
The prediction is \$I_{_\text{C}}\approx 802.4051\:\mu\text{A}\$.
Now let's do an LTspice run:

Now, that is very close!!!
Let's try an extreme test by setting \$V_{_\text{CE}}=0\:\text{V}\$. In this case, the prediction is that the collector current will be negative. This is because the transistion from positive collector current to negative collector current (due to the differences in the two current generators) is at \$V_{_\text{CE}}=V_T\,\ln\left(1+\frac1{\beta_{\small{R}}}\right)=0.00126195567390989\approx 1.262\:\text{mV}\$. Below that value, expect a negative collector current:
sci( IC(.65,0,VT,ISAT,20), 4 )
'-41.0221E-06'
The prediction is in fact negative. It's \$I_{_\text{C}}\approx -41.0221\:\mu\text{A}\$.
Let's see what LTspice says:

Again, this is very close and the sign is right, as well.
Lastly, just to verify that the collector current reaches zero at \$V_{_\text{CE}}=1.26195567390989\:\text{mV}\$:
sci( IC(.65,0.00126195567390989,VT,ISAT,20), 4 )
'-182.1746E-21'
Which is as close to zero as one may wish to see.
LTspice calculates Ic(Q1) = -4.09828e-011.
So this is also a match.
There's another parlor trick, of sorts. Saturation can only be handled by the full 4-quadrant model, unlike active mode which only requires half of it.
Suppose \$V_{_\text{CC}}=3.3\:\text{V}\$ for an MCU and you need to ensure at or below \$V_{_\text{CE}}=100\:\text{mV}\$ with a collector load of \$160\:\text{mA}\$ that will be activated by an I/O pin and an NPN transistor. You can solve for the required \$V_{_\text{BE}}\$:
def VBE(ic,vce,vt,isat,br): return vt*ln(ic/isat/(1-exp(-vce/vt)/A(br)))
float( sci( VBE(0.16,0.1,VT,ISAT,20), 4 ) )
0.7869631 # required VBE
# note: bipolars are voltage-driven, not current driven
float( sci( IB(0.7869631,0.1,VT,ISAT,200,20), 4 ) )
0.0009892429 # required base current
(This would be \$\beta\approx 162\$.)
Remember, the expectation is no more than \$V_{_\text{CE}}=100\:\text{mV}\$, or \$\ge 3.2\:\text{V}\$ across the load at \$160\:\text{mA}\$. So the collector load can be treated as a \$\frac{3.2\:\text{V}}{160\:\text{mA}}=20\:\Omega\$ resistance.
Assume that the I/O pin will yield exactly \$3.3\:\text{V}\$ at \$0\:\text{mA}\$ and has a FET output resistance, when HI, of \$\approx 40\:\Omega\$.
Then it follows that the base resistor should be \$\frac{3.3\:\text{V}-786.9631\:\text{mV}}{989.2429\:\mu\text{A}}-40\:\Omega\approx 2500.364\:\Omega\$. Round that to \$2.5\:\text{k}\Omega\$ and the result should be very close.
I'll run both the exact and approximate values:

Again, quantitatively predictable.
Take note here. Normally, using a bipolar transistor as a switch (in deep saturation) isn't treated as predictable. Base current is usually just taken to be about \$\frac1{10}\$th of the collector current with a \$\beta\approx 10\$ assumption and the analysis stops there. It's easy.
But I want to point out that the full Ebers-Moll model is a powerful model and it just works right, even in saturation. It is only the full Ebers-Moll model that can handle saturation and active mode behavior (and reverse-active and cutoff modes, too), quantitatively. (It also, of course, clearly and precisely demonstrates that the bipolar transistor is a voltage-driven device. Not current-driven.)
It was and still remains a remarkable achievement.
What I've demonstrated is just how useful it is to consider the bipolar transistor as two customized diode models, laying one of them aside the other. And it works in all quadrants, as well.
The injection model was the first model to be disclosed in the literature because, in part, it's the easier model to use in our heads. It's just two simpler diode models (with current generators) laid side-by-side. So if you can put one of them into your head, it's easy enough to then just add the second one. And that is enough to work in all quadrants.
It's problems were:
- two model parameter saturation currents were required, one for each diode section, so simulation required two parameters where one would otherwise suffice (which would ease computation a bit)
- two current generators were required where one would otherwise suffice (which is easier to use in producing a small-signal model for analysis)
The first problem was solved by the fully equivalent transport model. The second problem was solved by the fully equivalent hybrid-\$\pi\$ model.
But the injection model better helps to see how it is that the forward BC diode current makes its way back. Please look it over a little more. Think over the ideas here.
They are all saying the same thing, just in different ways. But the original injection version is perhaps the easier way to see how saturation works.
I've mentioned these current generators due to the ultra-thin base region causing coupling in the bipolar transistor. I've demonstrated that these current generators properly predict the bipolar transistor behavior (at least, at DC and low frequency.)
You can either accept that idea and run with it or else you can demand to dig beyond the model and reach into the microelectroncs (physics) that is taking place inside the bipolar transistor and which can be summarized as "current generators" in Ebers-Moll's models, so well.
If you need that level of knowledge you can pick a textbook, such as any of the editions of:
- "Microelectronics" by Jacob Millman
Or you can read publications that came out early on -- on or before the mid-1960's -- such as the quite excellent volume used primarily for 3rd and 4th year undergraduate studies:
- "Physical Electronics and Circuit Models of Transistors" by Paul E. Gray (MIT), David DeWitt (IBM), A. R. Boothroyd (Queen's Uni of Belfast), James F. Gibbons (Stanford), 1964, vol. 2 of
Semiconductor Electronics Education Committee (SEEC)
Obviously, there are the original papers to also read:
- "Physical Principles Involved in Transistor Action" by J. Bardeen and W. H. Brattain, April 1949
- "The Theory of p-n Junctions in Semiconductors and p-n
Junction Transistors" by W. Shockley, 1949
- "Electrons and Holes in Semiconductors: With Applications to Transistor Electronics" by William Shockley, 1950
- "p-n Junction Transistors" by W. Shockley and G. K. Teal, 1951
- "Some Circuit Properties and Applications of n-p-n Transistors" by R. L. Wallace, Jr. and W. J. Pietenpol, 1951
- "Effects of Space-Charge Layer Widening in Junction Transistors" by J. M. Early, 1952
- "Large-Signal Behavior of Junction Transistors" by J. J. Ebers and J. L. Moll, 1954
- "The Dependence of Transistor Parameters on the Distribution of Base Layer Resistivity" by J. L. Moll, 1956
- "An Integral Charge Control Model of Bipolar Transistors" by H. K. Gummel and H. C. Poon, 1970
- "Direct Verification of the Ebers-Moll Reciprocity Condition" by B. L. Hart, 1971
I have copies (and have read them with a small bit of understanding) of all of the above papers and volumes, indicated above. I think the physics is best summarized in the Ebers-Moll model (or the Gummel-Poon model) and you can save yourself some reading by just accepting the ideas as granted.
(The MEXTRAM model is the most modern model. But I've not studied it in any detail. Nor have I read any of the original science publications related to it. So I don't have much to add about it.)