Numerical Analysis and Innovative Simulation
Techniques for Designing Advanced MRAM


6.8 Preconditioner Evaluation

As established in Section 5.4.1, the inclusion of spin-transfer torque requires a preconditioner that captures the dominant Jacobian contributions. For the BDF method, all terms, including spin-transfer torque, are treated implicitly. Consequently, the preconditioner must incorporate the exchange-damping contribution and the linearized spin-torque terms to remain effective during switching transients.

BDF Preconditioner: Following the mass-lumped formulation of Section 5.4.1, the BDF preconditioner takes the form:

\begin{equation} \mathbf {P}_{\text {BDF}} = \mathbf {I} + \gamma \alpha C_{\text {ex}} \mathbf {M}_L^{-1} \mathbf {K}_{\text {diff}} - \gamma \mathbf {M}_L^{-1} \mathbf {J}_{\text {STT}}, \label {eq::STT_Prec_BDF} \end{equation}

where \(\gamma = \Delta t \beta _0\) denotes the BDF scaling factor, \(\alpha \) the Gilbert damping constant, \(C_{\text {ex}} = 2\gamma _L A_{\text {ex}}/(\mu _0 M_s)\) the exchange coefficient, \(\mathbf {M}_L\) the lumped mass matrix, \(\mathbf {K}_{\text {diff}}\) the symmetric exchange stiffness matrix arising from damping, and \(\mathbf {J}_{\text {STT}} = \mathbf {J}_{\text {DL}} + \mathbf {J}_{\text {FL}}\) the spin-torque Jacobian, which comprises the damping-like and field-like contributions defined in Equations (5.51) and (5.52).

While the damping-like contribution \(\mathbf {J}_{\text {DL}}\) is symmetric, the field-like term \(\mathbf {J}_{\text {FL}} = c_{\text {FL}}[\mathbf {S}]_\times \) is skew-symmetric. This renders the preconditioner non-symmetric and rules out PCG. Furthermore, classical AMG components (in particular smoothers) are primarily designed for SPD or nearly SPD operators and can lose effectiveness for strongly non-symmetric matrices with large skew-symmetric parts [170].

Incomplete LU factorization handles such systems naturally, because it directly computes approximate \(\mathbf {L}\) and \(\mathbf {U}\) factors without relying on smoothing properties. The Euclid parallel ILU implementation from HYPRE [171] is employed, and the Newton linear system is solved using GMRES preconditioned by ILU.

The spin accumulation \(\mathbf {S}\) entering the STT Jacobian is held constant (frozen) between preconditioner rebuilds. During STT-driven switching, both \(\mathbf {m}\) and \(\mathbf {S}\) evolve rapidly, so the frozen-\(\mathbf {S}\) approximation becomes less accurate and the number of GMRES iterations increases. To limit this effect, the preconditioner is rebuilt periodically rather than only on demand. The rebuild interval is controlled via CVodeSetLSetupFrequency, set to 3 in the implementation, which balances the ILU setup cost against the overhead of Krylov iterations with a stale preconditioner.
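The loss of symmetry can be checked on a single-node toy block. The following is a minimal sketch only: the values of `S`, `c_fl`, and `gamma`, the stand-in symmetric block `P_sym`, and all helper names are assumptions chosen for illustration, not quantities taken from the simulation.

```python
def cross_matrix(s):
    """Return [s]_x, the 3x3 matrix that maps v to the cross product s x v."""
    sx, sy, sz = s
    return [[0.0, -sz, sy],
            [sz, 0.0, -sx],
            [-sy, sx, 0.0]]

def is_symmetric(a, tol=1e-14):
    return all(abs(a[i][j] - a[j][i]) <= tol for i in range(3) for j in range(3))

def is_skew(a, tol=1e-14):
    return all(abs(a[i][j] + a[j][i]) <= tol for i in range(3) for j in range(3))

S = (0.3, -0.2, 0.9)   # hypothetical spin accumulation at one node
c_fl = 0.5             # hypothetical field-like coefficient
gamma = 0.1            # hypothetical BDF scaling factor (dt * beta_0)

# Field-like Jacobian block J_FL = c_FL [S]_x
J_fl = [[c_fl * entry for entry in row] for row in cross_matrix(S)]

# Hypothetical stand-in for the symmetric part of the preconditioner on one node
P_sym = [[2.0, 0.1, 0.0],
         [0.1, 2.0, 0.1],
         [0.0, 0.1, 2.0]]

# Subtracting gamma * J_FL, as in the BDF preconditioner, breaks the symmetry
P = [[P_sym[i][j] - gamma * J_fl[i][j] for j in range(3)] for i in range(3)]

print(is_skew(J_fl), is_symmetric(P_sym), is_symmetric(P))
```

Any nonzero \(\mathbf {S}\) gives the same qualitative result, which is why a symmetric Krylov method such as PCG is not applicable once the field-like term is included.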

Note that the exchange-precession contribution, \(\mathbf {K}_{\text {skew}}\), is omitted from the preconditioner, although it appears in the full Jacobian. Numerically, adding \(\mathbf {K}_{\text {skew}}\) in the mass-lumped form degraded Krylov convergence (cf. Section 5.4.1). A likely reason is that the lumped scaling \(\mathbf {M}_L^{-1}\) destroys the exact skew-symmetry of the cross-product operator under the relevant inner product, producing a more difficult non-symmetric preconditioned system.
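The mechanism suggested above can be illustrated with a two-node toy example. This is a sketch under stated assumptions: the \(2\times 2\) matrix `K_skew` and the lumped-mass values are hypothetical and only mimic a skew-symmetric coupling between two nodes with equal or unequal nodal masses.

```python
def scale_rows(diag, a):
    """Return diag(diag) @ a for a dense matrix stored as nested lists."""
    return [[d * entry for entry in row] for d, row in zip(diag, a)]

def skew_defect(a):
    """Largest |a_ij + a_ji|; zero exactly when a is skew-symmetric."""
    n = len(a)
    return max(abs(a[i][j] + a[j][i]) for i in range(n) for j in range(n))

# Hypothetical skew-symmetric coupling between two nodes
K_skew = [[0.0, 1.0],
          [-1.0, 0.0]]

# Inverse lumped masses: equal nodal masses versus unequal nodal masses
uniform = scale_rows([0.5, 0.5], K_skew)
graded = scale_rows([1.0, 0.5], K_skew)

print(skew_defect(K_skew), skew_defect(uniform), skew_defect(graded))
```

With equal nodal masses the scaled operator stays skew-symmetric, whereas unequal masses leave a nonzero defect in the Euclidean sense, consistent with the explanation offered in the text.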

IMEX Preconditioner: In contrast, the IMEX scheme assigns only the stiff exchange damping to the implicit partition \(f_I\), while all other contributions, including spin-transfer torque, are evaluated explicitly in \(f_E\). The implicit Jacobian (5.57) involves only the symmetric exchange operator:

\begin{equation} \mathbf {P}_{\text {IMEX}} = \mathbf {I} + \gamma \alpha C_{\text {ex}} \mathbf {M}_L^{-1} \mathbf {K}_{\text {diff}}. \label {eq::STT_Prec_IMEX} \end{equation}

This matrix is SPD, enabling efficient solution via PCG with AMG preconditioning. Since \(\mathbf {J}_{\text {STT}}\) does not appear in the implicit partition, it is correctly omitted from the preconditioner, maintaining the SPD structure regardless of spin-torque magnitude.
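To illustrate why the SPD structure matters, the sketch below solves a system of the form \((\mathbf {I} + c\,\mathbf {K})\mathbf {x} = \mathbf {b}\) with a preconditioned conjugate gradient iteration. Several simplifications are assumed: \(\mathbf {K}\) is the 1D Laplacian stencil on a uniform mesh (so the lumped mass scaling reduces to a constant absorbed into the hypothetical coefficient `c`), and a Jacobi preconditioner stands in for the AMG preconditioner used in the actual implementation.

```python
def apply_P(x, c):
    """Return (I + c*K) x for the 1D stencil K = tridiag(-1, 2, -1)."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(x[i] + c * (2.0 * x[i] - left - right))
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(b, c, tol=1e-10, max_iter=200):
    """Jacobi-preconditioned conjugate gradients for (I + c*K) x = b."""
    diag_inv = 1.0 / (1.0 + 2.0 * c)   # inverse of the constant diagonal
    x = [0.0] * len(b)
    r = list(b)                         # residual for the zero initial guess
    z = [diag_inv * ri for ri in r]
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Pp = apply_P(p, c)
        step = rz / dot(p, Pp)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * qi for ri, qi in zip(r, Pp)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = [diag_inv * ri for ri in r]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

c = 5.0            # hypothetical value standing in for gamma*alpha*C_ex / h^2
b = [1.0] * 50
x, iterations = pcg(b, c)
residual = [bi - yi for bi, yi in zip(b, apply_P(x, c))]
print(iterations, max(abs(ri) for ri in residual))
```

The conjugate gradient recurrence relies on the operator being symmetric positive definite, so the same routine would not be a valid solver for the non-symmetric BDF preconditioner.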

The performance is evaluated in the same manner as for Standard Problem 4. For the switching simulation, a simplified MTJ structure with a 1 \(\si {\nano \meter }\) thick RL, a 1.7 \(\si {\nano \meter }\) thick FL, and a 1 \(\si {\nano \meter }\) thick TB is employed. The magnetization of the RL points in the positive \(x\)-direction, while the FL is initialized in the negative \(x\)-direction, with a small tilt to break symmetry. A bias voltage of \(-3\) \(\si {\volt }\) is applied to the left contact to induce switching. The structure is scaled down to a diameter of 10 \(\si {\nano \meter }\) to reduce the computational cost of the tests.

This configuration is not physically realistic: the applied bias voltage exceeds typical operating conditions for such a thin tunnel barrier and would likely cause dielectric breakdown in practice. The setup was chosen intentionally to create an STT-dominated regime, which allows a detailed evaluation of how the different time integrators and preconditioner strategies handle the stiff, non-symmetric dynamics arising from strong spin-transfer torque.

Figure 6.7 presents the spatially-averaged magnetization dynamics computed with the four time integration methods. The reference solution is obtained using the TPS method with a fixed time step of \(\Delta t = 0.01\,\si {\pico \second }\). To enable a meaningful comparison at similar temporal resolution, an additional TPS run with \(\Delta t = 0.1\,\si {\pico \second }\) (\(10^{-13}\,\si {\second }\)) is performed, matching the order of magnitude of the adaptive time steps achieved by the BDF solver. All implicit methods (TPS, BDF, IMEX) capture the qualitative switching behavior, with the magnetization reversing from the initial anti-parallel to the parallel configuration under the applied STT. Visible phase differences appear between the adaptive solvers and the reference, particularly in \(m_y\) around \(t \approx 0.1\,\si {\nano \second }\). These deviations reflect the inherent trade-off between accuracy and efficiency: both solvers satisfy the prescribed tolerances while taking larger time steps than the reference. The IMEX scheme exhibits about 1.7 times the \(e_\infty \) trajectory error of BDF (0.398 vs. 0.240, cf. Table 6.2), attributable to the splitting error from explicit STT treatment and the accumulation over \(\sim \)22000 steps compared to only \(\sim \)488 for BDF. Despite these deviations, both methods capture the correct switching dynamics and converge to the same equilibrium state. The coarser TPS run (\(\Delta t = 0.1\,\si {\pico \second }\)) achieves a \(9.7\times \) speedup but exhibits the largest trajectory error (\(e_\infty = 0.733\)), with noticeable phase drift visible in all three components. This highlights a key limitation of fixed-step methods without error control: unconditional stability allows arbitrarily large time steps, but accuracy degradation goes undetected. RK4 shows a comparable phase deviation; however, as discussed below, this apparent agreement is misleading because the method becomes numerically unstable during switching.

Figure 6.8 provides a quantitative analysis of solver accuracy and efficiency. Panel (a) shows the deviation from the unit-sphere constraint \(|\mathbf {m}|=1\) before renormalization. For explicit single-step methods, theoretical results establish that a scheme of order \(p\) preserves the magnetization magnitude within \(\mathcal {O}(\Delta t^{p+1})\) error per step, exploiting the cross-product structure of the LLG equation, which guarantees \(\partial _t \mathbf {m} \perp \mathbf {m}\). The TPS method, being first-order in its tangent-plane update, produces \(\mathcal {O}(\Delta t^2)\) drift per step, accumulating to deviations below \(10^{-7}\) that further decrease as the simulation progresses toward equilibrium.
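The quadratic per-step drift of a first-order update follows from geometry alone: if \(\mathbf {v} \perp \mathbf {m}\) and \(|\mathbf {m}| = 1\), then \(|\mathbf {m} + \Delta t\,\mathbf {v}|^2 = 1 + \Delta t^2 |\mathbf {v}|^2\). The sketch below checks this numerically for a single explicit update with a pure precession term. It is not the TPS scheme itself; the field direction, step sizes, and function names are assumptions made for illustration.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(c * c for c in a))

def first_order_drift(dt):
    """Return |m + dt*v| - 1 for one first-order update with v perpendicular to m."""
    m = (1.0, 0.0, 0.0)
    h = (0.0, 0.0, 1.0)                  # hypothetical unit effective field
    v = tuple(-c for c in cross(m, h))   # precession term -m x h, perpendicular to m
    m_new = tuple(mi + dt * vi for mi, vi in zip(m, v))
    return norm(m_new) - 1.0

d_coarse = first_order_drift(1e-2)
d_fine = first_order_drift(5e-3)

# Halving the step should reduce the drift by a factor close to 4
print(d_coarse, d_fine, d_coarse / d_fine)
```

A ratio close to 4 under step halving is the signature of \(\mathcal {O}(\Delta t^2)\) drift per step.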

In contrast, the explicit RK4 method, despite its fourth-order accuracy (and the associated \(\mathcal {O}(\Delta t^5)\) local constraint drift expected for smooth problems), becomes unstable in the switching regime, with \(\||\mathbf {m}|-1\|_\infty \) growing to \(\mathcal {O}(1)\). The rapid reorientation driven by strong exchange and STT yields a stiff discretized LLG operator whose eigenvalues leave the stability region of explicit Runge–Kutta schemes at the time steps considered, leading to numerical blow-up. Notably, the RK4 trajectory in Figure 6.7 appears to follow the qualitative switching behavior despite this failure. This misleading agreement arises because the instability primarily corrupts the magnetization magnitude while its direction still roughly follows the torque-driven dynamics. The constraint plot in Figure 6.8 (a) reveals the true extent of the breakdown, underscoring that trajectory agreement alone is insufficient to validate numerical accuracy: the unit-sphere constraint must be monitored explicitly. Consequently, RK4 is not suitable for STT-driven switching unless prohibitively small time steps are used.
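The stability argument can be made concrete with the scalar test equation \(y' = \lambda y\), for which one classical RK4 step multiplies the solution by \(R(z) = 1 + z + z^2/2 + z^3/6 + z^4/24\) with \(z = \lambda \Delta t\). The sketch below evaluates \(|R(z)|\) for a few real negative values of \(z\); the sample values are chosen for illustration and are not eigenvalues of the discretized LLG operator.

```python
def rk4_amplification(z):
    """Stability function R(z) of classical RK4 for y' = lambda*y, with z = lambda*dt."""
    return 1 + z + z**2 / 2 + z**3 / 6 + z**4 / 24

# Real negative z mimics a stiff, strongly damped mode
for z in (-1.0, -2.5, -3.0, -10.0):
    r = abs(rk4_amplification(z))
    label = "stable" if r <= 1 else "unstable"
    print(f"z = {z:6.1f}  |R(z)| = {r:10.4f}  {label}")
```

On the negative real axis the method is stable only for \(z\) down to roughly \(-2.79\), so a stiff mode with large \(|\lambda |\) forces \(\Delta t\) to shrink in proportion to \(1/|\lambda |\).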

The implicit multistep methods (BDF and IMEX) exhibit fundamentally different drift characteristics, maintaining deviations at around \(10^{-1}\). The \(\mathcal {O}(\Delta t^{p+1})\) bound of single-step methods does not apply here: the multistep formulation combines solution values from previous steps that each carry minor constraint violations, and the Newton iteration required for the implicit solve explores intermediate states that may lie far from the unit sphere. Consequently, these methods exhibit larger oscillating deviations that correlate with the adaptive step size, as visible in Figure 6.8 (c). Nevertheless, periodic renormalization maintains the constraint within acceptable bounds throughout the simulation.

Panel (b) reports the deviation from the reference trajectory using the maximum-in-time error metric \(e_\infty \) defined in (5.59). The BDF method maintains a deviation below \(10^{-1}\) throughout the simulation, whereas the IMEX scheme accumulates about 1.7 times the trajectory error (0.398 vs. 0.240, cf. Table 6.2). This discrepancy stems from two sources: (i) the splitting error inherent to IMEX methods, where the explicit treatment of STT introduces an additional \(\mathcal {O}(\Delta t)\) error term, and (ii) the vastly different step counts: the BDF solver requires only \(\sim \)488 steps compared to \(\sim \)22000 for IMEX, leading to different error accumulation characteristics despite identical local tolerances.

Table 6.2: Quantitative performance comparison for STT-driven switching. The trajectory error is measured by \(e_\infty \) relative to the TPS reference (baseline row). Speedup is computed as \(\text {Speedup} = \text {CPU}_{\text {TPS,ref}}/\text {CPU}_{\text {method}}\).

| Solver | Configuration | \(e_\infty \) | Steps | CPU [\(\si {\second }\)] | Speedup |
|---|---|---|---|---|---|
| TPS\(_{\text {ref}}\) | \(\Delta t = 10^{-14}\,\si {\second }\) | n/a | 40000 | 42705 | \(1.00\times \) |
| TPS | \(\Delta t = 10^{-13}\,\si {\second }\) | 0.733 | 4000 | 4393 | \(9.72\times \) |
| RK4 | \(\Delta t = 10^{-14}\,\si {\second }\) | 0.654 (unstable) | 40000 | 62623 | \(0.68\times \) |
| BDF | rtol=\(10^{-2}\), atol=\(10^{-4}\) | 0.240 | 488 | 12870 | \(\mathbf {3.32\times }\) |
| IMEX | rtol=\(10^{-2}\), atol=\(10^{-4}\) | 0.398 | 22088 | 65345 | \(0.65\times \) |

Panel (c) reveals the key efficiency difference between the adaptive solvers. The BDF method, equipped with the extended preconditioner (5.53) that incorporates the STT Jacobian, achieves time steps of \(\mathcal {O}(1\,\si {\pico \second })\), two orders of magnitude larger than the IMEX scheme. The latter remains constrained to \(\mathcal {O}(0.01\,\si {\pico \second })\) due to the explicit treatment of both precession and STT, which imposes accuracy limitations from the rapidly oscillating terms in \(f_E\). This confirms that for STT-dominated dynamics, the efficiency gains from implicit exchange treatment alone are insufficient, and the spin-torque contributions must also be treated implicitly to unlock larger time steps.

Table 6.2 summarizes the performance metrics. The BDF method achieves an \(82\times \) reduction in time steps compared to the fixed-step TPS reference while maintaining acceptable trajectory accuracy. Despite requiring more RHS evaluations per step (due to Newton iterations and the coupled drift-diffusion solve), the overall computational cost is significantly reduced. The IMEX scheme provides no improvement (\(0.65\times \)) over the baseline, as the stability advantage from implicit exchange is negated by the accuracy constraints imposed by explicit STT and precession treatment. Both adaptive solvers were configured with identical tolerances (\(\text {rtol} = 10^{-2}\), \(\text {atol} = 10^{-4}\)).
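The derived quantities can be recomputed from the step counts and CPU times reported in Table 6.2. The short sketch below does so with the speedup definition from the table caption; the dictionary layout and variable names are illustrative choices.

```python
# Step counts and CPU times (in seconds) as reported in Table 6.2
runs = {
    "TPS_ref": {"steps": 40000, "cpu_s": 42705},
    "TPS":     {"steps": 4000,  "cpu_s": 4393},
    "RK4":     {"steps": 40000, "cpu_s": 62623},
    "BDF":     {"steps": 488,   "cpu_s": 12870},
    "IMEX":    {"steps": 22088, "cpu_s": 65345},
}

reference = runs["TPS_ref"]

# Speedup = CPU time of the TPS reference divided by CPU time of the method
speedup = {name: reference["cpu_s"] / run["cpu_s"] for name, run in runs.items()}

# Step reduction = steps of the TPS reference divided by steps of the method
step_reduction = {name: reference["steps"] / run["steps"] for name, run in runs.items()}

for name in runs:
    print(f"{name:8s} speedup = {speedup[name]:5.2f}x   "
          f"step reduction = {step_reduction[name]:6.1f}x")
```

Comparing the two columns shows that the cost per step differs strongly between methods: BDF takes far fewer steps than the reference, but each step is considerably more expensive.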

Linear Solver Performance: The effectiveness of the preconditioner is measured by the number of GMRES iterations required to solve each Newton linear system and, more critically, by the achievable time step size. With the extended preconditioner (6.37) incorporating \(\mathbf {J}_{\text {STT}}\), the average GMRES iteration count remains around 40 per Newton step, and the solver achieves time steps of \(\mathcal {O}(1\,\si {\pico \second })\). Without the STT Jacobian contribution, i.e., using only the exchange-damping preconditioner (6.38) with AMG, the solver behavior degrades catastrophically: GMRES iterations exceed 100 during the switching transient, additional Newton convergence failures occur, and the adaptive time step collapses to \(\mathcal {O}(0.01\,\si {\pico \second })\), negating any efficiency advantage of implicit integration. This demonstrates that for STT-dominated dynamics, incorporating the spin-torque Jacobian into the preconditioner is not merely an optimization, but a necessity for practical simulation.

In contrast, the IMEX scheme does not suffer from this preconditioner mismatch. Since the spin-transfer torque is assigned to the explicit partition \(f_E\), the implicit Jacobian naturally contains only the exchange-damping contribution, and the SPD preconditioner (6.38) remains appropriate. However, this comes at the cost of explicit stability constraints on the STT terms, limiting the achievable time step to \(\mathcal {O}(0.01\,\si {\pico \second })\) regardless of preconditioner quality.

These results demonstrate that for strongly STT-driven dynamics, the fully implicit BDF approach with an appropriately constructed preconditioner offers substantial efficiency gains while maintaining physical fidelity, making it the preferred choice for large-scale or long-duration simulations. However, the TPS method remains attractive for its simplicity, robust constraint preservation, and straightforward implementation; in particular, it requires no preconditioner tuning. For the device simulations presented in the following sections, the TPS method is employed: its computational overhead is acceptable for the moderate problem sizes considered, and its unconditional stability and predictable behavior simplify the analysis of STT-driven switching phenomena.
