# POLITECNICO DI TORINO Repository ISTITUZIONALE

Beyond-CMOS Artificial Neuron: A simulation-based exploration of the molecular-FET

Original

Beyond-CMOS Artificial Neuron: A simulation-based exploration of the molecular-FET / Mo, F.; Spano, C. E.; Ardesi, Y.; Piccinini, G.; Graziano, M.. - In: IEEE TRANSACTIONS ON NANOTECHNOLOGY. - ISSN 1536-125X. - ELETTRONICO. - 20:(2021), pp. 903-911. [10.1109/TNANO.2021.3133728]

Availability:

This version is available at: 11583/2947956 since: 2021-12-29T10:41:53Z

Publisher:

Institute of Electrical and Electronics Engineers Inc.

Published

DOI:10.1109/TNANO.2021.3133728

Terms of use:

This article is made available under terms and conditions as specified in the corresponding bibliographic description in the repository

Publisher copyright

IEEE postprint/Author's Accepted Manuscript

©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collecting works, for resale or lists, or reuse of any copyrighted component of this work in other works.


# Beyond-CMOS Artificial Neuron: A simulation-based exploration of the molecular-FET

Fabrizio Mo\*, *Graduate Student Member, IEEE*, Chiara Elfi Spano\*, Yuri Ardesi, *Graduate Student Member, IEEE*, Gianluca Piccinini, Mariagrazia Graziano

Abstract—The recent growth of Artificial Neural Networks has fueled the design of numerous Artificial Intelligence (AI) dedicated hardware implementations. High power dissipation, computational complexity, and large area footprints currently limit CMOS-based real-time embedded AI applications. In this work, we design and simulate through SPICE, for the first time, an artificial analog neuron based on the molecular Field-Effect Transistor (molFET) technology. MolFETs are described by a circuital model whose physical characteristics are extracted from atomistic simulations. The designed neuron is a single column of a crossbar-like circuit representing a layer of seven parallel neurons. The drain currents sum up in a soma-like circuit, modelled through a comparator, and trigger the output pulses. We demonstrate the advantages of the molFET in terms of area, power, and speed by comparing it with a conventional MOSFET implementation. The results confirm that the molecular technology is a promising candidate for accomplishing high neuron throughput capability and massive redundancy, while still providing high energy efficiency. The obtained results foster further investigation of molFET technology both at the device and circuit level.

Index Terms—Artificial Neuron, Artificial Neural Networks, Molecular Electronics, Molecular transistor, Molecular-based circuit modeling.

#### I. INTRODUCTION

Recently, Artificial Neural Networks (ANNs) have gained popularity in emergent Artificial Intelligence (AI) applications such as computer vision [1], [2], speech recognition [3], acoustical data processing [4] or, more generally, big data processing and security [5], sensors and e-noses [6], [7]. Among the possible types of ANNs, the Spiking Neural Network (SNN) model was recently exploited to emulate the human brain [8]. The massive amount of data and the large computational effort required by ANN applications promoted the development of algorithms optimized for high-performance computing [9]–[12], which are still prohibitive in terms of computational complexity and energy efficiency, strongly limiting the advancement of innovative applications. Recently, the scientific community has shown growing interest in ANN Application-Specific Integrated Circuits (ASICs) and accelerators for System-On-Chip (SoC) integration [13], [14].

\*The two Authors contributed equally to this work.

Manuscript sent July 22, 2021. Accepted November 30, 2021. (Corresponding author: Fabrizio Mo.)

Fabrizio Mo, Chiara Elfi Spano, Yuri Ardesi, and Gianluca Piccinini are with the Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Turin, Italy (e-mail: fabrizio.mo@polito.it).

Mariagrazia Graziano is with the Department of Applied Science and Technology, Politecnico di Torino, 10129 Turin, Italy.

Digital Object Identifier xx.xxxx/TNANO.2021.xxxxxx

Their original parallel processing capabilities favour high speed, whereas architecture optimization permits low area and power consumption [15]. Unfortunately, CMOS ANN-dedicated hardware is still above the so-called *power envelope* imposed by real-time embedded AI applications [16]. In this framework, viable solutions have come from proposals of artificial neurons implemented with Beyond-CMOS technologies [17]–[26]. Among them, the molecular one promises high parallelism and redundancy while keeping high energy efficiency [21].

We envision two approaches in the investigation of artificial molecular neurons: (1) Consider novel molecular devices that naturally emulate a neuron or a synaptic network. As an example, [21] reports randomly deposited molecule-interconnected metal nanoparticles forming a synaptic grid; (2) Replace CMOS transistors with molecular devices, enabling the reuse of conventional design principles and circuital architectures and permitting a direct comparison between conventional electronic implementations and molecular ones. This work is a pioneering investigation in the latter direction. Even if the technological state of the art does not allow the prototyping of complex molecular circuits, recent developments point toward massively parallel fabrication of molecular devices [27]. In this context, we investigate through simulations the molecular Field-Effect-Transistor (molFET) technology [28], [29] as a candidate for implementing artificial analog neurons. Indeed, our work is an analysis of molFET advantages at the circuit level, aiming to motivate and justify future efforts in molecular research. To perform such an analysis, we develop a simple and effective model for molFET circuit simulation, and we use it to design an artificial neuron. Then, we compare the area, power and speed of the designed circuit with the same topology implemented with MOSFETs. To the best of our knowledge, the present work is the first attempt at demonstrating the advantages of migrating from standard MOSFETs to molFETs at the circuit level. Our results show promising advantages in area and power consumption, while no significant advantages are present regarding speed. Power and speed strongly depend on the molecule employed.

#### II. METHODOLOGY

To implement a neuron, or a layer of neurons, we choose a molecular technology, which has proved promising among Beyond-CMOS technologies [30]–[32]. We consider the molecular counterpart of conventional transistors, i.e. the molFET. A single molecule is placed between two contacts, typically made of gold, acting as source (S) and drain (D) [28], [30]. Usually, the molecule is chemically bonded to the contacts through the so-called anchoring group (often a sulfur atom). The molecular channel is electrostatically coupled with a third metal gate electrode through an insulating gate dielectric material, as demonstrated in [28]. The current in the channel can be modelled using Landauer's formula [33]:

$$I = \frac{2q}{h} \int_{-\infty}^{+\infty} T(E) \left[ f_S(E) - f_D(E) \right] dE \tag{1}$$

Here, q is the electron charge, h is Planck's constant, T(E) is the transmission spectrum representing the transmittivity of the molecular channel, E is the electron energy, and $f_S$ and $f_D$ are the Fermi-Dirac distributions of the S and D contacts, respectively. Notice that T(E) includes the number of channel modes. At zero kelvin, (1) shows that only the electron states between the two contact Fermi levels participate in conduction. At room temperature, more energy levels contribute to the conduction thanks to the thermal spreading introduced by the Fermi function "tails". The energy spacing between the contact Fermi levels is named bias window (BW) and can be controlled by the application of a D-to-S voltage $V_{DS}$. The application of a gate voltage modulates the current by shifting T(E) up and down in energy [33]. Moreover, an additional back gate voltage can be used to improve the conductivity [34].
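As a rough numerical illustration of (1), the sketch below integrates Landauer's formula with the trapezoidal rule. It is not the QuantumATK/NEGF calculation used in this work: the Lorentzian `toy_transmission`, its level position and broadening, the thermal energy `KT_EV`, and the symmetric placement of the contact Fermi levels are all assumptions chosen only to make the example self-contained.

```python
import math

Q = 1.602176634e-19   # electron charge (C)
H = 6.62607015e-34    # Planck's constant (J*s)
KT_EV = 0.02585       # thermal energy near room temperature (eV), assumed


def fermi(energy_ev, mu_ev, kt_ev=KT_EV):
    """Fermi-Dirac occupation at energy_ev for a contact with Fermi level mu_ev."""
    x = (energy_ev - mu_ev) / kt_ev
    if x > 700.0:  # avoid overflow in exp() for energies far above mu
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def toy_transmission(energy_ev, level_ev=0.3, gamma_ev=0.05):
    """Hypothetical single-level Lorentzian T(E); not taken from the paper."""
    return gamma_ev ** 2 / ((energy_ev - level_ev) ** 2 + gamma_ev ** 2)


def landauer_current(v_ds, transmission=toy_transmission,
                     e_min=-3.0, e_max=3.0, steps=6000):
    """Trapezoidal estimate of Eq. (1). Energies in eV, returned current in A."""
    mu_s, mu_d = 0.5 * v_ds, -0.5 * v_ds  # assumed symmetric bias window
    de = (e_max - e_min) / steps
    total = 0.0
    for k in range(steps + 1):
        e = e_min + k * de
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * transmission(e) * (fermi(e, mu_s) - fermi(e, mu_d))
    # dE is expressed in eV, so one extra factor of Q converts it to joules.
    return 2.0 * Q * Q / H * total * de


if __name__ == "__main__":
    for v in (0.0, 0.5, 1.0):
        print(f"V_DS = {v:.1f} V -> I = {landauer_current(v):.3e} A")
```

In this toy picture a gate voltage would simply shift `level_ev`, moving the transmission peak into or out of the bias window.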

In this work, we follow a bottom-up methodology, as depicted in Fig. 1. Firstly, we calculate the I-V characteristics of the single molFET by employing Semi-Empirical (SE) QuantumATK physical calculations [35] and applying the Extended Hückel Theory (EHT) method. Given the molecular device geometry and the applied voltage values ($V_{DS}$ and $V_{GS}$), QuantumATK calculates the current through (1). In QuantumATK, the transmission function is derived through the well-established Non-Equilibrium Green's Function (NEGF) method [35]–[37]. NEGF allows for calculating the full I-V characteristics [33], [38]. The Self-Consistent Field (SCF) loop is enabled to improve the accuracy: the transport (NEGF) and the electrostatics (Poisson's equation with boundary conditions: Dirichlet in the transport direction, Neumann at the metal gate, periodic boundary conditions elsewhere) are solved self-consistently. The solution converged at $10^{-5}$ tolerance over the Hamiltonian variable (Pulay mixing). In all the examined molFETs, the molecular channel is always strongly coupled with the electrodes thanks to the covalent Au-S bond. Consequently, we consider coherent tunnelling as the primary transport mechanism. Incoherent contributions are negligible in the strong coupling regime [33], [38], [39]. Fig. 1 (a), (b), (c) show the three different molecular transistors we consider in this work. All of them have (atomistic) gold FCC (111) S and D electrodes, a zirconium dioxide (ZrO<sub>2</sub>) gate dielectric (thickness: 5.7 Å) and a metal gate assumed to be a perfect electrical conductor. The devices differ only in the molecules used as channels, namely: OligoPhenylEthylene (OPE3), ParaCycloPhane[3,3]-based (PCP), and Hexadecane DiThiol (HDT), already investigated in the literature [40], [41]. We create a new symbol in Cadence Virtuoso associated with a Look-Up-Table (LUT) and describe each molFET with VerilogA.
The LUT stores the values of the ab-initio calculated drain current for each $(V_{GS}, V_{DS})$ pair. The LUT data are interpolated within the VerilogA description using a third-order spline function (details in A1). From now on, we refer to this model as "static LUT-based model". We use this simple model, accurate for static and quasi-static analyses, to verify the designed crossbar from a functional standpoint and estimate the static power dissipated by the circuit.

Fig. 1. Representation of the bottom-up method used in our work. The arrows indicate the methodological flow. The device geometries are: (a) OPE3-based molFET, (b) HDT-based molFET, (c) PCP-based molFET. Carbon atoms in grey, hydrogen in white, gold in yellow, sulfur in green; S and D electrodes are composed of six gold layers (only three are shown). Notice also the molFET circuit symbol.

To evaluate dynamic power and transients, the static LUT-based model is improved by introducing the molFET electrostatic capacitances and the intrinsic times $\tau_S$ and $\tau_D$, i.e. the amount of time required to move electrons from/toward the molecule toward/from the S and D. We compute the electrostatic capacitances starting from the specific molFET physical and geometrical properties. The gate capacitance $C_g$ is evaluated with the parallel-plate approximation. For the S and D capacitances ($C_s$, $C_d$), instead, we exploit the simplified approach presented in [33] and the explicit formulae presented in [42]. Notice that, to estimate $C_s$ and $C_d$, it is necessary to estimate the quantum capacitance ($C_q$) [33], [42]. It measures the amount of charge that can be moved toward/from the molecule in response to an external voltage variation. It accounts for the number of available electron states within the molecular channel (i.e. its density of states), also according to the Pauli exclusion principle. The intrinsic times $\tau_S$ and $\tau_D$ are intimately linked to $C_q$ and can be estimated with the approximated formulae presented in [42]. The $I_{DS}(V_{DS})$ slope is locally approximated by means of $R_s=\frac{\tau_S}{C_q}$ and $R_d=\frac{\tau_D}{C_q}$, which have the dimensions of a resistance. Finally, we compute the gate resistance $R_g$ as the total resistance of the gate dielectric. Additional details on capacitance and resistance calculations are reported in A2. In the following, we always refer the results to worst-case capacitance and resistance values (largest values).
We consider only intrinsic capacitance contributions. Depending on the specific fabrication process, the parasitic (electrode-substrate) contributions can be negligible or even dominant. Since we do not refer to a particular fabrication process, we assume them to be negligible, as typically happens in Self-Assembled-Monolayer (SAM) based fabrication processes [43]. We embed all the mentioned device parameters along with the static LUT model to create a new Cadence Virtuoso symbol, by using the equivalent circuit topology presented in [42]. We refer to this refined model as "dynamic LUT-based model", and we use it to estimate the dynamic power dissipated by the molFETs and the transient duration.

Summarizing, we use the static LUT model to perform functional verification and static power estimation of the designed crossbar arrays, while the dynamic one enables the dynamic power estimation and the transient analysis. The LUT-based circuit modelling adopted in this work is widely used in the literature to simulate circuits based on emerging technologies lacking compact circuit models [44], [45]. Indeed, it is simple, accurate, and computationally efficient, especially if compared to SCF [46] or SPICE-like models [47], even if it loses, at runtime, any link with the physics of the devices.
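The idea of a static LUT-based device model can be sketched in a few lines. The example below is only an illustration: it uses bilinear interpolation instead of the third-order spline adopted in the VerilogA description, and the grid `VGS_AXIS`, `VDS_AXIS` and the currents in `IDS_TABLE` are made-up placeholder values, not data extracted from the atomistic simulations.

```python
from bisect import bisect_right

# Hypothetical LUT: rows follow VGS_AXIS, columns follow VDS_AXIS, currents in A.
VGS_AXIS = [0.0, 1.0, 2.0]
VDS_AXIS = [0.0, 0.5, 1.0]
IDS_TABLE = [
    [0.0, 1.0e-9, 2.0e-9],
    [0.0, 1.0e-7, 2.0e-7],
    [0.0, 1.0e-6, 2.0e-6],
]


def _bracket(axis, x):
    """Return (i, t): x lies between axis[i] and axis[i + 1] at fraction t."""
    x = min(max(x, axis[0]), axis[-1])  # clamp to the tabulated range
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def lut_ids(vgs, vds, vgs_axis=VGS_AXIS, vds_axis=VDS_AXIS, table=IDS_TABLE):
    """Bilinear interpolation of the drain current at the bias point (vgs, vds)."""
    i, t = _bracket(vgs_axis, vgs)
    j, u = _bracket(vds_axis, vds)
    return ((1 - t) * (1 - u) * table[i][j]
            + (1 - t) * u * table[i][j + 1]
            + t * (1 - u) * table[i + 1][j]
            + t * u * table[i + 1][j + 1])


if __name__ == "__main__":
    print(lut_ids(1.5, 0.75))
```

A circuit simulator evaluates such a function at every bias point, which is why the model is fast but carries no physical information beyond the tabulated samples.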

Since the molFETs are field-effect devices, the circuit design principles are similar to those used for MOSFETs. Thus, the crossbar design is performed using conventional methods and criteria by employing the mentioned circuit models. In particular, the threshold voltage of the comparators, the pull-up resistances and the gate and supply voltage values should ensure the correct behaviour of the neuron (details in III).

Finally, to highlight the advantages of the molFET implementation w.r.t. the MOSFET one, the artificial neuron is also designed with MOSFET devices. A 32 nm technological node is chosen to make the comparison more meaningful with the single-gate molFET devices considered in this work since it still corresponds to planar devices. For MOSFET technology, this work uses the Generic PDK BSIM4 (v.4.5) [48].

#### III. CIRCUITAL CHOICE

Four parts mainly constitute a neuron [49]: (1) the soma, the core of the neuron, which generates the action potential if the weighted sum X of its input signals $x_i$ exceeds a certain threshold $V_{th}$; (2) the dendrites, tree-like receptive terminations which carry the input signals $x_i$ into the soma; (3) the axon, which propagates the neuron action potential toward other neurons through (4) the synapses. Two operators are required to emulate the basic behaviour of a biological neuron: the weighted sum over all $x_i$ and a threshold mechanism. The step function can implement the simplest threshold mechanism: y(X) = 0 if $X < V_{th}$, y(X) = 1 if $X \geq V_{th}$. Fig. 2(a) shows a single neuron circuital implementation. It is a column of a crossbar array, commonly used for neuromorphic applications [50], [51]. The single neuron is constituted by: (1) seven molFETs emulating the dendrites; (2) a pull-up resistance $R_{PU}$; (3) a voltage comparator. The sum is performed by exploiting Kirchhoff's Current Law. The threshold mechanism is created through a voltage comparator. Fig. 2(b) shows the complete 7x7 crossbar array. The proposed circuit emulates a neural layer of 7 parallel analog neurons sharing the same input lines (i.e. the same dendrites). The redundancy created by seven neurons working in parallel emulates biological neuron parallelism. A square crossbar is chosen for simplicity. A neuron with seven dendrites is chosen to ease the design procedure. An odd number of inputs makes it easy to determine the threshold. Indeed, the threshold is overcome (i.e. the output of the single neuron should be activated) if at least four molFETs are turned ON. The neuron does not fire if three (or fewer) molFETs of the same column (i.e. neuron) are ON.

Fig. 2. (a) A single column of a crossbar array terminated by the comparator emulating a single neuron. The common drain line performs the sum over all the input signals coming from the dendrites, whereas the comparator emulates the threshold mechanism. (b) The complete schematic of the 7x7 crossbar array structure. On the left side, there are the inputs (generators) on the transistor gates, while on top there is the supply line along with the pull-up resistances. The drain lines are fed with a suitable $V_{DD}$ supply voltage.

TABLE I
DESIGNED COMPONENT VALUES

|                      | MOSFET | HDT   | OPE3  | PCP-LP | PCP-HP |
|----------------------|--------|-------|-------|--------|--------|
| $V_{DD}$ (V)         | 1      | 1     | 1     | 1      | 1.3    |
| $V_{GS,ON}$ (V)      | 1      | -2    | -1    | 2      | 2      |
| $V_{GS,OFF}$ (V)     | 0      | 0     | 1     | -2     | -2     |
| W (nm)               | 250    | 10    | 10    | 10     | 10     |
| $R_{PU}$ (k$\Omega$) | 1      | 100   | 100   | 100    | 100    |
| $V_{th}$ (mV)        | 585    | 928.5 | 576.6 | 718.4  | 879.6  |

Table I reports the designed gate voltage values that set the transistor in the ON ($V_{GS,ON}$) and OFF ($V_{GS,OFF}$) states, the supply voltage ($V_{DD}$), the pull-up resistance ($R_{PU}$), the comparator threshold ($V_{th}$) and the channel width (W). The mentioned values are designed to meet the following conditions: (a) If all molFETs on the same drain line are switched off, there is ideally no voltage drop across $R_{PU}$. Notice that the drain line and the comparator input (on the negative terminal) are tied to $V_{DD}$ through $R_{PU}$; in practice, small leakages of the molFETs and the comparator introduce a small voltage drop on $R_{PU}$. (b) If fewer than four molFETs are switched on by input spikes, a current flows through them. As a result, there is a significant voltage drop on $R_{PU}$ w.r.t. condition (a), and the comparator input voltage is reduced. A larger number of molFETs in the ON state increases the voltage drop on $R_{PU}$, further reducing the comparator input voltage. The reduction must not bring the comparator input voltage below the threshold, so that the output of the comparator remains stuck at ground. (c) If four or more molFETs are in the ON state, the voltage drop on $R_{PU}$ must bring the comparator input voltage below the threshold. Consequently, the output of the comparator rises to $V_{DD}$ (ON output). All component values are designed so that this specific structure works correctly (additional details in A3). If the number of input dendrites is modified, the design should be repeated. Nevertheless, if the number of inputs of the single neuron is kept fixed, many neurons can be cascaded without the burden of redesigning. Two different design solutions are investigated for the PCP-FET: a High Performance (HP) and a Low Power (LP) version. We assume ideal voltage comparators, behaviourally described in Virtuoso with functional blocks. They are intended to verify the functionality of the designed synaptic neural layer, which is essential for neuromorphic computation. Similarly, the authors in [19] focused on the performance of a crossbar based on carbon nanotube transistor technology, leaving the op-amps, the soma-like circuit and the (off-line) weight updating to an external unit (PC) suitably connected to the crossbar. We postpone the molFET-based voltage comparator design to future works.

The advantages of the proposed neuron are the simple dendrite topology, the good threshold emulation, and the operability in the analog spiking regime as well.
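The firing rule of a single column can be illustrated with a simplified behavioural sketch. It assumes that each molFET contributes a fixed ON or OFF current regardless of its drain voltage, which the real devices do not do, and the values of `i_on`, `i_off`, `v_dd` and `r_pu` are hypothetical placeholders rather than the designed values of Table I. The threshold is placed halfway between the comparator input voltages obtained with three and four ON devices.

```python
def comparator_input(n_on, n_total=7, v_dd=1.0, r_pu=100e3,
                     i_on=1.0e-6, i_off=1.0e-9):
    """Drain-line voltage V_DD - R_PU * sum(I) for n_on devices in the ON state.

    Assumes bias-independent ON/OFF currents (a strong simplification).
    """
    i_total = n_on * i_on + (n_total - n_on) * i_off
    return v_dd - r_pu * i_total


def design_threshold():
    """Place V_th midway between the three-ON and four-ON input voltages."""
    return 0.5 * (comparator_input(3) + comparator_input(4))


def neuron_fires(inputs, v_th):
    """The comparator output goes high when its input falls below V_th."""
    n_on = sum(1 for x in inputs if x)
    return comparator_input(n_on, n_total=len(inputs)) < v_th


if __name__ == "__main__":
    v_th = design_threshold()
    for n in range(8):
        spikes = [1] * n + [0] * (7 - n)
        print(n, "inputs ON ->", "fires" if neuron_fires(spikes, v_th) else "idle")
```

With these placeholder numbers the neuron stays idle for up to three active inputs and fires from four onward, which is the majority behaviour described above.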

#### Neuron weights

So far, the proposed neuron considers all the inputs as having the same weight. To implement weights, we connect more transistors in parallel to a shared input line, as depicted in Fig. 4(b). These transistors can be dynamically connected to or disconnected from the input line through switches that are controlled by proper command signals. If more than one parallel transistor is connected to the line, the total current flowing in $R_{PU}$ due to that specific input spike is multiplied by the number of connected transistors, providing input pulse weighting capabilities. We tested the crossbar circuit with 2 and 3 transistors per input (i.e. 98 and 147 transistors in total), and we verified its correct functional behaviour (Fig. 4). Notice that this weighting method permits discrete weights only (i.e. the number of transistors connected to a given input line). The method easily enables real-time online training, yet it requires a complex feedback network for handling the weights. Fig. 6(c) shows an external control unit, placed in feedback to the layer of neurons, providing the command signals for weight updating. The control unit is not designed with molFETs and is not optimized since it only aims at verifying the functional behaviour of the overall closed-loop system (details in A4).
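The discrete weighting scheme can be mimicked with a small behavioural sketch in which a weight is simply the number of parallel transistors connected to an input line. The function names and the firing threshold of four unit currents (carried over from the unit-weight design) are assumptions made for illustration only.

```python
def weighted_sum(inputs, weights):
    """Number of conducting transistors: each active input counts `weight` times."""
    return sum(w for x, w in zip(inputs, weights) if x)


def fires(inputs, weights, threshold=4):
    """Neuron output, assuming the unit-weight threshold of four unit currents."""
    return weighted_sum(inputs, weights) >= threshold


def transistor_count(per_input, n_inputs=7, n_neurons=7):
    """Total transistors when every input of every neuron has `per_input` devices."""
    return per_input * n_inputs * n_neurons


if __name__ == "__main__":
    weights = [2, 1, 1, 1, 1, 1, 1]          # first input has weight 2
    print(fires([1, 1, 1, 0, 0, 0, 0], weights))  # weighted input plus two others
    print(fires([0, 1, 1, 1, 0, 0, 0], weights))  # three unit-weight inputs only
    print(transistor_count(2), transistor_count(3))
```

The last line reproduces the 98 and 147 transistors quoted for two and three devices per input.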

Another possible solution to implement weights is to exploit a back-gate electrode. It enables the modulation of the molFET conductance, permitting current modulation and thus run-time weighting. Nevertheless, this solution would require detailed device engineering to effectively implement the back gate, which is outside the scope of this work and is thus postponed to future works. This work aims at comparing the MOSFET and molFET technologies. In the following section, we analyse the neuron performance by considering equal input weights. Indeed, the performance ratio between the two technologies is the same whether or not weights are considered.

#### IV. RESULTS

According to the methodology described in section II, we first characterize the molFETs in QuantumATK. Fig. 3(a), (b), (c) show the current $I_{DS}(V_{DS})$ of the three molFETs at fixed $V_{GS}$. The considered gate voltages correspond to the molFET ON and OFF states: $V_{GS,ON}$, $V_{GS,OFF}$. Our results for the PCP- and OPE3-molFETs with null $V_{GS}$ are similar to the SAM-based experimental ones reported in [40]. To ease the comparison with the results in [40], we report such output characteristics in semi-logarithmic scale in Fig. 3(f). Fig. 5 reports additional device-level plots of the three molFETs. In this work we exploit the HOMO-type conduction (equivalent p-type) for the HDT- and the OPE3-molFETs, and the LUMO-type conduction (equivalent n-type) for the PCP-based one. We choose different molFET polarities to maximize $I_{ON}/I_{OFF}$ in all cases. For the HDT-FET only HOMO conduction is possible since the LUMO transmission peaks lie at energies above the BW (Fig. 5 (a)). The application of $V_{GS}$ perturbs the channel Molecular Orbitals (MOs), i.e. the probability density per unit volume to find an electron in a given spatial region. In particular, a suitable $V_{GS,ON}$ supports the electron delocalization in the channel, which, in turn, promotes the S-D electron tunnelling. T(E) is thus enhanced, increasing the current. As a case study, we consider the PCP-FET in detail. Fig. 3(d), (e) report its Lowest Unoccupied Molecular Orbital (LUMO) for $V_{GS,ON}$ and $V_{GS,OFF}$. When $V_{GS,OFF}$ is applied at the gate terminal, the LUMO is extremely localized on the left side, Fig. 3(e), creating a barrier on the left phenyl ring that prevents electrons from moving, through the LUMO, from the S (right electrode) to the D (left electrode), thus lowering T(E) and the currents. In confirmation of this, Fig. 3(g) shows the potential energy along the channel. The dashed circle highlights a small potential barrier for electrons on the phenyl ring close to the D, further enhancing the S-to-D barrier. We believe this barrier to be intimately linked with the LUMO shape at $V_{GS,OFF}$ in that region, as we confirm in the following. The situation changes when considering $V_{GS,ON}$, Fig. 3(d). The LUMO is well delocalized along the channel, supporting the electron tunnelling mechanism.
To better understand the transmission and confirm what has been said so far, we analyze the main transmission eigenstate (TE) within the bias window when the PCP-FET is ON. Fig. 3(h) shows a real space projection of the TE corresponding to the maximum transmission coefficient (eigenvalue). Along with all the others, such a TE contributes to the final transmission function in the energy domain T(E). The main TE resembles the LUMO when $V_{GS,ON}$ is applied, apart from a small phase (colour) mismatch, confirming that the main tunnelling path is through the LUMO, in agreement with the previous considerations. Further confirmation of the channel left-side barrier modulation using the gate voltage is possible through Mulliken's population analysis, a measure of the electronic charge distribution among the system atoms. When $V_{GS,ON}$ is applied, Mulliken's charge increases on the left phenyl ring atoms, especially for the left anchoring group (sulfur). The larger number of electrons populating that region indicates a decreased potential energy barrier. In summary, the gate voltage modulates the channel potential barrier not only for the specific case of the LUMO (i.e. the main tunnelling path), but for all the MOs, as a general property of the system. Moreover, it causes a charge redistribution in the channel, as mentioned in section II. The mentioned current modulation enabled by the gate voltage is similar to that of traditional MOSFETs.

Fig. 3. (a), (b), (c) Output characteristics for $V_{GS,ON}$ (blue) and for $V_{GS,OFF}$ (red) of the HDT-, OPE3-, and PCP-FETs, respectively; (d), (e) PCP-FET LUMO at fixed $V_{DS}=1$ V (i.e. the maximum operating value in the crossbar circuit) for $V_{GS,ON}$ and $V_{GS,OFF}$, respectively (colour indicates the phase); (f) Output characteristics for $V_{GS}=0$ V in semi-logarithmic scale for the OPE3- and PCP-FETs; (g) 1D potential energy along the device transport direction z. The geometry of the PCP molecule is superimposed on the curve to mark the molecule position along z. The potential barrier in correspondence of the left phenyl ring is highlighted with a dashed circle; (h) Main transmission eigenstate of the T(E) LUMO peak, when the PCP-FET is in the ON state.

We use input pulses on the gates to enhance the molFET T(E) for a limited time, with a consequent current increase. The $I_{DS}$ is then converted into a voltage drop on $R_{PU}$. This mechanism fits well with the neuron concept we described in section III and guarantees that the neuron input and output are voltage signals with the desired values. In particular, we verify the correct neuron behaviour in the spiking analog regime by performing DC and transient simulations. We analyse the circuit by using the described LUT models. For functional verification, we use the static model. Hence, we apply slow input spikes (1 ms rise/fall edges, 8 ms width), to ensure that no transient effects affect the final result. Fig. 4(a) reports the functional verification for unitary neuron weights. The neuron behaves as expected. Fig. 4(c) reports an example of functional verification for different neuron weights (1 ms rise/fall edges, 4 ms width). Again, the neuron behaves as expected. This result is also confirmed for all the possible weight values. In the following we report the neuron performance analyses.

#### A. Area

The main advantage of molecular electronics is the area reduction, which is gained thanks to the intrinsic nanometric size [30]. This work quantitatively compares the transistor area occupied with molecular and CMOS technologies. We neglect the area occupied by lines, comparators, and pull-up resistors. Indeed, comparators and $R_{PU}$ are supposed to be implemented with the same technology, and interconnections are supposed to be effectively scaled without heavily impacting parasitic resistances (e.g. by acting on innovative low-resistivity materials or interconnection processes); thus, the proportion between the areas is essentially unchanged. The analysis is meant to be a first attempt at a quantitative comparison, not a precise estimation of the layout area.

Fig. 4. (a) Neuron functional verification with unitary weights: the seven input spikes are the dashed lines, the comparator input is the solid green line, and the comparator output is the solid blue one. The neuron output is active when at least four of the seven neuron inputs are active simultaneously. (b) Sketch of the discrete implementation of neuron weights. Molecular transistors in parallel can be connected to or disconnected from a single input through the switches. Switches are controlled by proper command signals provided by a properly designed control unit for weight updating (see A4). (c) Functional verification with discrete weights: the red input weight is 2, all the others 1. The output is active when two inputs in addition to the red one are active.

The area of the whole crossbar array is computed as $W \cdot L \cdot N$, where W is the gate width, L is the channel length, and N is the number of transistors in the circuit. The molecular channel length is measured directly in QuantumATK as the molecule extension along the transport direction, excluding the electrodes. The resulting L is $2.45\,\mathrm{nm}$ for OPE3, $2.60\,\mathrm{nm}$ for HDT, and $2.44\,\mathrm{nm}$ for the PCP molecule. For all molFETs, we consider a standard width of $100\,\mathrm{\mathring{A}}$ for the gate terminals. The MOSFET channel length is assumed to be $32\,\mathrm{nm}$ and its width $250\,\mathrm{nm}$ (typical design value for a $32\,\mathrm{nm}$ n-MOSFET). The results in Table II exhibit the advantage of migrating from conventional MOSFET to molFET technology. More than four orders of magnitude in area reduction can be achieved, enabling more devices to be packed in the same area, favouring redundancy and enhancing the parallelism per unit area.
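The area arithmetic can be reproduced with a short sketch. The helper names are hypothetical; the dimensions are the ones quoted in the text, and the sketch reports both the single-transistor footprint $W \cdot L$ and the array total $W \cdot L \cdot N$ for N = 49.

```python
def device_area_nm2(width_nm, length_nm):
    """Footprint of a single transistor, W * L, in nm^2."""
    return width_nm * length_nm


def array_area_nm2(width_nm, length_nm, n_transistors=49):
    """Transistor area of the whole crossbar, W * L * N, in nm^2."""
    return device_area_nm2(width_nm, length_nm) * n_transistors


# (W, L) in nm, as quoted in the text.
GEOMETRIES = {
    "MOSFET": (250.0, 32.0),
    "OPE3-FET": (10.0, 2.45),
    "HDT-FET": (10.0, 2.60),
    "PCP-FET": (10.0, 2.44),
}

if __name__ == "__main__":
    for name, (w, l) in GEOMETRIES.items():
        print(f"{name}: device {device_area_nm2(w, l):.2f} nm^2, "
              f"array {array_area_nm2(w, l):.1f} nm^2")
```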

#### B. Average power

This work compares both the static and the dynamic average (active) power dissipated by the overall circuit. We consider only the contributions of the transistors and the pull-up resistors. The static power is estimated in static conditions (i.e. crossbar fed by DC generators and all transients extinguished) as $V \cdot I$, where V is the voltage drop across the component ($V_{DS}$ for the transistor, $V_R$ for the pull-up resistors) and I is the current flowing in the component ($I_{DS}$ for the transistor, $I_R$ for the pull-up resistors). This single power contribution is then multiplied by the number of transistors N and pull-up resistors M to evaluate the total dissipated static power as $N(V_{DS} \cdot I_{DS}) + M(V_R \cdot I_R)$. Table II reports the total dissipated static power $P_{tot,OFF}$, evaluated when the circuit is in an idle state (i.e. all the 49 transistors are OFF, all molFET current contributions correspond to leakages), and the dissipated static power $P_{tot,ON}$ when all transistors are in the ON state.

TABLE II
LIST OF RESULTS

| Technology  | A (nm$^2$) | $P_{leak}$ (nW) | $P_{tot,OFF}$ (nW) | $P_{tot,ON}$ (µW) | $P_{tot,dyn}$ (pW/pulse) | $\tau$ (ps) |
|-------------|------------|-----------------|--------------------|-------------------|--------------------------|-------------|
| MOSFET      | 392000     | 68.00           | 68.60              | 4542.5            | 3.334                    | 1.155       |
| OPE3-FET    | 24.54      | 10660.00        | 13110.00           | 36.59             | 0.798                    | 1.168       |
| HDT-FET     | 26.50      | 2.67            | 2.77               | 6.96              | 0.636                    | 2.212       |
| PCP-FET(LP) | 26.19      | 21.34           | 21.35              | 27.73             | 3.927                    | 3.046       |
| PCP-FET(HP) | 26.19      | 424.51          | 424.73             | 32.52             | 3.762                    | 1.751       |

The dissipated dynamic power is estimated as the energy provided to charge/discharge the device electrostatic capacitances over the time required to carry out the commutation. The total exchanged energy per commutation for a molFET (for a single charging or discharging phenomenon of the device electrostatic capacitances) is computed as:

$$E = \frac{1}{2}C_gV_G^2 + \frac{1}{2}C_sV_{dot}^2 + \frac{1}{2}C_d(V_{DS} - V_{dot})^2 \quad (2)$$

 $V_G$  is the voltage drop across  $C_g$ , whereas  $V_{dot}$ , computed as  $V_{DS} \cdot C_d/(C_s + C_d)$ , is the voltage drop across  $C_s$  (Fig. 6(a)). The obtained results show that migrating from the MOSFET to the molFET technology generally reduces the static and dynamic power. However, for the static power, the OPE3-FET presents a considerable leakage current  $I_{OFF}$  w.r.t. the HDT and PCP molFETs and also, surprisingly, w.r.t. the MOSFET. This strongly impacts  $P_{tot,OFF}$ . The large leakage is due to the weak electrostatic control of  $V_{GS}$  over the molecular channel, which prevents the OPE3-FET from switching OFF correctly: Fig. 3(b) shows that the ON current is only about six times the OFF one. The situation is different for the PCP-FET (LP) and the HDT-FET, thanks to the previously discussed effective gate modulation. The PCP-FET (LP, HP) presents a slightly larger dynamic power w.r.t. the MOSFET. This deterioration is due to its large gate voltage swing (twice that of the other molFETs and four times that of the MOSFET) and its slightly larger capacitance w.r.t. the other molFETs.
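The power estimates described above can be sketched in a few lines of Python. This is only an illustration of the static-power expression and of Eq. (2); the function names and the numerical values in the demo are hypothetical and are not the device parameters used in this work.

```python
def static_power(n_transistors, v_ds, i_ds, n_resistors, v_r, i_r):
    """Total static power N*(V_DS*I_DS) + M*(V_R*I_R), in watts."""
    return n_transistors * (v_ds * i_ds) + n_resistors * (v_r * i_r)


def switching_energy(c_g, c_s, c_d, v_g, v_ds):
    """Energy of Eq. (2) for one charging or discharging event, in joules."""
    # Capacitive divider between drain and source sets the channel ("dot") node.
    v_dot = v_ds * c_d / (c_s + c_d)
    return (0.5 * c_g * v_g ** 2
            + 0.5 * c_s * v_dot ** 2
            + 0.5 * c_d * (v_ds - v_dot) ** 2)


def dynamic_power(energy, commutation_time):
    """Average dynamic power: exchanged energy over the commutation time."""
    return energy / commutation_time


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formulas.
    energy = switching_energy(c_g=1e-19, c_s=1e-19, c_d=1e-19, v_g=1.0, v_ds=1.0)
    print(f"E        = {energy:.3e} J")
    print(f"P_dyn    = {dynamic_power(energy, 1e-12):.3e} W")
    print(f"P_static = {static_power(49, 1.0, 1e-9, 7, 0.1, 7e-9):.3e} W")
```

With symmetric coupling ($C_s = C_d$) the divider places $V_{dot}$ at half of $V_{DS}$, so the source and drain terms of Eq. (2) contribute equally.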

#### C. Speed

The LUT-based dynamic model permits estimating the minimum reliable intrinsic time  $\tau$ , which we assume to be seven times the time constant of the exponential transient. Such an intrinsic time  $\tau$  is a device figure of merit, and it differs from the contact intrinsic time mentioned in section II, which in turn is a measure of the quality of the contact-molecule interface. We compute  $\tau$  as follows. A single device is considered, in the same topology in which it is placed in the designed crossbar (Fig. 6(b)). It is thus connected to an  $R_{PU}$  with the same value as the one in the crossbar, and it is fed by the design  $V_{DD}$ . Then an ideal step is applied to its gate. The drain node is monitored, and the transient time constant is evaluated from the Cadence simulated data through the well-known graphical method of circuit theory. Then,  $\tau$  is taken as the transient duration, i.e. seven times the estimated time constant (worst case). Table II reports the intrinsic time for each considered technology. The testing topology and examples of simulation results are reported in Fig. 6.
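As a rough illustration of this procedure, the Python sketch below estimates the time constant of a sampled drain-node transient and multiplies it by seven. It assumes a single-exponential transient and locates the instant at which $1 - 1/e$ of the swing has been covered; this is one common way to read a time constant from a waveform and is not necessarily the exact graphical construction used for Fig. 6. The waveform in the demo is synthetic.

```python
import math


def time_constant_from_transient(times, values):
    """Estimate the exponential time constant from sampled transient data.

    Finds the instant at which the waveform has covered (1 - 1/e) of the
    swing between its first and last samples, using linear interpolation
    between neighbouring samples.
    """
    v_start, v_end = values[0], values[-1]
    target = v_start + (1.0 - math.exp(-1.0)) * (v_end - v_start)
    for k in range(1, len(times)):
        v_prev, v_curr = values[k - 1], values[k]
        # The target is crossed when it lies between two consecutive samples.
        if (v_prev - target) * (v_curr - target) <= 0.0 and v_prev != v_curr:
            frac = (target - v_prev) / (v_curr - v_prev)
            return times[k - 1] + frac * (times[k] - times[k - 1]) - times[0]
    raise ValueError("transient never reaches the 63.2 % level")


def intrinsic_time(times, values, factor=7.0):
    """Intrinsic time, taken as `factor` times the estimated time constant."""
    return factor * time_constant_from_transient(times, values)


if __name__ == "__main__":
    # Synthetic falling transient with a hypothetical 0.2 ps time constant.
    tc_true = 0.2e-12
    t = [i * 1e-15 for i in range(5001)]
    v = [0.5 + 0.5 * math.exp(-ti / tc_true) for ti in t]
    print(f"estimated time constant: {time_constant_from_transient(t, v):.3e} s")
    print(f"intrinsic time:          {intrinsic_time(t, v):.3e} s")
```

The estimate is only meaningful if the record is long enough for the last sample to represent the settled value.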

In terms of intrinsic time (i.e. speed), migrating from MOSFETs to molFETs does not bring benefits. The OPE3-FET and PCP-FET (HP) are only slightly slower than the MOSFET, thanks to their significant ON currents. The OPE3-FET is not wholly switched OFF in the circuit; hence, it can be switched ON rapidly w.r.t. the other molFETs, but still more slowly than the MOSFET. In contrast, the HDT-FET and PCP-FET (LP) show a significant reduction of the maximum achievable speed. A different molFET operating point may be used to tune the performance and satisfy possible design constraints (in analogy with MOSFETs), as confirmed by the notable improvement obtained with the PCP-FET HP w.r.t. the LP one.

### V. CONCLUSIONS

We implement a simple and effective LUT-based model for molFETs, and we use it to design an analog artificial neuron. We design and functionally verify a molFET-based neural layer composed of seven artificial analog neurons sharing the same inputs. The neuron performance is compared to a conventional MOSFET implementation to highlight the advantages of molFET technology. From our results, the performance gain obtained by migrating from MOSFETs to molFETs generally depends on the specific molecule used as channel. The HDT molecule reduces the dissipated leakage power by a factor of about 25 (68 nW vs. 2.67 nW) and the dissipated ON static power by a factor of 650. The dynamic power per pulse is also reduced, by about a factor of 5. In terms of area, the advantage is remarkable and independent of the molecule: a reduction of four orders of magnitude can be achieved  $(392000 \,\mathrm{nm}^2 \,\mathrm{vs.}\, 25 \,\mathrm{nm}^2)$ . The drawback concerns the speed of molFET devices: because of their small ON currents, the intrinsic times of the HDT-FET and PCP-FET (LP) are respectively about double and triple that of the MOSFET. We believe that the MOSFET-molFET comparison carried out in this work can be generalized to architectures other than neuromorphic ones, leading to similar advantages.

An important outcome of our work is that molFETs present significant performance variations depending on the considered molecule. This work motivates further investigations, at the device level, to find the best molecule for a given application (e.g. low power/high speed). For speed requirements, we believe a suitable molecule can be used to improve the performance, at the cost of dissipated power, in analogy to what happens for MOSFETs. Moreover, depending again on the molecule, molFETs may present exclusive features, like the Negative-Differential-Resistance (NDR) trend in the output characteristics of the PCP-FET (Fig. 3(c)). We believe NDR can be exploited in novel circuit topologies, implementing standard or innovative functionalities. For example, the NDR peak over a  $V_{DS}$  sweep resembles the typical neuron action potential. We leave the investigation of NDR for molFET neuromorphic circuits to future work.

To sum up, for all these reasons, our results motivate future investigations on molecular-based circuits, at both the technological and the system level.

### VI. APPENDIX

**A1. LUT-based models.** The model that we refer to as the static LUT-based model consists of a VerilogA file that simply calls a LUT.txt file, which collects the drain current values for each pair of driving voltages  $(V_{GS}, V_{DS})$  (simulated *a priori*, only once, with QuantumATK), and interpolates them by means of a third-order spline function (the command at line 12 of the file).
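To give a feeling for how such a static LUT-based model behaves, the following Python sketch looks up the drain current on a rectangular $(V_{GS}, V_{DS})$ grid. It is not the VerilogA file referred to above: for brevity it uses bilinear interpolation as a simpler stand-in for the third-order spline, and the class name, the clamping at the table edges and the grid values are all assumptions made for illustration.

```python
from bisect import bisect_right


class StaticLut:
    """Drain current look-up table on a rectangular (V_GS, V_DS) grid."""

    def __init__(self, vgs_axis, vds_axis, ids_grid):
        # ids_grid[i][j] is the current at (vgs_axis[i], vds_axis[j]).
        self.vgs_axis = list(vgs_axis)
        self.vds_axis = list(vds_axis)
        self.ids_grid = [list(row) for row in ids_grid]

    @staticmethod
    def _locate(axis, x):
        """Return the lower cell index and the fractional position inside it."""
        x = min(max(x, axis[0]), axis[-1])  # clamp to the table range
        i = min(bisect_right(axis, x) - 1, len(axis) - 2)
        return i, (x - axis[i]) / (axis[i + 1] - axis[i])

    def ids(self, vgs, vds):
        """Bilinear interpolation of the drain current at (vgs, vds)."""
        i, u = self._locate(self.vgs_axis, vgs)
        j, w = self._locate(self.vds_axis, vds)
        g = self.ids_grid
        return ((1 - u) * (1 - w) * g[i][j] + u * (1 - w) * g[i + 1][j]
                + (1 - u) * w * g[i][j + 1] + u * w * g[i + 1][j + 1])


if __name__ == "__main__":
    # Made-up table (currents in amperes), not QuantumATK data.
    lut = StaticLut(
        vgs_axis=[0.0, 0.5, 1.0],
        vds_axis=[0.0, 0.5, 1.0],
        ids_grid=[[0.0, 1e-8, 2e-8],
                  [0.0, 5e-8, 9e-8],
                  [0.0, 2e-7, 3e-7]],
    )
    print(f"I_DS(0.75 V, 0.25 V) = {lut.ids(0.75, 0.25):.3e} A")
```

A spline interpolant additionally provides continuous derivatives, which generally helps a circuit simulator converge; the bilinear version only shows the table-lookup mechanics.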

**A2. Capacitance modeling and evaluation for molFETs.** The gate capacitance is calculated by exploiting the parallel plate approximation as  $C_g = (Area \cdot \epsilon_R \cdot \epsilon_0)/t_{OX}$ , where  $t_{OX}$  is the gate oxide physical thickness,  $\epsilon_0$  the vacuum permittivity, and  $\epsilon_R$  the relative permittivity of the gate dielectric (ZrO<sub>2</sub>). The gate resistance is evaluated as  $R_g = (\rho \cdot t_{OX})/Area$ , where  $\rho$  is the ZrO<sub>2</sub> resistivity. Source and drain capacitances are calculated by using the model presented in [42]. They represent the (average) channel charge modulation in response to an applied voltage and account for the molecule state filling (i.e. the quantum capacitance  $C_q$ ). The equilibrium  $C_q$  is evaluated from the definition [33], [42]:  $C_q = q^2 DOS(E_F)$ , where $q$ is the elementary charge, $DOS(E)$ the Density of States of the molecular channel, and  $E_F$  the Fermi level. The non-equilibrium  $C_q$  is evaluated with the same formula, but with the DOS averaged (arithmetic mean) over the energy range of the bias window (BW). In purely ballistic transport, the channel is modelled as a node of an electrical network, leading to the model of Fig. 6(a). From this model, the (space-averaged) modulation of the potential energy of the molecular channel due to the external voltages is:

$$\delta U_{tot,AV} = -q \frac{C_g}{C_{ES} + C_q} \delta V_{GS} - q \frac{C_d}{C_{ES} + C_q} \delta V_{DS} \quad (3)$$

where  $C_{ES} = C_g + C_s + C_d$ . Starting from the simulated output and transcharacteristics, the capacitive ratios can be derived. Knowing  $C_g$  and  $C_q$  and assuming symmetric coupling  $(C_s = C_d)$ , the source and drain capacitances can then be determined:

$$\begin{aligned} \frac{\delta U_{tot,AV}}{\delta V_{GS}}\Big|_{V_{DS}=0} &= -q \frac{C_g}{C_{ES} + C_q} \Rightarrow \frac{C_g}{C_{ES} + C_q} \\ \frac{\delta U_{tot,AV}}{\delta V_{DS}}\Big|_{V_{GS}=0} &= -q \frac{C_d}{C_{ES} + C_q} \Rightarrow \frac{C_d}{C_{ES} + C_q} \end{aligned} \quad (4)$$

The symbol  $\delta$  indicates a deviation from the DC operating point. A wide voltage variation can be seen as a sequence of small perturbations. The maximum capacitance value obtained can then be used in the LUT model as a worst-case approximation. The source and drain "dynamic" resistances measure the charge transferred per unit time between the contacts and the molecule. They are related to the  $I_{DS}(V_{DS})$  slope, and they depend on the quantum capacitance. The source contribution to the current is:

$$\begin{aligned} I_{S} &= \frac{q}{\tau_{S}} \int_{-\infty}^{+\infty} DOS(E - U_{tot,AV}) [f_{S}(E) - f(E, E_{Fdot})] \, dE \\ &\approx \frac{C_q}{\tau_{S}} \frac{E_{FS} - E_{Fdot}}{q} \Rightarrow R_{s} = \frac{\tau_{S}}{C_{q}} = \frac{\hbar}{C_{q}(E S_{SD} - H_{SD})} \end{aligned} \quad (5)$$

 $S_{SD}$  and  $H_{SD}$  are the source-device overlap and coupling Hamiltonian matrices. Analogous equations hold for the drain.
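The capacitance evaluation of this appendix can be sketched numerically. With $C_s = C_d$, the drain ratio of Eq. (4) reads $b = C_d/(C_g + 2C_d + C_q)$, which can be solved as $C_d = b\,(C_g + C_q)/(1 - 2b)$. The Python sketch below implements the parallel-plate expression for $C_g$ and this closed form; the function names and all numerical values in the demo (gate area, permittivity, oxide thickness, $C_q$, ratio) are hypothetical placeholders, not the values extracted in this work.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def gate_capacitance(area, eps_r, t_ox):
    """Parallel-plate gate capacitance C_g = Area * eps_R * eps_0 / t_OX."""
    return area * eps_r * EPS0 / t_ox


def source_drain_capacitance(c_g, c_q, drain_ratio):
    """Solve drain_ratio = C_d / (C_g + C_s + C_d + C_q) assuming C_s = C_d."""
    if not 0.0 <= drain_ratio < 0.5:
        raise ValueError("with C_s = C_d the drain ratio must lie in [0, 0.5)")
    return drain_ratio * (c_g + c_q) / (1.0 - 2.0 * drain_ratio)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    c_g = gate_capacitance(area=2.5e-9 * 10e-9, eps_r=25.0, t_ox=2e-9)
    c_q = 1e-19          # assumed quantum capacitance, F
    b = 0.2              # assumed drain ratio read from the characteristics
    c_d = source_drain_capacitance(c_g, c_q, b)
    print(f"C_g = {c_g:.3e} F, C_s = C_d = {c_d:.3e} F")
```

The bound on the ratio follows directly from the symmetric-coupling assumption: since $C_s + C_d = 2C_d$ is part of the denominator, $b$ can never reach 0.5.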

**A3. Neural layer design.** The design was performed by maximizing the ON current of the molFET in order to maximize the voltage drop across  $R_{PU}$  when the input is active. This yields the maximum separation in terms of comparator input voltage (called local field). The NDR trend in molFETs must be accounted for in this process: depending on how many inputs are simultaneously active, the drain voltage and the current change (the output characteristics are not flat). Thus, one can choose  $R_{PU}$  to have a maximum  $I_{DS}$  (at  $V_{DS} = 1.1 \, \mathrm{V}$  in the PCP-molFET case) when 4 out of 7 inputs are active. This leads to a maximum variation of the local field when 3 or 4 inputs are active, i.e. maximum noise margins. In formulae, the comparator input voltage is equal to the molFET  $V_{DS}$ :

$$V_{DS} = V_{DD} - R_{PU} \left[ \sum_{ON} I_{DS_{ON}} + \sum_{OFF} I_{DS_{OFF}} \right] \quad (6)$$

In the design, once the ON and OFF working points are chosen (i.e. the ON and OFF gate voltages and the corresponding current values are fixed),  $V_{DD}$  and  $R_{PU}$  are fixed to obtain the desired  $V_{DS}$  when a desired number of transistors is active. In the aforementioned example it is 1.1 V when 4 PCP-molFETs are ON (HP case). In this way, the neuron behaves at its best when 4 inputs are active, i.e. around its threshold, in the condition in which it must discriminate between ON and OFF states. The behavior in the other cases will be "less ideal". The cases of 3 and 4 ON inputs can then be considered, and the comparator threshold can be fixed to the mean of the two corresponding local fields (to maximize the noise margins).
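The sizing procedure can be illustrated with a short Python sketch based on Eq. (6). It makes a strong simplification that the text explicitly warns about: the ON and OFF currents are treated as constants, independent of $V_{DS}$, so NDR is ignored. The function names, the supply voltage and the current values in the demo are hypothetical; only the 1.1 V target with 4 active inputs out of 7 comes from the example above.

```python
def pull_up_resistance(v_dd, v_target, i_on, i_off, n_on, n_inputs=7):
    """R_PU from Eq. (6) such that V_DS = v_target when n_on inputs are active."""
    total_current = n_on * i_on + (n_inputs - n_on) * i_off
    return (v_dd - v_target) / total_current


def local_field(v_dd, r_pu, i_on, i_off, n_on, n_inputs=7):
    """Comparator input voltage V_DS of Eq. (6)."""
    return v_dd - r_pu * (n_on * i_on + (n_inputs - n_on) * i_off)


def comparator_threshold(v_dd, r_pu, i_on, i_off, n_inputs=7):
    """Mean of the local fields obtained with 3 and 4 active inputs."""
    v3 = local_field(v_dd, r_pu, i_on, i_off, 3, n_inputs)
    v4 = local_field(v_dd, r_pu, i_on, i_off, 4, n_inputs)
    return 0.5 * (v3 + v4)


if __name__ == "__main__":
    # Hypothetical supply and currents; constant-current simplification.
    v_dd, i_on, i_off = 2.0, 1e-6, 1e-8
    r_pu = pull_up_resistance(v_dd, v_target=1.1, i_on=i_on, i_off=i_off, n_on=4)
    print(f"R_PU = {r_pu:.3e} ohm")
    for n in range(8):
        print(f"{n} inputs ON -> V_DS = {local_field(v_dd, r_pu, i_on, i_off, n):.3f} V")
    print(f"threshold = {comparator_threshold(v_dd, r_pu, i_on, i_off):.3f} V")
```

In an actual design the currents would be read from the device characteristics at each resulting $V_{DS}$, so Eq. (6) would have to be solved self-consistently rather than in closed form.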

Fig. 5. Additional device level plots. (a) Transmission spectrum at equilibrium of the three molFETs. Energy is in absolute values, the Fermi level is at  $-9.5\,\mathrm{eV}$  in these plots. (b) Transmission spectrum at  $V_{GS}=0\,\mathrm{V}$  as a function of  $V_{DS}$  and energy. For the PCP-molFET, for an applied drain bias above 1 V, the LUMO level is shifted outside the bias window, thus leading to less transmission and less current. This explains the NDR behaviour (dashed red circle). A larger drain voltage makes the shift more evident. (c) Transcharacteristics  $I_{DS}(V_{GS})$  of the three molFETs at fixed  $V_{DS}=0.1\,\mathrm{V}$  and  $V_{DS}=1\,\mathrm{V}$ .

Fig. 6. (a) Capacitive model of molFETs. (b) Circuit topology used for time constant evaluation. (c) The block *Neuron Layer* is the artificial neuron layer presented in this work, *Weight Adjust* is the control unit placed in feedback for weight updating; in1 to in7 are the neuron inputs, out1 to out7 are the neuron outputs, w1 to w7 are the switch control signals. (d),(e),(f),(g) Examples of transient simulations for time constant evaluation for the three molFETs. The topology tested is the one shown in (b); the interpolations are performed in MATLAB; the red circles highlight the intercept from which the intrinsic times are estimated as seven times the decay interval.

**A4. Feedback control unit for weight updating.** To implement the weights, we connected several transistors in parallel to the same input line. The transistors are dynamically connected to or disconnected from the input line through switches (Fig. 4(b)). The switches are controlled by command signals provided by an external control unit. The weight is encoded in the output current collected in the common drain line; thus, the number of connected parallel transistors encodes (discrete) weights. As a proof of concept, and in order to verify the functional behavior of the system, we designed, by means of a VHDL description, a digital control unit to activate the switches. It is placed in a feedback loop around the 7-input neuron, as sketched in Fig. 6(c). The control unit provides the command signals to the neuron and thus updates the weights according to Hebb's rule. In particular, it has an asynchronous (event-driven) interface, and it oversamples the neuron inputs and output with a clock period much shorter than the assumed spike duration. It counts the number of asynchronous events happening at the neuron inputs and output, and increases the weight of a synapse (i.e. connects more parallel transistors) if that input is active when an output is produced. Moreover, it reduces the weights of the unused inputs. The number of input events that trigger an output event before the weight is updated was arbitrarily set to 3, 5, 7 and 10. In all cases, the control unit correctly increased/decreased the weights of active/inactive inputs as soon as the output became active.
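The behavior of such a control unit can be mimicked with a small behavioral model. The Python sketch below is not the VHDL description used in this work: the class name, the weight range (0 to `max_weight` parallel transistors), the initial weight of 1 and the choice of updating after a fixed number of output events are assumptions, representing one possible reading of the rule described above.

```python
class HebbianWeightController:
    """Behavioral model of a Hebbian weight-update unit for one neuron."""

    def __init__(self, n_inputs=7, max_weight=4, events_per_update=3):
        # Each weight is the number of parallel transistors switched in.
        self.weights = [1] * n_inputs
        self.max_weight = max_weight
        self.events_per_update = events_per_update
        self._coincidences = [0] * n_inputs  # input active while output fires
        self._output_events = 0

    def observe(self, inputs, output):
        """Register one sampled event; `inputs` holds 0/1 flags, `output` is 0/1."""
        if not output:
            return
        self._output_events += 1
        for k, active in enumerate(inputs):
            if active:
                self._coincidences[k] += 1
        if self._output_events >= self.events_per_update:
            self._update()

    def _update(self):
        """Strengthen inputs that coincided with the output, weaken the others."""
        for k, hits in enumerate(self._coincidences):
            if hits > 0:
                self.weights[k] = min(self.weights[k] + 1, self.max_weight)
            else:
                self.weights[k] = max(self.weights[k] - 1, 0)
        self._coincidences = [0] * len(self.weights)
        self._output_events = 0


if __name__ == "__main__":
    ctrl = HebbianWeightController(n_inputs=7, events_per_update=3)
    pattern = [1, 1, 0, 0, 1, 0, 1]  # hypothetical input activity
    for _ in range(3):
        ctrl.observe(pattern, output=1)
    print("weights after one update:", ctrl.weights)
```

In hardware, each entry of `weights` would map to the switch control signals (w1 to w7 in Fig. 6(c)) that connect or disconnect the parallel transistors.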

## REFERENCES

- [1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in *Advances in Neural Information Processing Systems*, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds., vol. 25. Curran Associates, Inc., 2012.
- [2] G. W. Lindsay, "Convolutional neural networks as a model of the visual system: Past, present, and future," *Journal of Cognitive Neuroscience*, vol. 33, no. 10, pp. 2017–2031, Sep 2021.
- [3] A. A. Dibazar, H. H. Namarvar, and T. W. Berger, "Continuous speech recognition using dynamic synapse neural network," *The Journal of the Acoustical Society of America*, vol. 115, no. 5, pp. 2612–2612, 2004.
- [4] M. J. Bianco, P. Gerstoft, J. Traer, E. Ozanich, M. A. Roch, S. Gannot, and C.-A. Deledalle, "Machine learning in acoustics: Theory and applications," *The Journal of the Acoustical Society of America*, vol. 146, no. 5, pp. 3590–3628, 2019.
- [5] H. Kuang, J. Wang, R. Li, C. Feng, and X. Zhang, "Automated data-processing function identification using deep neural network," *IEEE Access*, vol. 8, pp. 55411–55423, 2020.
- [6] A. J. Kell and J. H. McDermott, "Deep neural network models of sensory systems: windows onto the role of task constraints," *Current Opinion in Neurobiology*, vol. 55, pp. 121–132, 2019, Machine Learning, Big Data, and Neuroscience.
- [7] W. Hu, L. Wan, Y. Jian, K. Jin, X. Bai, H. Haick, M. Yao, and W. Wu, "Electronic noses: From advanced materials to sensors aided with data processing," *Advanced Materials Technologies*, vol. 4, no. 2, 2019.
- [8] E. Izhikevich, "Simple model of spiking neurons," *IEEE Transactions on Neural Networks*, 2003.
- [9] A. Coates, B. Huval, T. Wang, D. Wu, B. Catanzaro, and A. Ng, "Deep learning with COTS HPC systems," in *Proceedings of the 30th International Conference on Machine Learning*, ser. Proceedings of Machine Learning Research, S. Dasgupta and D. McAllester, Eds., vol. 28, no. 3. Atlanta, Georgia, USA: PMLR, 17–19 Jun 2013.
- [10] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," *IEEE Transactions on Pattern Analysis and Machine Intelligence*, 2013.
- [11] E. Alberti, A. Tavera, C. Masone, and B. Caputo, "IDDA: A large-scale multi-domain dataset for autonomous driving," *IEEE Robotics and Automation Letters*, 2020.
- [12] L. Cavigelli, M. Magno, and L. Benini, "Accelerating real-time embedded scene labeling with convolutional networks," in *Proceedings of the 52nd Annual Design Automation Conference*, 2015.
- [13] L. Cavigelli, D. Gschwend, C. Mayer, S. Willi, B. Muheim, and L. Benini, "Origami: A convolutional network accelerator," in *Proceedings of the 25th edition on Great Lakes Symposium on VLSI*, 2015.
- [14] K. Ovtcharov, O. Ruwase, J.-Y. Kim, J. Fowers, K. Strauss, and E. Chung, "Accelerating deep convolutional neural networks using specialized hardware," February 2015.
- [15] D.-A. Nguyen, H.-H. Ho, D.-H. Bui, and X.-T. Tran, "An efficient hardware implementation of artificial neural network based on stochastic computing," in *2018 5th NAFOSTED Conference on Information and Computer Science (NICS)*, 2018, pp. 237–242.
- [16] R. Andri, L. Cavigelli, D. Rossi, and L. Benini, "YodaNN: An ultra-low power convolutional neural network accelerator based on binary weights," in *2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)*, 2016.
- [17] A. Kumar Kamal and J. Singh, "Simulation-based ultralow energy and high-speed LIF neuron using silicon bipolar impact ionization MOSFET for spiking neural networks," *IEEE Transactions on Electron Devices*, 2020.
- [18] S. Dutta, V. Kumar, A. Shukla, N. R. Mohapatra, and U. Ganguly, "Leaky integrate and fire neuron by charge-discharge dynamics in floating-body MOSFET," *Scientific Reports*, 2017.
- [19] C.-L. Chen, K. Kim, Q. Truong, A. Shen, Z. Li, and Y. Chen, "A spiking neuron circuit based on a carbon nanotube transistor," *Nanotechnology*, vol. 23, no. 27, p. 275202, Jun 2012.
- [20] M. Sharad, C. Augustine, G. Panagopoulos, and K. Roy, "Spin-based neuron model with domain-wall magnets as synapse," *IEEE Transactions on Nanotechnology*, vol. 11, no. 4, pp. 843–853, 2012.
- [21] A. Balliou, J. Pfleger, G. Skoulatakis, S. Kazim, J. Rakusan, S. Kennou, and N. Glezos, "Programmable molecular-nanoparticle multi-junction networks for logic operations," in *Proceedings of the 14th IEEE/ACM International Symposium on Nanoscale Architectures*. Association for Computing Machinery, 2018, pp. 37–43.
- [22] E. Toomey, K. Segall, and K. K. Berggren, "Design of a power efficient artificial neuron using superconducting nanowires," *Frontiers in Neuroscience*, vol. 13, p. 933, 2019.
- [23] M. Song, W. Duan, S. Zhang, Z. Chen, and L. You, "Power and area efficient stochastic artificial neural networks using spin-orbit torque-based true random number generator," *Applied Physics Letters*, vol. 118, no. 5, p. 052401, 2021.
- [24] D. Kaushik, U. Singh, U. Sahu, I. Sreedevi, and D. Bhowmik, "Comparing domain wall synapse with other non volatile memory devices for on-chip learning in analog hardware neural network," *AIP Advances*, vol. 10, no. 2, p. 025111, 2020.
- [25] J. Y. Kim, M.-J. Choi, and H. W. Jang, "Ferroelectric field effect transistors: Progress and perspective," *APL Materials*, vol. 9, no. 2, p. 021102, 2021.
- [26] B. Kiraly, E. J. Knol, W. M. J. van Weerdenburg, H. J. Kappen, and A. A. Khajetoorians, "An atomic Boltzmann machine capable of self-adaption," *Nature Nanotechnology*, 2021.
- [27] V. Dubois, S. N. Raja, P. Gehring, S. Caneva, H. S. J. van der Zant, F. Niklaus, and G. Stemme, "Massively parallel fabrication of crack-defined gold break junctions featuring sub-3 nm gaps for molecular devices," *Nature Communications*, vol. 9, no. 1, p. 3433, Aug 2018.
- [28] H. Song, Y. Kim, Y. H. Jang, H. Jeong, M. A. Reed, and T. Lee, "Observation of molecular orbital gating," *Nature*, vol. 462, no. 7276, pp. 1039–1043, Dec 2009.
- [29] D. Xiang, H. Jeong, D. Kim, T. Lee, Y. Cheng, Q. Wang, and D. Mayer, "Three-terminal single-molecule junctions formed by mechanically controllable break junctions with side gating," *Nano Letters*, 2013.
- [30] J. M. Tour, "Molecular electronics. Synthesis and testing of components," *Accounts of Chemical Research*, vol. 33, no. 11, pp. 791–804, 2000.
- [31] Y. Ardesi, A. Pulimeno, M. Graziano, F. Riente, and G. Piccinini, "Effectiveness of molecules for quantum cellular automata as computing devices," *Journal of Low Power Electronics and Applications*, vol. 8, no. 3, 2018.
- [32] Y. Ardesi, G. Turvani, M. Graziano, and G. Piccinini, "SCERPA simulation of clocked molecular field-coupling nanocomputing," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 29, no. 3, pp. 558–567, 2021.
- [33] S. Datta, *Quantum transport: atom to transistor*. Cambridge University Press, 2005.
- [34] A. Zahir, S. A. A. Zaidi, A. Pulimeno, M. Graziano, D. Demarchi, G. Masera, and G. Piccinini, "Molecular transistor circuits: From device model to circuit simulation," 07 2014, pp. 129–134.
- [35] QuantumATK version Q-2019.12, Synopsys QuantumATK www.synopsys.com/silicon/quantumatk.html.
- [36] S. Datta, "Nanoscale device modeling: the Green's function method," *Superlattices and Microstructures*, vol. 28, no. 4, 2000.
- [37] D. Stradi, U. Martinez, A. Blom, M. Brandbyge, and K. Stokbro, "General atomistic approach for modeling metal-semiconductor interfaces using density functional theory and nonequilibrium Green's function," *Physical Review B*, vol. 93, no. 15, p. 155302, 2016.
- [38] S. Datta, *Electronic transport in mesoscopic systems*. Cambridge University Press, 1995.
- [39] S. Karthäuser, "Control of molecule-based transport for future molecular devices," *Journal of Physics: Condensed Matter*, vol. 23, no. 1, p. 013001, Nov 2010.
- [40] C. Jia, M. Famili, M. Carlotti, Y. Liu, P. Wang, I. M. Grace, Z. Feng, Y. Wang, Z. Zhao, M. Ding, X. Xu, C. Wang, S.-J. Lee, Y. Huang, R. C. Chiechi, C. J. Lambert, and X. Duan, "Quantum interference mediated vertical molecular tunneling transistors," *Science Advances*, vol. 4, no. 10, 2018.
- [41] F. Zahid, M. Paulsson, E. Polizzi, A. W. Ghosh, L. Siddiqui, and S. Datta, "A self-consistent transport model for molecular conduction based on extended Hückel theory with full three-dimensional electrostatics," *The Journal of Chemical Physics*, vol. 123, 2005.
- [42] M. Baldo, *Introduction to Nanoelectronics*. MIT OpenCourseWare Publication, 2011.
- [43] M. A. Reed, C. Zhou, M. R. Deshpande, C. J. Muller, T. P. Burgin, L. Jones II, and J. M. Tour, "The electrical measurement of molecular junctions," *Annals of the New York Academy of Sciences*, vol. 852, no. 1, pp. 133–144, 1998.
- [44] L. Huichu, S. Vinay, N. Vijaykrishnan, and D. Suman, "III-V tunnel FET model," 2015. [Online]. Available: https://nanohub.org/publications/12/2
- [45] W. Wang, H. Xu, Z. Huang, L. Zhang, H. Wang, S. Jiang, M. Xu, and J. Gao, "Channel and gate workfunction-engineered CNTFETs for low-power and high-speed logic and memory applications," *JSTS: Journal of Semiconductor Technology and Science*, 2016.
- [46] M. Graziano, A. Zahir, A. Mahmoud, A. Pulimeno, G. Piccinini, and P. Lugli, "Hierarchical modeling of OPV-based crossbar architectures," 08 2014.
- [47] A. Mahmoud and P. Lugli, "Toward circuit modeling of molecular devices," *IEEE Transactions on Nanotechnology*, vol. 13, no. 3, pp. 510–516, 2014.
- [48] W. Liu and C. Hu, *BSIM4 and MOSFET Modeling for IC Simulation*. World Scientific Publishing, Singapore, 2011.
- [49] E. Mtui, G. Gruener, and P. Dockery, *Fitzgerald's Clinical Neuroanatomy and Neuroscience*. Elsevier, 2020.
- [50] J. Starzyk and Basawaraj, "Memristor crossbar architecture for synchronous neural networks," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 61, pp. 2390–2401, Aug 2014.
- [51] N. Dey, J. Sharda, U. Saxena, D. Kaushik, U. Singh, and D. Bhowmik, "On-chip learning in a conventional silicon MOSFET based analog hardware neural network," 07 2019.