# POLITECNICO DI TORINO Repository ISTITUZIONALE

## Magnetic QCA Design: Modeling, Simulations and Circuits

Original

Magnetic QCA Design: Modeling, Simulations and Circuits / Graziano, Mariagrazia; Vacca, Marco; Zamboni, Maurizio - In: Cellular Automata - Innovative Modelling for Science and Engineering / Salcido A.. - Vienna : INTECH, 2011. - ISBN 9789533071725. - pp. 37-56

*Availability:* This version is available at: 11583/2375487 since:

Publisher: INTECH


# Magnetic QCA Design: Modeling, Simulation and Circuits

Mariagrazia Graziano, Marco Vacca and Maurizio Zamboni Electronics Department, Politecnico di Torino Italy

### 1. Introduction

QCA, in their general meaning, are bistable cells coupled through electrostatic forces. There are two main implementations of this principle: molecular and magnetic QCA. In molecular QCA the base cell is a complex molecule with several oxidation-reduction (redox) centres. Molecular QCA have great potential, mainly because of the high speed they can reach, but they are not expected to be feasible with current or near-term technology. Magnetic QCA, on the contrary, where the base cells are single-domain, pill-shaped nanomagnets, are feasible right now using high-end electron beam lithography (EBL). Single-domain nanomagnets exhibit two stable magnetic states, "up" and "down", and thus carry binary information, which propagates through their reciprocal influence in a "domino" effect. Even though their prospective operating frequency is not high, they are interesting because they offer a real possibility of building a fully magnetic electronic circuit, with all the inherent advantages that this implies: extremely low power consumption and intrinsic memory ability, i.e. the possibility of implementing circuits with mixed computational and memory abilities.

Many works have been proposed in recent years, either on circuits and architectures (see for example (1),(2),(3)) or on the detailed physical analysis of nanomagnet behavior (see for example (4)). The former show that, should technology support it, computation would be feasible and many traditional CMOS-based digital implementations could be adopted. In the latter, experiments are carried out to demonstrate the feasibility of the MQCA idea using currently available production steps. However, the two aspects are seldom integrated, so the design approach is split into two separate points of view, whereas, as experienced especially in the recent CMOS technology success story, the strength of a design methodology lies in its ability to link them as closely as possible. We thus believe that the current MQCA research scenario requires circuit design to be intertwined with technology when the aim is to demonstrate the feasibility of the MQCA computation paradigm.

In this chapter we show our approach to intertwining computation and implementation. The idea is to assess a practicable, rather than theoretical, technological implementation for MQCA; to constrain the circuit design to such a base; to solve at the architectural level, when possible, the critical limitations due to it; to describe the circuit behavior so that real implementation details can be taken into account, circuit performance can be estimated and feedback to technologists can be suggested; and to step progressively to the physical implementation, enriching the circuit description with data from the field. Here we show the preliminary results of this approach. In section 2 the base theory of quantum dot cellular automata is explained, while in section 3 the magnetic implementation is described. In section 4 the fundamental technological hypothesis, the so-called "snake-clock" implementation, is explained. In section 5 an example of circuit description is given, followed (section 6) by a specific architectural solution, with the "low-level" details added to it. In section 7 the preliminary technological implementations are described. In section 8 the conclusions and the directions of our future research are summarized.

### 2. Quantum dot Cellular Automata (QCA)

Quantum dot Cellular Automata (QCA) are a recent (5),(6),(7),(8) application to microelectronics of the original cellular automata principle (9). The logic values 0 and 1 are represented using bistable charge configurations of many identical square cells (10). The base cell, shown in figure 1.A, consists of four quantum dots, one at each corner. Each dot can be occupied by an electron; however, if only two electrons are available, since they repel each other, at equilibrium only the two dots on a diagonal will be filled. A square cell has only two diagonals, therefore only two states are possible, which represent the two logic values 0 and 1.



Fig. 1. Quantum dot Cellular Automata (QCA) cells. A) Four dots cells. B) Six dots cells.

Circuits can be built by placing many cells in cascade (11) and using the electrostatic interaction among them to drive the information through the circuit. An example of a simple circuit, a wire, is shown in figure 2. Starting from the initial state shown in figure 2.A, if the first cell is forced from 0 to 1 using an external electric field (figure 2.B), the second cell will switch due to the electrostatic interaction between adjacent electrons (figure 2.C). This process continues until the end of the wire (figure 2.D).



Fig. 2. Quantum dot Cellular Automata (QCA) wire. A) Starting condition, each cell is 0. B) First cell is forced to 1. C) Second cell switches to 1, due to the electrostatic interaction. D) Third switches to 1.

Following this principle many types of circuits can be built. In particular, four blocks perform the basic logic operations (figure 3): the wire (figure 3.A), the inverter (figure 3.B), the majority gate (figure 3.C) and the crosswire (figure 3.D). The inverter and the majority gate are the two blocks that enable logic computation. While the inverter performs a simple signal inversion, the logic equation of the majority gate is uncommon: the value of the output is equal to the value of the majority of the inputs. The last fundamental block, the crosswire, represents a special feature of this technology: it allows two different signals to cross on the same plane without interference.



Fig. 3. Quantum dot Cellular Automata (QCA) basic blocks. A) Wire. B) Inverter. C) Majority gate. D) Crosswire.
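The majority gate's logic equation can be written explicitly. The following is the standard three-input majority function, with input names $A$, $B$, $C$ chosen here only for illustration:

$$
M(A,B,C) = AB + BC + CA
$$

Fixing one input to a constant gives the usual two-input gates, $M(A,B,0) = AB$ (AND) and $M(A,B,1) = A + B$ (OR), which is why the majority gate together with the inverter is sufficient for general logic computation.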

This theoretical principle has two limitations. First, the energy barrier between the two states is high, while the electrostatic interaction between neighbour cells is normally too weak to force a neighbour cell to switch. Second, circuits composed of too many cascaded cells are subject to error propagation due to electromagnetic and thermal noise. To solve these problems an external control signal, the so-called clock (12), is introduced. A modified cell with six dots (shown in figure 1.B) is needed in this case. When an external electric field is applied, the cell is forced to an unstable state called NULL, lowering the potential barrier between the two stable states. When the field is removed the cell switches to 0 or 1, depending on the neighbour cells. Since only circuits composed of a limited number of cells can work without error propagation, a spatial flow control system is also introduced. The circuit is divided into small areas, called clock zones, each composed of a limited number of cells. Every zone is separately controlled using a time-varying clock signal. In the classical definition circuits are divided into four clock zones; the circuit partition and the clock signal waveforms are shown in figure 4.



Fig. 4. Clock mechanism. A) Clock zones. B) Clock signals.

When the clock signal is high (V = VH) the potential barrier between the two logic states is raised and therefore the cells cannot switch: they are in the HOLD state. When the clock signal decreases from VH to VL the potential barrier slowly decreases and cells start to move from a stable state to an unstable one: they are in the RELEASE phase. When the clock signal is low (V = VL) the potential barrier is zero and the two logic states are not separated: cells are in the RELAX state. Finally, when the clock signal rises from VL to VH the potential barrier slowly increases, forcing cells into a stable state: cells are therefore in the SWITCH state. As is clear from figure 4.B, the clock signal is always identical, but it is applied with a different phase to each clock zone. This allows the spatial propagation of the signal through the circuit, as shown in figure 4.A. During the first time step clock zone number 2 is in the switch phase: its cells are in an unstable state and are ready to switch to one of the stable states. Cells on its left are in the hold state and act like an input, while cells on its right are in an unstable state, so they have no influence, allowing the correct switching of the cells in clock zone 2. During the second time step the situation is the same, but in this case clock zone number 3 is in the switch phase.

Signals in this way propagate correctly through the circuit; however, the signal flow is unidirectional. In order to propagate signals in every direction a complex clock zone layout is required (13), for example the one shown in figure 5. Such layouts require the clock signal to be perfectly confined to its clock zone, so that it does not interfere with the neighbouring zones.



Fig. 5. Complex QCA clock zones layout.

The theoretical principle of QCA can be implemented in different ways. Four proposals for a real QCA implementation exist in the literature; they are briefly described in the following.

- **Metal QCA** (10)(12). The base cell consists of six metal lines, which act like quantum dots, on a silicon oxide substrate. The metal lines are separated by tunnel junctions, which allow electrons to move between neighbouring dots. The charge configuration of the cell is read using a single electron transistor (SET). The cell works properly, but unfortunately only at temperatures near absolute zero.
- **Semiconductor QCA** (14)(15). Complex heterostructures of Si-Ge or GaAs are used to create quantum dots that are able to trap electrons. The operating temperature is higher than for metal QCA but is still too low for practical use. To increase the operating temperature, cell dimensions must be reduced to a few nanometers, which is impossible with current technology. Moreover, one condition necessary for the proper operation of QCA circuits is that all cells must be identical; if such a complex structure is realized at the desired resolution, the defect rate of the fabrication process will make QCA inoperable, ruling out any practical use of this implementation.
- **Molecular QCA** (16)(17)(18)(19). In this case complex molecules with several oxidation-reduction centres, which act like quantum dots, are used as the base cell. Electrons can move among the centres inside the molecule, changing the spatial distribution of the electric charge and the logic value associated with it. The use of molecules brings many advantages: every QCA cell is identical to the others, and molecular circuits can work at room temperature. The most interesting aspect, however, is the expected switching speed from one charge configuration to another, which makes operating frequencies of some THz possible; moreover, the dimensions of such molecules are very small (a few nanometers), allowing circuits with very high device density. Molecular QCA are very attractive, but their realization requires the ability to manipulate single molecules, which is not possible with current technology.
- **Magnetic QCA** (20). The base cell is a single-domain nanomagnet, with only two possible magnetizations, which represent the two logic values 0 and 1. This is the second promising implementation of the QCA principle, because in this case too circuits can work at room temperature. Unfortunately the expected speed is lower not only than in the molecular case, but also than that of CMOS circuits. However, magnetic QCA have some specific advantages that make them attractive, in particular low power consumption and the possibility of realizing them with current technology: this makes it possible to experiment with and study the QCA principle, so that most of the achievements can be adapted in the near future to molecular QCA, as soon as that solution becomes feasible.

### 3. Magnetic QCA

The idea of building a logic completely based on magnetic elements is not new, as it dates back sixty years. At that time, however, it was not possible to realize such circuits because of technology limitations; today, on the contrary, the QCA principle seems the perfect way to implement a fully magnetic logic. In the Magnetic Quantum dot Cellular Automata (MQCA) implementation the basic cell is a nanomagnet with a size between 50nm and 100nm. Magnetic materials are composed of magnetic domains, small areas with uniform magnetization, and their behaviour is governed by the hysteresis cycle, which represents how the material magnetization (M) changes when an external magnetic field (H) is applied, see figure 6.A. If the dimensions of the magnetic material are less than approximately 100nm, there is only one magnetic domain and the hysteresis cycle changes as shown in figure 6.B. In this case only two magnetizations are possible at equilibrium, and they represent the two logic values 0 and 1. At the same time the dimensions must be larger than approximately 50nm to avoid the superparamagnetic effect, which makes the magnetization vary with thermal fluctuations. Another important aspect is that nanomagnets must have one side longer than the others, with an aspect ratio of almost 2, so that the magnetization is forced along the long side, the so-called easy axis. This is due to shape anisotropy: when a magnetic material is magnetized a demagnetization field is generated; this field reaches its minimum along the longer axis of the material, therefore the magnetization, at equilibrium, tends to stay parallel to the easy axis.

Fig. 6. A) Multidomain magnetic material hysteresis cycle. B) Singledomain magnetic material hysteresis cycle. C) Magnetic Quantum dot Cellular Automata (MQCA) cells.

Magnetic QCA can reach a speed of only about 1GHz; however, they have some significant advantages:

- they are one of the only two implementations of the QCA principle that work at room temperature;
- they can be realized with current technology, using high-end electron beam lithography;
- they have a very low power absorption, requiring an energy of 15-30 $K_BT$ for each nanomagnet to switch, which makes very low power electronic devices possible;
- they have an intrinsic memory ability: due to their magnetic nature they retain the stored information even without a power supply, which makes it possible to define circuits with mixed computational-storage abilities;
- most of the high-level research related to MQCA can be transposed to molecular QCA once the technology is ready.

### 4. A feasible three-phase clock

It has been demonstrated (see (21)) that for MQCA, as well as for molecular QCA, adiabatic switching is preferred to assure correct information propagation. This means that the switching of a nanomagnet from state "up" to state "down" is favoured if an intermediate state is reached first. That is, similarly to what is described in section 2, an external field is applied so that the pill "memory" (its previous "up" or "down" magnetization state) is erased (the magnetization becomes perpendicular to the "up"/"down" direction); at this point, as soon as the external field is released, an input can force the new "up" or "down" magnetization onto the pill more easily and with lower power loss. This is particularly important when the input of nanomagnet B is another nanomagnet A, which can exert only a limited magnetic field on the coupled nanomagnet B because of its intrinsic characteristics (shape and material).

Such an external field is meant as a clock, as it is iteratively switched on and off and allows the evaluation phase, even though it does not have the "traditional" function of a clock signal. One of the related aspects is the clock organization in complex computational structures. In this chapter we thus propose, starting from (22), a solution to clock distribution, the "snake-clock", initially presented in our works (23),(24), which we judge more feasible for the multiple-phase clock distribution necessary to allow information to propagate without losses in the complex nanomagnet arrays previously described. First, there are three phases, differently from the four introduced for the molecular case: in figure 7.a the RESET, SWITCH and HOLD sequence is shown both in time and space.

Fig. 7. Proposed clock organisation: a) Three-phase sequence, b) MQCA behaviour in three zones, c) Snake clock logic zones view, d) Snake clock layout top view, e) Snake clock physical distribution.

In figure 7.b the behaviour of nanomagnets grouped in the corresponding clock zones is depicted (each clock phase has to serve a group of pills because of the unavoidable size difference between the pills and the metal line generating the phase). When a cell group is in the HOLD phase, its stable "up" and "down" states hold the digital information; they can thus force it onto the neighbouring group, which is in the SWITCH state, that is, its previous state has been "erased" by a reset and its pills are now ready to be influenced. The group in the following region is itself in the RESET state. In figure 7.c the top view of the clock zones is sketched together with the information flow: this style still allows information to flow in both the horizontal and vertical directions but, as the correct phase sequence (1, 2, 3 in the figure) must be assured, only a "snake"-like propagation is possible. Even if this seems a limitation, it is feasible with current technology, differently from previously proposed solutions. This is also evident in the layout and physical views of figures 7.d and 7.e, respectively. The nanomagnet arrays can be sandwiched between two thin oxide layers, and metal wires carrying the clock signal can be routed on the top and bottom of this structure. One stripe (phase 1) can be straight, while the others (phases 2 and 3) should be routed in a zig-zag style, interlaced as if they were twisted but belonging to two different metal layers. Active nanomagnet pills cannot be present in zones where the metal wires are oblique. As proposed in (22), clock lines could be based on copper wires whose height must be accurately tailored, as demonstrated in (25), in order to trade off between the magnetic field required to assure reset and the power consumption due to the current flowing.
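In an HDL simulation, the time sequence of the three phases can be represented by three staggered signals. The following is a minimal, simulation-only sketch: the entity name `snake_clock_gen`, the generic `phase_time` and its 3 ns default are assumptions made for illustration, and the mapping of a logic level to the RESET, SWITCH or HOLD field is deliberately not modelled, only the staggered timing.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper for simulation only, not part of the chapter's code.
entity snake_clock_gen is
  generic (phase_time : time := 3 ns);  -- assumed duration of one phase
  port (clk1, clk2, clk3 : out std_logic);
end snake_clock_gen;

architecture behavioural of snake_clock_gen is
begin
  -- Exactly one phase signal is high at any time; the three signals
  -- repeat with a period of 3 * phase_time.
  gen : process
  begin
    clk1 <= '1'; clk2 <= '0'; clk3 <= '0';
    wait for phase_time;
    clk1 <= '0'; clk2 <= '1'; clk3 <= '0';
    wait for phase_time;
    clk1 <= '0'; clk2 <= '0'; clk3 <= '1';
    wait for phase_time;
  end process gen;
end behavioural;
```

Because the process contains `wait for` statements it is meant for test benches, not for synthesis.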

### 5. Snake-clock NCL-HDL MQCA model description

The proposed structure influences the circuit model we describe. We use VHDL, as proposed in (2) and (26) for general QCA, to model the circuit behaviour, but we tailor the description specifically to the magnetic implementation and to the "snake-clock" information propagation. An example of a logic gate, described in the following, is shown in figure 8, where registers are associated with phase transitions (each register is indeed placed in a different phase zone), and combinational gates (in this case a Majority Voter, the basic QCA component) with the "computational" part of a clock zone (in this example, because of the limited available space, phase three has only one computational block). Wires and blocks are composed of arrays and structures of nanomagnets, as in the "arrow" example in the detail of figure 7.c.



Fig. 8. Th22 behavioural model: a) Schematic, b) Th22 and majority voter logic equations, c) Registers clock signals.

One of the most critical problems in QCA is the "layout=timing" one, that is, the overall timing and correct behaviour of a QCA circuit depend on its layout. As an example, the inputs to a generic Majority Voter (MV) should change synchronously to assure a correct computation. In simple circuits this can easily be assured by equalizing the number of magnets carrying the three signals to the MV, so that no skew is sensed. In complex circuits, however, it could be impossible to meet this requirement.

In (27),(28),(29),(30) NCL is proposed as a possible solution. Null Convention Logic<sup>*TM*</sup> (NCL, (31)) is an asynchronous logic in which delay insensitivity is achieved by encoding every signal with two bits. In this way a signal can assume two different states: a DATA state, where DATA can be either '01' (which stands for the '0' logic value) or '10' (which stands for the '1' logic value), and a NULL state, encoded as '00'. The '11' state is forbidden. Delay insensitivity is assured by the fact that a gate changes its status from NULL to DATA only when all its input signals have switched from NULL to DATA. In this way a circuit works properly even if there is a large difference in the propagation delay of signals.
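The two-bit encoding can be captured in a small VHDL package. This is only a sketch: the package name `ncl_dual_rail`, the subtype, the constants and the helper functions are hypothetical names introduced here and do not appear in the original model.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package: all names are illustrative.
package ncl_dual_rail is
  -- Encoding from the text: "01" = logic 0, "10" = logic 1, "00" = NULL.
  -- "11" is forbidden and is deliberately given no constant.
  subtype dual_rail is std_logic_vector(1 downto 0);

  constant NCL_NULL  : dual_rail := "00";
  constant NCL_DATA0 : dual_rail := "01";
  constant NCL_DATA1 : dual_rail := "10";

  function to_dual_rail (b : std_logic) return dual_rail;
  function is_data (s : dual_rail) return boolean;
  function is_null (s : dual_rail) return boolean;
end package ncl_dual_rail;

package body ncl_dual_rail is
  -- Encodes a single-rail bit; any value other than '1' is treated as logic 0.
  function to_dual_rail (b : std_logic) return dual_rail is
  begin
    if b = '1' then
      return NCL_DATA1;
    else
      return NCL_DATA0;
    end if;
  end function to_dual_rail;

  -- True only for the two valid DATA codes.
  function is_data (s : dual_rail) return boolean is
  begin
    return s = NCL_DATA0 or s = NCL_DATA1;
  end function is_data;

  -- True only for the NULL code.
  function is_null (s : dual_rail) return boolean is
  begin
    return s = NCL_NULL;
  end function is_null;
end package body ncl_dual_rail;
```

A test bench could use `is_data` and `is_null` to check that every signal alternates between NULL and DATA and never takes the forbidden '11' value.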

Here we adapt NCL logic to the magnetic and snake-clock case with our HDL representation (24) starting from the proposal in (32). Therefore we have implemented all the NCL logic gates.

Figure 8 shows the simplest NCL gate, the Th22. The pipelined behaviour of QCA circuits is modeled using one register for each clock phase, driven by the signals shown in figure 8.c, while the majority voter is an ideal combinational gate with no delay. The logic functions of both the Th22 and the majority voter (MV) are detailed in figure 8.b.

The VHDL code for the MV model is straightforward: a simple boolean function is used, with a concurrent signal assignment acting as an implicit process.
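A minimal sketch of such a description is given below. It assumes the port names of the `mv` component declared in the TH22 listing (`a`, `b`, `c`, `y`); the architecture body is an illustration, not the original code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Ideal majority voter: purely combinational, no delay.
entity mv is
  port (a, b, c : in  std_logic;
        y       : out std_logic);
end mv;

architecture behavioural of mv is
begin
  -- The output follows the value held by at least two of the three inputs.
  y <= (a and b) or (b and c) or (a and c);
end behavioural;
```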

Registers are implemented in a generic way (entity *reg*), so it is possible to use the same entity in every clock zone.

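```
library ieee;
use ieee.std_logic_1164.all;

-- Generic register used to model the boundary of one clock zone.
entity reg is
  generic (n_bit_reg : integer := 32);
  port (d_in  : in  std_logic_vector (n_bit_reg - 1 downto 0);
        d_out : out std_logic_vector (n_bit_reg - 1 downto 0);
        -- clock port named clk to match the component declaration in th22
        reset, clk : in std_logic);
end reg;

architecture behavioural of reg is
begin
  p : process (clk, reset)
  begin
    if reset = '1' then
      d_out <= (others => '0');        -- asynchronous reset
    elsif (clk'event and clk = '1') then
      d_out <= d_in;                   -- sample on the rising edge of the phase clock
    end if;
  end process;
end behavioural;
```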

These components are then used to model the TH22 (entity *th22*) in the following code: each register belongs to a specific clock zone, thus its clock input pin is connected to the proper external clock phase. Port maps are labelled according to the register names in the figure, while signals are not detailed in the figure for the sake of clarity.

```
library ieee;
use ieee.std_logic_1164.all;

entity th22 is
  port (a, b : in std_logic;
        y : out std_logic;
        clk1, clk2, clk3 : in std_logic);
end th22;

architecture behavioural of th22 is
  component mv
    port (a, b, c : in std_logic;
          y : out std_logic);
  end component;
  component reg
    generic (n_bit_reg : integer := 32);
    port (d_in : in std_logic_vector (n_bit_reg - 1 downto 0);
          d_out : out std_logic_vector (n_bit_reg - 1 downto 0);
          reset, clk : in std_logic);
  end component;
  signal z2_m_out, z2_out, z3_out, z4_out : std_logic;
  signal z2f_out, z3f_out : std_logic;
  signal z1_out : std_logic_vector (2 downto 0);
begin
  -- Zone 1 register. The d_in association is reconstructed from the
  -- description of signal R: R(0) = A, R(1) = B, R(2) = fed-back output.
  Z1_R : reg
    generic map (n_bit_reg => 3)
    port map (d_in(0) => a, d_in(1) => b, d_in(2) => z2f_out,
              d_out => z1_out, reset => '0', clk => clk1);
  Z2_M : mv
    port map (a => z1_out(0), b => z1_out(1), c => z1_out(2), y => z2_m_out);
  -- Forward path: MV output delayed by one, two and three phases.
  -- One-bit registers connect scalar signals to element 0 of the vector ports.
  Z2_R : reg
    generic map (n_bit_reg => 1)
    port map (d_in(0) => z2_m_out, d_out(0) => z2_out, reset => '0', clk => clk2);
  Z3_R1 : reg
    generic map (n_bit_reg => 1)
    port map (d_in(0) => z2_out, d_out(0) => z3_out, reset => '0', clk => clk3);
  Z4_R1 : reg
    generic map (n_bit_reg => 1)
    port map (d_in(0) => z3_out, d_out(0) => z4_out, reset => '0', clk => clk1);
  -- Feedback path from the output back to the MV input.
  Z2F_R1 : reg
    generic map (n_bit_reg => 1)
    port map (d_in(0) => z4_out, d_out(0) => z3f_out, reset => '0', clk => clk2);
  Z3F_R1 : reg
    generic map (n_bit_reg => 1)
    port map (d_in(0) => z3f_out, d_out(0) => z2f_out, reset => '0', clk => clk3);
  y <= z4_out;
end behavioural;
```

Simulation results, obtained using Modelsim (33), are shown in figure 9. The output of an MV is '1' when at least two inputs are '1', and similarly the output is '0' when at least two inputs are '0'. The input of the MV is signal R (3 bits), which represents signals A, B and F" delayed by one clock phase. When two of them (signals R(0) and R(1), corresponding to inputs A and B) switch from '0' to '1', the output of the majority voter switches to '1'. Signals R', R" and the output F correspond to the MV output delayed by one, two and three phases, respectively. It is thus possible to see that output F changes to '1' only when both inputs A and B go to '1', with a delay of one three-phase clock period, which corresponds to the time necessary to pass through three clock phases. When the output F is '1' this signal propagates back to the MV input (signals F', F" and R(2)), so when only one of the inputs A and B changes from '1' to '0', the output F still remains '1'. Only when both A and B switch to '0' does the output F change to '0', again with a delay of one clock period. This simulation therefore confirms the logic equation of the TH22 shown in figure 8.

Fig. 9. TH22 simulation.
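The behaviour described for the TH22 can be checked with a self-checking test bench. The sketch below is self-contained: instead of instantiating `th22`, it mirrors its register chain with plain clocked processes. The entity name `th22_tb`, the signal names, the 3 ns phase duration and the stimulus times are all assumptions chosen for illustration and are not taken from the original simulation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity th22_tb is
end th22_tb;

architecture sim of th22_tb is
  constant phase_time : time := 3 ns;              -- assumed phase duration
  signal done : boolean := false;
  signal a, b : std_logic := '0';
  signal clk1, clk2, clk3 : std_logic := '0';
  signal r : std_logic_vector(2 downto 0) := "000"; -- R: (F", B, A)
  signal mv_out, r1, r2, f, f1, f2 : std_logic := '0';
begin
  -- Three staggered phase clocks; rising edges of clk1 at 0, 9, 18, ... ns.
  clocks : process
  begin
    while not done loop
      clk1 <= '1'; clk3 <= '0';
      wait for phase_time;
      clk1 <= '0'; clk2 <= '1';
      wait for phase_time;
      clk2 <= '0'; clk3 <= '1';
      wait for phase_time;
    end loop;
    wait;
  end process clocks;

  -- Ideal majority voter on the zone-1 register outputs.
  mv_out <= (r(0) and r(1)) or (r(1) and r(2)) or (r(0) and r(2));

  zone1 : process (clk1)
  begin
    if rising_edge(clk1) then
      r <= f2 & b & a;   -- inputs and fed-back output
      f <= r2;           -- output F: MV output delayed by three phases
    end if;
  end process zone1;

  zone2 : process (clk2)
  begin
    if rising_edge(clk2) then
      r1 <= mv_out;      -- R'
      f1 <= f;           -- F'
    end if;
  end process zone2;

  zone3 : process (clk3)
  begin
    if rising_edge(clk3) then
      r2 <= r1;          -- R"
      f2 <= f1;          -- F"
    end if;
  end process zone3;

  stimulus : process
  begin
    wait for 10 ns;
    a <= '1'; b <= '1';                       -- both inputs rise
    wait for 16 ns;                           -- t = 26 ns
    assert f = '0' report "F rose too early" severity error;
    wait for 2 ns;                            -- t = 28 ns
    assert f = '1' report "F should be '1' after both inputs rise" severity error;
    wait for 12 ns;                           -- t = 40 ns
    a <= '0';                                 -- only one input falls
    wait for 20 ns;                           -- t = 60 ns
    assert f = '1' report "F should hold while B is still '1'" severity error;
    b <= '0';                                 -- now both inputs are '0'
    wait for 11 ns;                           -- t = 71 ns
    assert f = '1' report "F fell too early" severity error;
    wait for 2 ns;                            -- t = 73 ns
    assert f = '0' report "F should be '0' after both inputs fall" severity error;
    done <= true;                             -- stop the clocks
    wait;
  end process stimulus;
end sim;
```

In this sketch the inputs are held long enough for the feedback loop to settle before one of them changes; shorter input pulses would require a separate analysis of the pipeline.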

### 6. NCL-HDL magnetic QCA circuit example

To demonstrate the use of this model and this logic, we implemented a more complex fully magnetic QCA circuit. One of the simplest and most meaningful circuits is the full adder, which is the base computational block of every digital machine. Figure 10 shows the NCL full adder circuit. The circuit is split into two parts, which calculate the two coded bits of the output. The behaviour of an NCL gate can be understood by looking at its symbol: a gate changes from 0 to 1 only when a number of inputs equal to the number written inside the gate symbol switches from 0 to 1. The small number written before some of the inputs means that this input has double the weight of the other inputs.

Fig. 10. Full adder NCL implementation. The circuit is split into two parts following the two-bit coding of the NCL logic.
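The threshold behaviour of the NCL gates can be summarised in one expression. The notation is introduced here for illustration: a gate has $n$ inputs $x_i$ with weights $w_i$ (equal to 1 unless a number is written before the input), threshold $m$, current output $y$ and next output $y^{+}$. The reset and hold rows follow the standard NCL convention, which matches the TH22 behaviour described in section 5.

$$
y^{+} =
\begin{cases}
1 & \text{if } \sum_{i=1}^{n} w_i x_i \ge m \\
0 & \text{if } x_1 = x_2 = \dots = x_n = 0 \\
y & \text{otherwise}
\end{cases}
$$

For the TH22, $m = 2$ and $w = (1, 1)$, which is equivalent to $y^{+} = ab + y(a + b)$, i.e. a majority voter with its output fed back. For the TH23 the set condition is $a + b + c \ge 2$, and for the TH34w2, with the double weight on the first input, it is $2a + b + c + d \ge 3$.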

The VHDL code of the full adder, reported below, illustrates the globally asynchronous locally synchronous (GALS) behaviour of the circuit. The NCL gates are driven by a three-phase synchronous clock, while the connections among the gates are asynchronous.

```
library ieee;
use ieee.std_logic_1164.all;

entity FA is
  port (
    A0, A1, B0, B1   : in  std_logic;
    cin0, cin1       : in  std_logic;
    sum0, sum1       : out std_logic;
    cout0, cout1     : out std_logic;
    clk1, clk2, clk3 : in  std_logic);
end FA;

architecture structural of FA is
  component th23
    port (a, b, c          : in  std_logic;
          y                : out std_logic;
          clk1, clk2, clk3 : in  std_logic);
  end component;
  component th34w2
    -- port list reconstructed from the port maps below
    port (a, b, c, d       : in  std_logic;
          y                : out std_logic;
          clk1, clk2, clk3 : in  std_logic);
  end component;
  signal th1_out, th2_out : std_logic;
begin
  -- Carry rails: one TH23 gate per rail
  Th23_1 : th23 port map (
    a => A0, b => B0, c => cin0, y => th1_out,
    clk1 => clk1, clk2 => clk2, clk3 => clk3);
  Th23_2 : th23 port map (
    a => A1, b => B1, c => cin1, y => th2_out,
    clk1 => clk1, clk2 => clk2, clk3 => clk3);
  -- Sum rails: each TH34w2 gate takes the opposite carry rail
  -- plus the three inputs of its own rail
  Th34w2_1 : th34w2 port map (
    a => th2_out, b => A0, c => B0, d => cin0, y => sum0,
    clk1 => clk1, clk2 => clk2, clk3 => clk3);
  Th34w2_2 : th34w2 port map (
    a => th1_out, b => A1, c => B1, d => cin1, y => sum1,
    clk1 => clk1, clk2 => clk2, clk3 => clk3);
  cout0 <= th1_out;
  cout1 <= th2_out;
end structural;
```

We employed the full adder to implement a 4-bit ALU, which is a fundamental block of digital circuits. The circuit architecture is composed of two main parts (figure 11): a logic block, which performs the logic OR and the logic AND, and an arithmetic unit, which is able to add and subtract fixed-point numbers. The two parts are connected using two multiplexers for the operation selection. The internal structure of the logic block is shown in the upper detail of figure 11, where two NCL gates for every bit implement the desired function (the symbol X indicates the bit number, from 0 to 3). The arithmetic block is a ripple carry adder, where 4 full adders are connected serially. The upper-right detail of figure 11 shows the internal structure of the multiplexers; the same structure is repeated for each bit. The subtraction is performed as an addition between input A and the inverse of input B, while the selection bit is sent to the first multiplexer and to the carry-in input of the ripple carry adder. Inversion in NCL logic does not require any gate, because the two bits which represent the encoded signal can simply be swapped (in NCL the inverse of 01 is 10). The two logic gates Th12 and Th22 assure that the overflow signal is always zero when the logic block is selected.
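The subtraction scheme described above, inverting B and driving the carry-in with the selection bit, is consistent with two's complement arithmetic. As a short worked check on 4 bits, using $\overline{B} = 2^4 - 1 - B$ for the bitwise inverse:

$$
A + \overline{B} + 1 = A + (2^4 - 1 - B) + 1 = A - B + 2^4 \equiv A - B \pmod{2^4}
$$

For example, with $A = 0110_2 = 6$ and $B = 0011_2 = 3$, the inverse is $\overline{B} = 1100_2 = 12$ and $6 + 12 + 1 = 19 = 16 + 3$, so the 4-bit result is $0011_2 = 3$. In the dual-rail encoding the inversion of each bit is the swap $01 \leftrightarrow 10$, while the NULL code $00$ is left unchanged by the swap.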



Fig. 11. ALU complete schematic. The two details show the schematic of the logic block and of the multiplexer, respectively. The structure is repeated identically for every bit; X indicates the bit number.

The simulation waveforms in figure 12 show all the possible ALU operations. Note that the entire structure is organized in phases corresponding to the snake clock, and the basic cells are reorganized and based on the MV block. In more detail, the information is not a plain sequence of data but, as visible in the bottom line of the simulation, an alternating sequence of data (D) and NULL states (N). The N-D-N sequence assures that, before a new data item is evaluated, both inputs must go to 0, independently of their delay. The NULL state works as a timing reference for the circuit. This solves the layout=timing problem. It is worth noticing that the detailed signal propagation is assured by the sequence of the three clock phases shown in figures 7 and 8. For this reason our architecture is locally synchronous and globally asynchronous.

From figure 12 the correctness of the circuit behaviour can be established. The timing performance is not representative, because our aim was only to demonstrate the feasibility of the MQCA structure and not to evaluate its maximum speed. A fully magnetic, snake-clock NCL structure has thus been demonstrated here; it shows promising potential for further architectural developments, as it solves a critical limitation of MQCA and is also based on a practicable clock structure.



Fig. 12. ALU simulation results. Numbers indicate the operation performed and the NULL cycle (all waves at zero) represents the time reference of the circuit.

The description has also been enriched with other physically related information, such as *power* dissipation, based on the values in (34),(35),(36) and using a VHDL-AMS description, and *layout*, since a dependency on the number of magnets necessary to carry a signal from one gate to another is inserted in the model (detail in figure 13). Figure 13 shows an estimate of how the power dissipation changes as the complexity of the circuit increases: the obtained values are very promising for low power architectures.



Fig. 13. Power dissipation estimation. The energy dissipated by each nanomagnet during switching is assumed to be $30K_BT$, while the clock frequency used is 110 MHz.
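The order of magnitude implied by these two parameters can be worked out by hand. The following is a rough sketch under assumptions not stated in the original: a temperature of $T = 300$ K, every nanomagnet switching once per clock period, and a magnet count $N$ chosen only for illustration. It counts only the switching energy of the nanomagnets and does not include the power needed to generate the clock field.

$$
P \approx N \, E_{sw} \, f, \qquad E_{sw} = 30\,K_BT
$$

With $K_BT \approx 1.38 \times 10^{-23}\,\mathrm{J/K} \times 300\,\mathrm{K} \approx 4.14 \times 10^{-21}$ J, the switching energy is $E_{sw} \approx 1.24 \times 10^{-19}$ J. At $f = 110$ MHz this gives

$$
E_{sw} \, f \approx 1.24 \times 10^{-19}\,\mathrm{J} \times 1.1 \times 10^{8}\,\mathrm{Hz} \approx 1.37 \times 10^{-11}\,\mathrm{W}
$$

per nanomagnet, i.e. about 13.7 pW, so a hypothetical circuit of $N = 1000$ nanomagnets would dissipate roughly 13.7 nW under these assumptions.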

We plan to further improve these descriptions related to the physical implementation, as we believe that not only does the architecture become more physically meaningful, so that variations to it can be decided on a technology basis, but feedback to technology solutions can also be generated, so that they can evolve towards practically usable solutions.

This is the reason why we are setting up our own experiments, which are still at a preliminary phase and are driven by results partially presented in the literature (e.g. (4)), with the aim of jointly proving the feasibility of architectures and specifying meaningful objectives for physical experiments.

### 7. Preliminary experiments

Investigating such an advanced nanotechnology requires a strong link between the high level design phase and the low level technological realization. Results in this direction have been obtained by previous works and reported in (4), (37), (38) and (39). We believe this to be a very important step, and thus we are setting up our own experiments, whose preliminary results are in (24) and are based on (40), to verify our proposed clock solution and to investigate nanomagnetic circuits in depth. The process for the creation of the nanomagnet structure is represented in figure 14. First a copper wire is created by sputtering, and an insulating layer of $SiO_2$ is deposited. Then, after the deposition of a PMMA (polymethylmethacrylate) layer, trenches are opened using electron beam lithography (EBL), figure 14.a. At this point the ferromagnetic material, mainly cobalt, is deposited using RF magnetron sputtering, figure 14.b. Finally, a lift-off process removes the PMMA together with the excess ferromagnetic material, figure 14.c, leaving only the desired nanomagnets. The magnetic state of the nanomagnets is read using a magnetic force microscope (MFM).



Fig. 14. Magnetic QCA realization process. a) PMMA deposition and EBL trench creation. b) Deposition of magnetic material. c) Lift-off.

### 8. Conclusions and future work

The work described in this chapter represents the first attempt to integrate high level description and technological level information, in order to obtain a practically feasible solution for Quantum-dot Cellular Automata circuits. However, a lot of work is still required. In particular, our research will focus on three different levels: a logic/synthesis level, a simulation level and a technological level, to be carried out simultaneously and taking into account feedback from the other levels. We believe this is a mandatory step because, as we have seen from the work presented in this chapter, considering one single aspect of the design in isolation can, in this technology, lead to wrong or physically infeasible results.

From the logical point of view, a first problem that must be addressed is the development of a synthesis methodology. The problem is twofold. First, the basic gate is the majority voter, therefore logic circuits should be optimized for this gate (41). Second, the layout=timing problem requires a dedicated approach: synthesizers must ensure that the input wires of each majority voter have identical length, measured as the number of clock zones crossed by the wires. A proposed synthesizer structure is shown in figure 15. Starting from a VHDL description, which can be either structural or behavioural, circuits are synthesized into a universal two-level logic; in this way the tool can also be adapted to different nanotechnologies. From this point the logic net is mapped onto an optimized logic set, which in the case of QCA circuits is composed of the majority voter and the inverter. An intermediate step is possible at this point: if a delay insensitive logic, like NCL, or another type of logic, like a mixed boolean-NCL logic, is selected, the circuit net can first be mapped onto this logic, and the resulting logic gates are then implemented using majority voters. When the majority voter netlist is obtained, the circuit layout can be generated. If the two-level logic was mapped directly onto majority voters, without the use of a delay insensitive logic, then during layout generation the propagation delays of the inputs of every majority voter, expressed in clock cycles, must be equalized to avoid the layout=timing problem. A few works on these points have been proposed (42), (43), (44), but none of them addresses at the same time layout=timing, NCL/boolean coexistence, and high and low level synthesis on majority voters. Thus the literature still shows the need for a comprehensive synthesis and layout methodology.
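The two synthesis constraints discussed above can be made concrete with a minimal sketch. It only illustrates the ideas and is not part of the proposed synthesizer: the function names are hypothetical, and the delay model simply counts the clock zones crossed by each input wire, as described in the text.

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority voter: 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)


def and_gate(a: int, b: int) -> int:
    """AND obtained by fixing one majority voter input to 0."""
    return majority(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """OR obtained by fixing one majority voter input to 1."""
    return majority(a, b, 1)


def equalize_inputs(arrival_zones: list[int]) -> list[int]:
    """Extra clock zones to add on each input wire of one majority voter.

    arrival_zones[i] is the number of clock zones crossed by input i.
    Every wire is padded up to the longest one, so that all inputs
    reach the gate in the same clock cycle.
    """
    latest = max(arrival_zones)
    return [latest - zones for zones in arrival_zones]


# Inputs crossing 3, 5 and 4 clock zones need 2, 0 and 1 extra zones.
padding = equalize_inputs([3, 5, 4])
```

The padding step is only needed when the two-level logic is mapped directly onto majority voters; with a delay insensitive logic such as NCL the NULL state provides the timing reference instead.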

Another problem that must be addressed at the logic level is the clock zone layout, which should be designed to guarantee the feasibility of the clock generation structure, as we have shown in this chapter. However, other problems must be considered during clock zone layout generation, such as the clock zone sizes, the maximum number of nanomagnets in each zone, the crosstalk among neighbouring clock zones and the reduction of wasted area.

Fig. 15. Nanotechnologies circuit synthesizer.

From the simulation point of view a lot of work is necessary; in particular there is a need to better understand, using classical micromagnetic simulation tools, the basic interactions among nanomagnets. For example, the interaction between nanomagnets coupled along the long side could be different from the interaction between nanomagnets coupled along the short side. The basic logic gates also require more investigation: for example the crosswire, the block which allows two different wires to cross on the same plane without interference, is critical for the development of this technology (45). However, classical micromagnetic tools are not well suited for simulations of nanomagnets. Micromagnetic tools solve the dynamic equation of micromagnetism, the so-called LLG (Landau-Lifshitz-Gilbert) equation, applying the finite element method. The magnetic material is divided into small regions, the finite elements, and a system of equations is solved on every element. This is a good approximation of a multidomain material, because each finite element represents a magnetic domain. A nanomagnet is also divided into finite elements when it is analyzed, but in this case the approximation is a bad one, because nanomagnets are single domain magnets. This leads to results that are not physically feasible. To solve this problem it is necessary to modify micromagnetic simulation tools or to develop a simulator dedicated to magnetic QCA. Here too, many works have focused on micromagnetic simulations, contributing to the understanding of a few of the abovementioned problems (37). Still, many aspects remain unclear and have not been tackled from a real implementation point of view. Our research plan, started with Comsol (46) and Magsimus (47) simulations, includes this step too (25).
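For reference, the LLG equation mentioned above is commonly written in the Gilbert form shown below. The notation is not taken from this chapter and is assumed here: $\mathbf{M}$ is the magnetization, $\mathbf{H}_{\mathrm{eff}}$ the effective field, $\gamma$ the gyromagnetic ratio, $\alpha$ the damping constant and $M_s$ the saturation magnetization.

$$
\frac{d\mathbf{M}}{dt} = -\gamma\, \mathbf{M} \times \mathbf{H}_{\mathrm{eff}} + \frac{\alpha}{M_s}\, \mathbf{M} \times \frac{d\mathbf{M}}{dt}
$$

A micromagnetic tool solves this equation on every element of the discretized material. For a single domain nanomagnet the magnetization can instead be treated, as a simplifying assumption, as one uniform vector per magnet, which is one possible starting point for a dedicated simulator.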

Finally, as far as technology is concerned, the work should follow the guidelines obtained from the low level simulations. The interaction between nanomagnets coupled along the long side or the short side must be analyzed in depth, the timing performance must be evaluated, and the crosstalk between adjacent nanomagnets must be carefully inspected. Another point of analysis is the basic logic gates, in particular the crosswire, which is critical to assess the further development of this technology. The clock system is another part which requires deep investigation: its feasibility and its effective behaviour must be verified in the field (48). However, since magnetic QCA circuits are intended for low power circuit design, other clock solutions, for example driving nanomagnets with an electric field, must be studied, because the power consumption of the magnetic field generation system can cancel the advantage of the low power dissipation of nanomagnets.

### 9. References

- [1] H. Cho and E.E. Swartzlander, "Adder Designs and Analyses for Quantum-Dot Cellular Automata", IEEE Transactions on Nanotechnology, vol. 6, no. 3, May 2007.
- [2] J. Huang and F. Lombardi, "Design and Test of Digital Circuits by Quantum-Dot Cellular Automata", Artech House Publishers, 2007.
- [3] M. Xiaojun, J. Huang and F. Lombardi, "A model for computing and energy dissipation of molecular QCA devices and circuits", J. Emerg. Technol. Comput. Syst., Vol. 3, N. 4, pp. 1–30, 2008
- [4] A. Imre, G. Csaba, G.H. Bernstein, W. Porod and V. Metlushko, "Investigation of shape-dependent switching of coupled nanomagnets", Superlattices and Microstructures, 34, 513-518, 2003.
- [5] C. S. Lent, P. D. Tougaw, W. Porod and G. H. Bernstein "Quantum cellular automata", Nanotechnology, Vol. 4, 49, 1993
- [6] P. Tougaw, C.S. Lent, W. Porod, "Bistable Saturation In Coupled Quantum-Dot Cells". Journal Of Applied Physics 1993, 74, (5), 3558-3566.
- [7] P. Tougaw, C.S. Lent, "Dynamic behavior of quantum cellular automata", Journal Of Applied Physics 1996, 80, (8), 4722-4736.
- [8] G. Toth, C.S. Lent, "Role of correlation in the operation of quantum-dot cellular automata". Journal Of Applied Physics 2001, 89, (12), 7943-7953.
- [9] Joel L. Schiff, "Cellular Automata: A Discrete View of the World", Wiley & Sons, Inc.
- [10] R. K. Kummamuru, A. O. Orlov, R. Ramasubramaniam, C. S. Lent, G. H. Bernstein, and G. L. Snider "Operation of a Quantum-dot Cellular Automata (QCA) shift register and analysis of errors", IEEE Trans. On Electron Devices, Vol. 50, 1906, 2003
- [11] A. I. Csurgay, W. Porod, and C. S. Lent "Signal processing with near-neighborcoupled time-varying quantum-dot arrays", IEEE Trans. On Circuits and Systems, Vol. I47, 1212, 2000
- [12] A.O. Orlov, R. Kummamuru, R. Ramasubramaniam, C.S. Lent, G.H. Bernstein, G.L. Snider, "Clocked Quantum-dot Cellular Automata Devices: Experimental Study", Dept. of Electrical Engineering, University of Notre Dame, Notre Dame, Indiana, USA.
- [13] M.T. Niemier, M.J. Kontz, P.M. Kogge "A Design of and Design Tools for a Novel Quantum Dot Based Microprocessor" Presentation, University of Notre Dame, Notre Dame, Indiana, USA, 2006.
- [14] A. Khitun, K.L. Wang, "Multi-functional edge driven nano-scale cellular automata based on semiconductor tunneling nano-structure with a self-assembled quantum dot layer", Superlattices and Microstructures, 37, 55-76, 2005.
- [15] C.G. Smith, S. Gardelis, A.W. Rushforth, R. Crook, J. Cooper, D.A. Ritchie, E.H. Linfield, Y. Jin, M. Pepper, "Realization of quantum-dot cellular automata using semiconductor quantum dots", Superlattices and Microstructures, 34, 195-203, 2003.
- [16] C.S. Lent, B. Isaksen, "Clocked Molecular Quantum-Dot Cellular Automata", IEEE Transactions on Electron Devices, vol. 50, no. 9, September 2003.
- [17] Y. Lu and C. S. Lent, "Theoretical Study of Molecular Quantum-Dot Cellular Automata", J. of Computational Electronics, 4: 115-118, 2005

- [18] H. Qi, S. Sharma, Z.H. Li, G.L. Snider, A.O. Orlov, C.S. Lent, T.P. Fehlner, "Molecular quantum cellular automata cells. Electric field driven switching of a silicon surface bound array of vertically oriented two-dot molecular quantum cellular automata". Journal Of The American Chemical Society 2003, 125, (49), 15250-15259.
- [19] J.Y. Jiao, G.J. Long, F. Grandjean, A.M. Beatty, T.P. Fehlner, "Building blocks for the molecular expression of quantum cellular automata. Isolation and characterization of a covalently bonded square array of two ferrocenium and two ferrocene complexes". Journal Of The American Chemical Society 2003, 125, (25), 7522-7523.
- [20] W. Porod, "Magnetic Logic Devices Based on Field-Coupled Nanomagnets", Nano & Giga 07, Tempe, AZ, 12-16 March 2007.
- [21] G. Csaba and W. Porod, "Simulation of Coupled Computing Architectures based on Magnetic Dot Arrays", Journal of Computational Electronics, Kluwer, 1:87-91, 2002
- [22] M.T. Alam, J.DeAngelis, M. Putney, X.S. Hu, W. Porod, M. Niemier and G.H. Bernstein, "Clock Scheme for Nanomagnet QCA", Proc. of IEEE Int. Conf on Nanotechnology, 2007.
- [23] M. Vacca, "Magnetic QCA Nanoarchitectures", Master Thesis, Politecnico di Torino, November 2008.
- [24] M. Graziano, A. Chiolerio and M. Zamboni "A Technology Aware Magnetic QCA NCL-HDL Architecture", Proc. IEEE Conf. on Nanotechnology, Genova, July 2009.
- [25] M. Mascarino, "Analysis and simulation of Circuits Based Magnetic QCA", Master Thesis, Politecnico di Torino, November 2009.
- [26] M. Ottavi, L. Schiano and F. Lombardi, "HDLQ: A HDL Environment for QCA Design", ACM Journal on Emerging Technologies in Computing Systems, Vol. 2, No. 4, 2006
- [27] M. Choi, M. Choi, Z. Patitz and N. Park, "Efficient and Robust Delay-Insensitive QCA (Quantum-Dot-Cellular-Automata) Design", Proc. IEEE Int. Symp. on Defect and Fault-Tolerance in VLSI Systems, 2006
- [28] M. Choi, Z. Patitz, B. Jin, F. Tao, N. Park and M. Choi, "Designing layout-timing independent quantum-dot cellular automata (QCA) circuits by global asynchrony", Journal of System Architecture, Elsevier, 53, 2007, pp. 551-567.
- [29] E. Tabrizizadeh, H.R. Mohaqeq and A. Vafaei, "Designing QCA Delay-Insensitive Serial Adder", Proc. IEEE International Conference on Emerging Trends in Engineering and Technology, 2008.
- [30] S.C. Smith, "Gate And Throughput Optimizations For Null Convention Self-timed Digital Circuits", *Doctor of Philosophy, Dissertation*, University of Missouri, Columbia, USA, Spring term 2001.
- [31] K.M. Fant and S.A. Brandt, "NULL Convention Logic<sup>TM</sup>: A Complete and Consistent Logic for Asynchronous Digital Circuit Synthesis", Proc. Int. Conf. on Application Specific Systems, Architectures, and Processors, 1996.
- [32] S. Henderson, E.W. Johnson, J.R. Janulis and P.D. Tougaw, "Incorporating Standard CMOS Design Process Methodologies into the QCA Logic Design Process", IEEE Trans. on Nanotechnology, Vol. 3, No. 1, 2004.
- [33] http://www.model.com/
- [34] G. Csaba, P. Lugli, A. Csurgay and W. Porod, "Simulation of Power Gain and Dissipation in Field-Coupled Nanomagnets", Journal of Computational Electronics, Springer, Vol. 4, 2005.

- [35] G. Csaba, P. Lugli and W. Porod, "Power Dissipation in Nanomagnetic Logic Devices", Proc. IEEE Conference on Nanotechnology, July 2004
- [36] G. Csaba, P. Lugli, A. Csurgay and W. Porod, "Simulation of Power Gain and Dissipation in Field-Coupled Nanomagnets", Journal of Computational Electronics, Springer, Vol. 4, 2005.
- [37] G.H. Bernstein, A. Imre, V. Metlushko, A. Orlov, L. Zhou, L. Ji, G. Csaba, W. Porod, "Magnetic QCA systems", *Microelectronics Journal*, Elsevier, vol. 36, 2005.
- [38] A. Orlov, A. Imre, G. Csaba, L. Ji, W. Porod and G.H. Bernstein, "Magnetic Quantum-Dot Cellular Automata: Recent Developments and Prospects", *ASP J. of Nanoelectronics and Optoel.*, Vol. 3, N. 1, 2008.
- [39] J.F. Pulecio and S. Bhanja, "Reliability of Bi-stable Single Domain Nano Magnets for Cellular Automata", Proc. IEEE Conference on Nanotechnology, August 2007.
- [40] A. Chiolerio, E. Celasco, F. Celegato, S. Guastella, P. Martino, P. Allia, P. Tiberto and F. Pirri, "Enhanced imaging of magnetic structures in micropatterned arrays of Co dots and antidots", *J. of Magnetism and Magnetic Materials*, Vol. 320, e669-e673, 2008.
- [41] R. Zhang, P. Gupta, N.K. Jha, "Majority and Minority Network Synthesis With Application on Nanotechnologies", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 7, July 2007.
- [42] T. Teod and L. Sousa, "QCA-LG: A tool for the automatic layout generation of QCA combinational circuits", INESC-ID/IST, TU Lisbon, Portugal, IEEE, 2007.
- [43] M.R. Azghadi, O. Kavehei and Keivan Navi, "A Novel Design for Quantum-dot Cellular Automata Cells and Full Adders", Journal of Applied Sciences 7(22), 3460-3468, 2007
- [44] R. Zhang, K. Walus, W. Wang and G.A. Jullien, "A Method of Majority Logic Reduction for Quantum Cellular Automata", IEEE Transactions on Nanotechnology, Vol. 3, No. 4, December 2004.
- [45] A. Chaudhary, D.Z. Chen, X.S. Hu, K. Whitton, M. Niemier, R. Ravichandran, "Eliminating Wire Crossings for Molecular Quantum-dot Cellular Automata Implementation", University of Notre Dame, Notre Dame, USA, College of Computing, Georgia Institute of Technology, Atlanta, USA, 2005.
- [46] http://www.comsol.com/
- [47] http://www.magoasis.com/magsimus.htm
- [48] M.T. Alam, M.J. Siddiq, G.H. Bernstein, M. Niemier, W. Porod, X.S. Hu, "On-Chip Clocking for Nanomagnet Logic Devices", University of Notre Dame, Notre Dame, Indiana, USA, IEEE, 2010.