# POLITECNICO DI TORINO Repository ISTITUZIONALE

## A 64-channel waveform sampling ASIC for SiPM in space-born applications

Original

A 64-channel waveform sampling ASIC for SiPM in space-born applications / Tedesco, Silvia; Di Salvo, Andrea; Rivetti, Angelo; Bertaina, Mario. - In: JOURNAL OF INSTRUMENTATION. - ISSN 1748-0221. - ELETTRONICO. - 18:(2023). [10.1088/1748-0221/18/02/C02022]

*Availability:* This version is available at: 11583/2974496 since: 2023-01-11T08:46:01Z

Publisher: IOP Publishing

Published DOI:10.1088/1748-0221/18/02/C02022

Terms of use:

This article is made available under terms and conditions as specified in the corresponding bibliographic description in the repository

Publisher copyright IOP postprint/Author's Accepted Manuscript

"This is the accepted manuscript version of an article accepted for publication in JOURNAL OF INSTRUMENTATION. IOP Publishing Ltd is not responsible for any errors or omissions in this version of the manuscript or any version derived from it. The Version of Record is available online at http://dx.doi.org/10.1088/1748-0221/18/02/C02022


PREPARED FOR SUBMISSION TO JINST

TOPICAL WORKSHOP ON ELECTRONICS FOR PARTICLE PHYSICS TWEPP 2022, 19-23 SEPTEMBER 2022, BERGEN, NORWAY

## A 64-channel waveform sampling ASIC for SiPM in space-born applications

S. Tedesco,$^{a,1}$ A. Di Salvo,$^{b}$ A. Rivetti$^{b}$ and M. Bertaina$^{c}$

<sup>a</sup>Politecnico di Torino, Corso Duca degli Abruzzi, 24, Torino, Italy

<sup>b</sup>IEEE Member, INFN sezione di Torino, via Pietro Giuria 1, Torino, Italy

<sup>c</sup>Università degli Studi di Torino, via Pietro Giuria 1, Torino, Italy

*E-mail:* silvia\_tedesco@polito.it

ABSTRACT: The architecture of a 64-channel ASIC for the readout of Silicon Photomultipliers in space experiments is described. Each channel embeds a front-end amplifier with a common gate topology followed by a 256-cell analogue memory with a sampling frequency of 200 MHz. A single memory cell includes a storage capacitor, a single-slope Analog-to-Digital Converter (ADC) with programmable resolution between 8 and 12 bits and the digital control logic. To save power, the A/D conversion is carried out only when a trigger signal is received. The trigger can either be generated inside the ASIC or provided by an external source. The analogue samples are digitized in parallel, thus reducing the conversion dead time. The memory cells can be arranged in a single array or they can be grouped in shorter slots of 32 or 64 cells that work in a multi-buffer configuration. The channels can work independently or they can be synchronised to acquire the same time-frame in the full chip. The target power consumption is 5 mW/channel. The ASIC is being designed in a 65-nm CMOS technology. A digital-on-top flow is applied for the integration and final validation of the chip. The tape-out is scheduled for the first quarter of 2023.

KEYWORDS: VLSI circuits, Front-end electronics for detector readout

<sup>1</sup>Corresponding author.

## Contents

- 1 Introduction
- 2 ASIC architecture
  - 2.1 Front-End
  - 2.2 Analog Memory
- 3 Conclusions

## 1 Introduction

Silicon Photomultipliers (SiPMs) are today employed in many different fields such as High Energy Physics (HEP) instrumentation [1], LIDAR [2] and Positron Emission Tomography (PET) [3]. Due to their good detection efficiency, compactness and capability to work with moderate power supply voltages, they are becoming increasingly attractive for space-borne applications as well. SiPMs are considered, for instance, to equip the on-board cameras of future satellite-based cosmic ray observatories. In this context, they will be used to detect the Cherenkov light produced by the interaction of Ultra-High Energy Cosmic Rays (UHECRs) and neutrinos with the terrestrial atmosphere [5].

Two common approaches to read out SiPMs rely on charge integration [6] or on the photon counting technique [7]. However, these solutions do not allow the signal waveform to be studied in detail and, as a consequence, the signal of interest cannot be distinguished from spurious signals created by the direct interaction of cosmic rays within the sensor. For waveform sampling to be effective, the signal should be captured with a sampling frequency of at least 100 Ms/s. A large dynamic range (up to 12 bits) is also required, as the energy of the primary particle can span several orders of magnitude. High integration density is desired to keep the overall system compact and lightweight, and low power dissipation is mandatory. Therefore, a single channel should offer a complete signal processing chain with a power budget of only a few milliwatts. Attention must be paid to radiation tolerance as well, with particular emphasis on Single Event Effects. On the basis of these considerations, the design of a custom ASIC optimized to read out a SiPM-based Cherenkov radiation imager has been undertaken. The key target specifications are a sampling frequency of 200 Ms/s, a maximum power consumption of 5 mW/channel and a dynamic range of 12 bits.

## 2 ASIC architecture

The 64-channel ASIC is being designed in a commercial 65-nm CMOS technology and must operate with a power supply of 1.2 V. This technology was chosen because it provides a good integration density and its radiation tolerance has been extensively studied. The straightforward approach in a waveform sampling system is to have one free-running ADC per channel followed by a digital signal processor. Despite the impressive progress made in ADC developments [8], [9], the use of one 12-bit ADC per channel would hardly be compatible with the target power consumption. Furthermore, since the flux of UHECRs is extremely low (0.1 to 100 particles per hour are expected [10]), continuous digitization is unnecessary. Analog memories instead provide an interesting alternative to capture fast transient signals occurring sparsely in time.

The block diagram of one channel is shown in figure 1. The current pulse coming from the sensor is amplified and converted into a voltage by the input amplifier. The resulting voltage is buffered into a 256-cell analog memory, which temporarily stores the signal information.



Figure 1: Channel block diagram.

When the sampling is enabled, the cells are written with a frequency of 200 MHz and the memory works as a ring buffer. If an event occurs, a trigger signal is issued and the cells enter the digitization phase; otherwise they are overwritten. In order to perform background monitoring, an external trigger can be sent to the chip. The analog memory can work as a single buffer or it can be divided into a maximum of 8 segments of 32 cells each, thus enabling a multi-buffering mode. By segmenting the analog memory, the data are derandomized, so the system can acquire an event even if the processing of the previous one is still in progress. Furthermore, the channels of the ASIC can be programmed to operate in parallel (imaging mode) or independently from each other (sparse mode). The digitized data are transmitted off-chip by employing an 8-channel Double Data Rate (DDR) serializer operating with a frequency of 400 MHz.
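The ring-buffer and multi-buffering behaviour described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the class name `SegmentedMemory`, its methods, and the slot-selection policy (move to the lowest-numbered free slot, drop samples when every slot is busy) are hypothetical and are not taken from the ASIC design. Only the memory depth of 256 cells and the segmentation into up to 8 slots come from the text.

```python
CELLS = 256  # memory depth per channel, from the text


class SegmentedMemory:
    """Toy model of one channel's analog memory split into equal slots.

    The slot-selection policy is an assumption made for illustration.
    """

    def __init__(self, n_segments=8):
        if CELLS % n_segments:
            raise ValueError("the number of segments must divide 256")
        self.seg_len = CELLS // n_segments
        self.cells = [0.0] * CELLS
        self.busy = [False] * n_segments  # slots frozen for digitization/readout
        self.active = 0                   # slot being written, None if all are busy
        self.ptr = 0                      # write pointer inside the active slot

    def sample(self, value):
        """Write one sample; the active slot wraps around like a ring buffer."""
        if self.active is None:
            return False                  # dead time: no free slot, sample dropped
        self.cells[self.active * self.seg_len + self.ptr] = value
        self.ptr = (self.ptr + 1) % self.seg_len
        return True

    def trigger(self):
        """Freeze the active slot and continue sampling in the next free one."""
        if self.active is None:
            return None
        frozen = self.active
        self.busy[frozen] = True
        self.active = next((s for s, b in enumerate(self.busy) if not b), None)
        self.ptr = 0
        return frozen

    def release(self, slot):
        """Free a slot once its data have been read out."""
        self.busy[slot] = False
        if self.active is None:
            self.active, self.ptr = slot, 0
```

With `n_segments=1` the model stops sampling after a trigger until the single buffer is released, whereas with 8 segments a second event can be captured while the first slot is still frozen, which is the derandomization effect mentioned in the text.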

### 2.1 Front-End

The front-end amplifier is based on the common gate topology [11]. Two different circuits, shown respectively in figures 2a and 2b, have been implemented to read both positive and negative input pulses, thus increasing the flexibility of the chip. The front-end amplifiers are followed by a discriminator (not shown in the figure) that compares the signal Vout to a programmable threshold to provide a trigger. In sparse mode, each channel is triggered independently. In imaging mode, two trigger modalities are foreseen: a fast OR between the channels and a topological trigger that looks at the firing of nearby channels. The generated information can either be used to trigger a readout sequence directly on the chip or it can be provided as a primitive to an external trigger processor, which looks at the trigger outputs of different ASICs before issuing a final trigger decision.
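The two imaging-mode trigger modalities can be sketched in a few lines. The text does not specify how the channels are arranged or what "nearby" means, so the 8 x 8 channel grid, the row-major channel numbering and the "two adjacent channels" condition below are all assumptions chosen purely for illustration; the function names are hypothetical as well.

```python
def discriminate(vout, thresholds):
    """Per-channel discriminator: compare each Vout to its programmable threshold."""
    return [v > t for v, t in zip(vout, thresholds)]


def fast_or(hits):
    """Fast OR: fire as soon as any channel is above threshold."""
    return any(hits)


def topological(hits, cols=8):
    """Fire when two horizontally or vertically adjacent channels are both hit.

    Assumes channel i sits at row i // cols, column i % cols
    (hypothetical mapping, not taken from the paper).
    """
    rows = len(hits) // cols
    for i, hit in enumerate(hits):
        if not hit:
            continue
        r, c = divmod(i, cols)
        if c + 1 < cols and hits[i + 1]:
            return True
        if r + 1 < rows and hits[i + cols]:
            return True
    return False
```

In this sketch an isolated hit satisfies the fast OR but not the topological condition, which is one plausible way such a trigger could reject single-channel noise.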

### 2.2 Analog Memory

The basic building block of the analogue memory is the sampling cell. Several options can be considered to digitize the sampled data. One possibility is to have a fast ADC per channel or per group of channels. However, even using a moderate speed ADC (e.g. 20 Ms/s), 12.8 $\mu$s are needed to read out 256 cells. The Wilkinson ADC topology is attractive for its simplicity and requires a limited number of hardware resources, but it has a long conversion time. Therefore, massive parallelism can be used to keep the overall conversion time of the memory at an acceptable level. For instance, in [12] a fast sampling ASIC with an analog memory of 128 cells is described. The digitization is performed by 128 10-bit single-slope ADCs placed at the periphery of the chip, so all the cells in a single channel are converted in parallel. However, CMOS technology scaling allows chips with even higher integration density to be developed. Hence, in our device a 12-bit single-slope ADC has been embedded directly in each memory cell. This allows all the samples in the ASIC to be digitized in parallel, thus reducing the dead time. The time needed for the conversion is given by:

$$2^N \times T_{clk} = 20.48 \,\mu s \tag{2.1}$$

where N is the resolution and $T_{clk}$ represents the clock period of 5 ns. The resolution of the converter can be programmed between 8 and 12 bits. Hence, with a lower resolution, this time interval is decreased. For instance, for a 10-bit resolution (which could still be adequate for our purpose) the conversion time shrinks to $5.12 \,\mu s$. However, in a waveform sampling ASIC an important contribution to dead time is also given by data transmission. In fact, the digital data stream is composed of a 27-bit long header and the digitized data. By selecting the maximum resolution for the ADC, 3099 bits per channel must be transmitted to read out 256 cells. Even using one 10 Gbit/s serializer per chip, the time to send the raw data out would be 19.83 $\mu$s. To increase the system modularity and thus its fault tolerance, instead of using a single fast serializer, 8 DDR serializers working with a 400 MHz clock have been implemented. This allows for the segmentation of the ASIC in modules of 8 channels which are basically independent of each other. The data transmission time thus becomes $30.99 \,\mu s$ in the worst case, in which all the 256 cells are used to capture a single event. It must be pointed out that the dead time is not a critical parameter for the application, as only a few events per second are expected to be read out from the ASIC. Data could of course be zero-suppressed and compressed on chip before transmission, but it has been preferred to shift more elaborate signal processing to the on-board FPGA. We have chosen to keep the ASIC as simple as possible to reduce its design time, which is on the critical path of the project, while more time is available to develop the FPGA firmware.
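The timing figures quoted in this section follow from simple arithmetic, reproduced here as a short check. The function names are hypothetical, and the DDR bit rate is assumed to be two bits per clock cycle (800 Mbit/s per serializer); all other numbers are taken from the text.

```python
T_CLK = 5e-9        # ADC clock period (200 MHz), from the text
CELLS = 256         # cells per channel
HEADER_BITS = 27    # header length of the data stream
CHANNELS = 64       # channels per ASIC


def sequential_adc_time(rate_sps=20e6, cells=CELLS):
    """Time for one shared ADC to convert all the cells of a channel in sequence."""
    return cells / rate_sps


def conversion_time(n_bits, t_clk=T_CLK):
    """Single-slope conversion time of equation (2.1): 2^N clock periods."""
    return (2 ** n_bits) * t_clk


def bits_per_channel(n_bits, cells=CELLS):
    """Header plus one N-bit word per cell."""
    return HEADER_BITS + cells * n_bits


def single_link_time(n_bits, rate_bps=10e9):
    """Readout time if one serializer carried all 64 channels."""
    return CHANNELS * bits_per_channel(n_bits) / rate_bps


def ddr_group_time(n_bits, f_clk=400e6, channels_per_link=8):
    """Readout time of one 8-channel module over its own DDR serializer.

    Assumes two bits per clock cycle on each DDR link.
    """
    return channels_per_link * bits_per_channel(n_bits) / (2 * f_clk)


if __name__ == "__main__":
    for n in (8, 10, 12):
        print(n, conversion_time(n) * 1e6, ddr_group_time(n) * 1e6)
```

Because the 8 modules transmit in parallel, the per-module time is also the readout time of the full chip in this simple model. The loop over 8, 10 and 12 bits shows how both contributions to the dead time shrink when the resolution is lowered.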



Figure 3: (a) Memory cell building blocks.

The building blocks of the memory cell are shown in figure 3a. Each cell includes the sampling capacitor, the comparator of the ADC, some switches and a control logic (not illustrated in the figure). A single Gray counter, whose outputs are shared among the cells, is embedded in each channel. In the sampling phase, the storage capacitor is charged to a voltage equal to $V_{FE} - V_{ref}$, where $V_{FE}$ is the output of the input amplifier and $V_{ref}$ is a reference voltage. In contrast with the most common architecture [13], the minus terminal of the comparator is not connected to a ramp generator. In fact, that solution can deteriorate the linearity of the system because the common mode of the comparators changes between the cells. A possible alternative consists in fixing the threshold to a steady value while charging the capacitor through a constant current generator. However, the mismatch between the current sources can lead to gain variation between the cells. Hence, a single ramp generator is applied to all the storage capacitors. During the digitization, the top plate of the capacitor is connected only to the gate terminal of a MOS transistor, so this node remains floating. Hence, if a ramp generator is connected to the bottom plate, the same voltage variation is replicated on the top plate thanks to charge conservation. When the voltage on this terminal reaches the threshold, the comparator flips, triggering the storage of the Gray counter output into local latches. This allows embedding a single ramp generator which is common to all the cells in a single channel, as shown in ref. [13]. This approach ensures a good gain uniformity among the cells. The gain of the analog memory can thus be calibrated together with the front-end gain by injecting known pulses at the input of the channel through a programmable pulse generator embedded on chip. The offset is measured cell by cell by feeding a steady input voltage to the cells. This offset can be stored in a local memory and subtracted when the cell is read out.
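The conversion mechanism described above (shared Gray counter, ramp applied to the bottom plate, comparator latching the counter value, per-cell offset subtraction) can be mimicked by an idealised behavioural model. This is a sketch under assumptions: the ramp is taken as perfectly linear and rising, the threshold `v_th` and ramp span `v_span` are arbitrary illustrative values, and the function names are hypothetical. None of these details is specified in the text.

```python
def to_gray(n):
    """Binary to Gray code: consecutive values differ in exactly one bit."""
    return n ^ (n >> 1)


def from_gray(g):
    """Gray code back to binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def convert(v_sample, v_th=1.0, v_span=1.0, n_bits=12):
    """Return the Gray code latched when the ramped top plate crosses v_th.

    v_sample is the voltage stored on the floating top plate. The ramp
    applied to the bottom plate is assumed to be replicated on the top
    plate one-to-one (ideal charge conservation, no parasitics).
    """
    steps = 2 ** n_bits
    lsb = v_span / steps
    for count in range(steps):
        v_top = v_sample + count * lsb
        if v_top >= v_th:            # comparator flips, latches take the counter
            return to_gray(count)
    return to_gray(steps - 1)        # ramp ended without a crossing: saturate


def read_cell(gray_code, offset_code=0):
    """Decode the latched value and subtract the offset measured for this cell."""
    return from_gray(gray_code) - offset_code
```

A Gray counter is a natural choice here because only one bit toggles per count, so a latch that closes asynchronously with respect to the counter clock is wrong by at most one count. In this model a larger stored voltage produces a smaller code, which is a consequence of the assumed ramp polarity only.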

The schematic of the comparator is shown in figure 4b. It has two possible states, called power-up and power-down mode. The input differential pair and its bias transistors have been divided into two branches. The first branch, which is composed of $M_1$, $M_2$, $M_7$ and $M_8$, drives a quarter of the total current and is always on. The second branch is formed by $M_3$, $M_4$, $M_9$ and $M_{10}$ and drives three quarters of the total bias current. The latter is powered on only during the digitization phase by closing switch $M_{10}$. This arrangement reduces the power dissipation when the comparator is not used, while keeping its common mode constant. Simulations show that this reduces kick-back effects toward the sampling capacitor when the comparator is set back to full power mode. Before digitization, 2 clock cycles are dedicated to powering up the converters and to switching the bottom plates of the capacitors from the fixed reference to the voltage ramp.

Figure 4: (a) Layout of a section. (b) Schematic of the comparator.
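The benefit of the two-branch comparator can be estimated with a rough duty-cycle calculation. This is a sketch, not a simulation result: it assumes the second branch is on only during the power-up cycles and the conversion itself, and the 10 Hz event rate is a hypothetical value standing in for the "few events per second" quoted earlier. The function names are illustrative.

```python
def digitizing_duty(event_rate_hz, n_bits=12, t_clk=5e-9, powerup_cycles=2):
    """Fraction of time spent digitizing, including the 2 power-up cycles."""
    return min(1.0, event_rate_hz * (2 ** n_bits + powerup_cycles) * t_clk)


def average_bias_fraction(duty_cycle, always_on=0.25):
    """Mean comparator bias current as a fraction of the full bias.

    One quarter of the current flows permanently; the remaining three
    quarters flow only while the comparator is digitizing.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return always_on + (1.0 - always_on) * duty_cycle


if __name__ == "__main__":
    duty = digitizing_duty(10.0)          # hypothetical 10 events per second
    print(duty, average_bias_fraction(duty))
```

At such low event rates the duty cycle is of the order of 1e-4, so in this model the average bias current stays very close to the always-on quarter.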

The final layout of the cell has a size of $43.62 \times 15.20 \,\mu m^2$ and is illustrated in figure 3b. This sizing allows the analog memory to be integrated in a chip with final dimensions of $6\,mm \times 4\,mm$. The analog cell is integrated with a digital-on-top methodology. Figure 4a reports the layout, where the Wilkinson ADC is included alongside the latches. The latches are divided into a data memory and an offset memory used to store the offset of the converter. The upper part of the image depicts a section in which the cells are hierarchically organized. Each section is managed by a channel controller (not shown in the layout) where dedicated Finite State Machines (FSMs) are implemented. These FSMs take into account the partitioning of the cell array by appropriately managing the sampling, digitizing and readout states. The channel controller also drives the configurable Gray counter, whose output is distributed to each section. The digital power was evaluated by synthesizing each block, and the consumption is limited to 1.3 mW per channel, which includes the power contribution of the serializer.
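As a plausibility check of the quoted dimensions, the total area of the memory cells can be compared with the die area, and the digital power with the 5 mW/channel target. The sketch below assumes that the quoted cell footprint covers everything inside a cell and ignores routing, channel controllers, front-ends and pads; the names are illustrative only.

```python
CELL_W_UM, CELL_H_UM = 43.62, 15.20   # cell layout size, from the text
CHANNELS, CELLS = 64, 256             # channels per chip, cells per channel
CHIP_MM2 = 6.0 * 4.0                  # die area, from the text
TARGET_MW, DIGITAL_MW = 5.0, 1.3      # power per channel, from the text


def memory_area_mm2():
    """Total silicon area taken by all the memory cells of the chip."""
    cell_um2 = CELL_W_UM * CELL_H_UM
    return CHANNELS * CELLS * cell_um2 * 1e-6   # 1 mm^2 = 1e6 um^2


def memory_fraction():
    """Share of the die occupied by the memory cells alone."""
    return memory_area_mm2() / CHIP_MM2


def analog_power_budget_mw():
    """Power left per channel for the analog blocks within the 5 mW target."""
    return TARGET_MW - DIGITAL_MW


if __name__ == "__main__":
    print(memory_area_mm2(), memory_fraction(), analog_power_budget_mw())
```

Under these assumptions the 16384 cells take a little under 11 mm², which is slightly less than half of the 24 mm² die, and roughly 3.7 mW per channel remain for the analog circuits.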

## 3 Conclusions

This paper presented the architecture of a 64-channel ASIC designed in a commercial 65-nm CMOS technology for SiPM readout in a space environment. The input current pulse is amplified, converted into a voltage and stored in a 256-cell analog memory. The memory cells allow a snapshot of the incoming event to be acquired with a resolution of 12 bits. The sampling and digitization steps are decoupled, since the conversion starts only if a trigger signal (either generated internally or provided from the outside) is received. This results in a lower power consumption compared to the implementation of a free-running converter. The chip flexibility has been increased by applying the derandomization technique. The target power consumption is 5 mW/channel, considering both analog and digital circuits. The integration of the building blocks is ongoing and the chip tape-out is scheduled for the beginning of 2023.

## References

- [1] F. Sefkow, *The CALICE tile hadron calorimeter prototype with SiPM read-out: Design, construction and first test beam results*, 2007 NSSCR vol. 1 (2007), pg. 259-263.
- [2] A. M. Antonova, V. A. Kaplin, *SiPM timing characteristics under conditions of a large background for lidars*, Journal of Physics: Conference Series vol. 945 (2007), pg. 012012.
- [3] M.G. Bisogni, M. Morrocchi, *Development of analog solid-state photo-detectors for positron emission tomography*, Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment vol. 809 (2016), pg. 140-148.
- [4] M.G. Bagliesi et al., *A custom front-end ASIC for the readout and timing of 64 SiPM photosensors*, Nuclear Physics B-Proceedings Supplements 215.1 (2011), pg. 344-348.
- [5] A.V. Olinto, J. Krizmanic, *The Roadmap to the POEMMA Mission*, APS April Meeting Abstracts (2021), pg. D21-006.
- [6] L. Buonanno et al., *GAMMA: a 16-channel spectroscopic ASIC for SiPMs readout with 84-dB dynamic range*, IEEE Transactions on Nuclear Science (2021), pg. 2556-2572.
- [7] S. P. Nambboodiri et al., *A Current-Mode Photon Counting Circuit for Long-Range LiDAR Applications*, 2020 IEEE 63rd International Midwest Symposium on Circuits and Systems (MWSCAS) (2020), pg. 146-149.
- [8] H. Liu et al., *A 12-bit 200MS/s Pipelined-SAR ADC in 65-nm CMOS with 61.9 dB SNDR*, 2019 IEEE International Conference on Electron Devices and Solid-State Circuits (EDSSC) (2019), pg. 1-2.
- [9] L. Ricci, L. Bertulessi, A. Bonfanti, *A low-noise high-speed comparator for a 12-bit 200-MSps SAR ADC in a 28-nm CMOS process*, SMACD/PRIME 2021; International Conference on SMACD and 16th Conference on PRIME (2021), pg. 1-4.
- [10] A.L. Cummings et al., *Detection of the above the limb cosmic rays in the optical Cherenkov regime using sub-orbital and orbital instruments*, 37th Intern. Cosmic Ray Conf. (2021), PoS(ICRC2021)437.
- [11] P. Carniti et al., *CLARO-CMOS, a very low power ASIC for fast photon counting with pixellated photodetectors*, Journal of Instrumentation 7.11 (2012), pg. 11026.
- [12] S. Kleinfelder, *A multi-GHz, multi-channel transient waveform digitization integrated circuit*, 2002 IEEE nuclear science symposium conference record vol. 1 (2002), pg. 544-548.
- [13] E. Delagnes et al., *A Low Power Multi-Channel Single Ramp ADC With up to 3.2 GHz Virtual Clock*, IEEE Transactions on Nuclear Science vol. 54 (2007), pg. 1735-1742.