# High-Performance CMOS Front-End ASICs for SiPM Detectors and High-Frequency Ultrasound and Photoacoustic Imaging

By

Yuxuan Tang

A dissertation submitted to the Department of Electrical and Computer Engineering, Cullen College of Engineering in partial fulfillment of the requirements for the degree of

#### **DOCTOR OF PHILOSOPHY**

IN ELECTRICAL ENGINEERING

Chair of Committee: Dr. Jinghong Chen

Committee Member: Dr. Wanda Zagozdzon-Wosik

Committee Member: Dr. Yuhua Chen

Committee Member: Dr. Xin Fu

Committee Member: Dr. Jiming Peng

Committee Member: Dr. David R. Jackson

University of Houston

December 2021

Copyright 2021, Yuxuan Tang

#### ACKNOWLEDGMENTS

First, I would like to express my great appreciation to my advisor, Dr. Jinghong Chen, for his continuous research supervision, guidance, assistance, patience, and valuable suggestions throughout my Ph.D. adventure. I would also like to express my sincere gratitude to my committee members, Dr. Wanda Zagozdzon-Wosik, Dr. Yuhua Chen, Dr. Xin Fu, Dr. Jiming Peng, and Dr. David R. Jackson, for their valuable comments on my proposal and defense.

Friendship has been indispensable in my academic journey. I would like to thank the past and present members of my research group: Yulang Feng, Zhiheng Zuo, Qingjun Fan, Bozorgmehr Vosooghi, Todd Townsend, and Hao Deng, for their technical assistance and encouragement. It has been a great pleasure to work and discuss with them. I will always remember the wonderful time I spent with them.

In addition, I would like to thank Alphacore Inc. for its sponsorship and the internship opportunity.

Last but not least, I am indebted to my parents and grandparents. I could not have reached this point in my life without their unconditional love, encouragement, and support.


#### ABSTRACT

The silicon photomultiplier (SiPM), a high-sensitivity photon detector, has been widely used in high energy physics, positron emission tomography imaging, and light detection and ranging applications. The slow rising edge of the standard SiPM signal, however, makes the timing measurement sensitive to noise and leads to poor timing resolution. In addition, SiPM energy measurement based on charge-sensitive amplifiers suffers from high power consumption and is not suitable for array-based SiPM readout systems. To solve these issues, two hardware prototypes in a 180 nm CMOS process have been fabricated and experimentally characterized. The first prototype is a single-channel SiPM readout featuring an on-chip fast signal generator and a customized successive-approximation-register (SAR) analog-to-digital converter (ADC). The on-chip fast signal generator sharpens the slow rising edge of the SiPM signal, improving the timing resolution. The customized ADC uses the SiPM charge integrator as the ADC track-and-hold circuit, lowering the ADC power consumption. Measurement results show that the readout front-end achieves a timing resolution of 151 ps while dissipating 4.02 mW of power. The second prototype demonstrates a shared SAR ADC architecture in a multi-channel SiPM readout to reduce the chip area and power consumption. The ADC is shared by 16 readout channels in a time-multiplexed manner and achieves an SFDR of 58.34 dB and an SNDR of 51.37 dB at 16 MS/s.

High-frequency (30 to 100 MHz) ultrasound and photoacoustic imaging with improved microscopic resolution opens new medical applications in ophthalmology, intravascular imaging, and systemic sclerosis. To break the tradeoff between noise and wideband impedance matching, a wideband low-noise amplifier (LNA) with noise and distortion cancellation is developed. The LNA employs a resistive shunt-feedback structure with a feedforward noise-canceling technique to accomplish both wideband impedance matching and low-noise performance. A complementary CMOS topology is also developed to cancel the second-order harmonic distortion and enhance the linearity. A front-end including the proposed LNA and a variable gain amplifier is designed and fabricated in a 180 nm CMOS process. At 80 MHz, the front-end achieves an input-referred noise density of 1.36 nV/$\sqrt{\rm Hz}$, an $S_{11}$ better than -16 dB, and a total harmonic distortion of -55 dBc while consuming 37 mW of power.

# **TABLE OF CONTENTS**

| ACKNOWLEDGMENTS                                            |
|------------------------------------------------------------|
| ABSTRACT                                                   |
| TABLE OF CONTENTS                                          |
| LIST OF TABLES                                             |
| LIST OF FIGURES                                            |
| CHAPTER I Introduction                                     |
| 1.1 SiPM Readout                                           |
| 1.1.1 SiPM Background                                      |
| 1.1.2 SiPM Readout Design Challenges                       |
| I. Timing Measurement                                      |
| II. Energy Measurement                                     |
| 1.1.3 State-of-Arts of SiPM Readout                        |
| 1.2 High-Frequency Ultrasound and Photoacoustic Readout    |
| 1.2.1 Piezoelectric Ultrasound Transducer Background       |
| 1.2.2 HFUS and PA Readout Design Challenges                |
| 1.2.3 State-of-Arts of LNA for HFUS and PA Imaging Readout |
| 1.3 Main Contributions                                     |
| 1.4 Dissertation Organization                              |

| CHAPTER II Single-Channel SiPM Readout ASIC with On-chip Fast Pulse Generation and Customized SAR ADC |
|--------------------------------------------------------------------------------------------------------|
| 2.1 Top Level Architecture                                                                             |
| 2.2 Circuit Design of Key Building Blocks                                                              |
| 2.2.1 Input Stage                                                                                      |
| 2.2.2 On-Chip HPF                                                                                      |
| 2.2.3 Current Discriminator                                                                            |
| 2.2.4 Detection and Integration Control Logic                                                          |
| 2.3 Customized SAR ADC for SiPM Readout                                                                |
| 2.3.1 Introduction of SAR ADC                                                                          |
| 2.3.2 Proposed Customized SAR ADC                                                                      |
| 2.3.3 CDAC                                                                                             |
| 2.3.4 Two-Stage Dynamic Comparator                                                                     |
| 2.3.5 Synchronous SAR Logic                                                                            |
| 2.4 Measurement Results                                                                                |
| CHAPTER III DAQ with Multi-channel SiPM Readout ASIC and Highly-linear FPGA-based TDC                  |
| 3.1 Top Level Architecture                                                                             |
| 3.2 Key Building Blocks of the Readout ASIC                                                            |
| 3.2.1 Input Stage                                                                                      |
| 3.2.2 SAR ADC and Serializer                                                                           |
| 3.3 FPGA-based TDC and Linearity Improvement                                                           |
| 3.4 Measurement Results                                                                                |

| CHAPTER IV Wideband Noise and Harmonic Distortion Canceling Readout for High-Frequency Ultrasound and Photoacoustic Imaging |
|-------------------------------------------------------------------------------------------------------------------------------|
| 4.1 PVDF Ultrasound Transducer Model                                                                                          |
| 4.2 LNA Design and Analysis                                                                                                   |
| 4.2.1 Resistive Shunt Feedback                                                                                                |
| 4.2.2 Feedforward Noise Cancellation                                                                                          |
| 4.2.3 Complementary CMOS Topology                                                                                             |
| 4.2.4 Current-Reuse Technique                                                                                                 |
| 4.3 LNA Design with Simulation Results                                                                                        |
| 4.4 Measurement Results                                                                                                       |
| CHAPTER V Conclusion and Future Works                                                                                         |
| 5.1 Conclusion                                                                                                                |
| 5.2 Future Directions                                                                                                         |
| REFERENCES                                                                                                                    |

## LIST OF TABLES

| Table 2.1. Comparator Noise and Offset Performance                       |
|--------------------------------------------------------------------------|
| Table 2.2. ADC Performance Comparison for SiPM Readout Applications      |
| Table 2.3. Single-channel SiPM Readout Performance Summary               |
| Table 2.4. Performance Comparison of the SiPM Readout                    |
| Table 3.1. Multi-channel SiPM Readout Performance Summary and Comparison |
| Table 3.2. FPGA-based TDC Performance Summary and Comparison             |
| Table 4.1. Simulated THD of the LNA                                      |
| Table 4.2. Ultrasound Front-end Performance Summary and Comparison       |

## **LIST OF FIGURES**

| Fig. 1.1. Block diagram of SiPM readout system |
|------------------------------------------------|
| Fig. 1.2. Avalanche, quench, and recovery of a Geiger-mode operated SPAD |
| Fig. 1.3. Electrical model of SiPM microcell |
| Fig. 1.4. Illustration of (a) SPE response; and (b) SiPM output current signal |
| Fig. 1.5. SensL's proprietary SiPM with fast output |
| Fig. 1.6. Conventional SiPM readout with CSA and Wilkinson ADC |
| Fig. 1.7. QTC-based SiPM readout |
| Fig. 1.8. Block diagram of ultrasound imaging readout system |
| Fig. 1.9. Electrical model of piezoelectric sensor |
| Fig. 1.10. CG-CS NC LNA |
| Fig. 1.11. Resistive shunt-feedback feedforward NC LNA |
| Fig. 2.1. (a) Simplified block diagram of the readout front-end with the on-chip fast signal generation and customized SAR ADC; and (b) its timing diagram |
| Fig. 2.2. Input current buffer with on-chip HPF |
| Fig. 2.3. Simulated input impedance reduction and bandwidth improvement of the input stage |
| Fig. 2.4. The simulated equivalent impedance of the on-chip HPF |
| Fig. 2.5. Current discriminator with enhanced stability |
| Fig. 2.6. Simplified block diagram of event detection and integration control logic |

| Fig. 2.7. Basic architecture of an *N*-bit SAR ADC |
|----------------------------------------------------|
| Fig. 2.8. Flow graph of conventional SAR ADC |
| Fig. 2.9. Proposed customized 10-bit SAR ADC for SiPM readout |
| Fig. 2.10. Conversion mechanism of the customized SAR ADC |
| Fig. 2.11. Monte Carlo Simulation results for the unit MIM capacitor of CDAC |
| Fig. 2.12. Strong-Arm based dynamic comparator |
| Fig. 2.13. Two-stage dynamic latched comparator |
| Fig. 2.14. Synchronous SAR logic |
| Fig. 2.15. PCB and die photo of the SiPM readout ASIC |
| Fig. 2.16. Measured DNL and INL of the SAR ADC |
| Fig. 2.17. Measured ADC output spectrum at 1 MS/s with a near-Nyquist input signal |
| Fig. 2.18. Measured SNDR and SFDR versus input frequency at 1 MS/s sampling rate |
| Fig. 2.19. Measured SNDR and SFDR versus sampling frequency at Nyquist rate |
| Fig. 2.20. Measurement setup (A: Digital Signal Analyzer; B: Clock Source; C: Laser Pulser; D: Dark Box with SensL SiPM evaluation board MICROFC-SMA-30035-GEVB; E: Power Supply; F: Readout ASIC) |
| Fig. 2.21. Measured laser pulse and the corresponding standard SiPM output current signal |
| Fig. 2.22. Measured timing resolution of the readout ASIC with the HPF capacitance settings being (a) 1 pF; (b) 4 pF; and (c) 7 pF |
| Fig. 2.23. Measured ADC output codes versus different input charges |

| Fig. 3.1. Top-level architecture of the proposed multi-channel SiPM DAQ system |
|--------------------------------------------------------------------------------|
| Fig. 3.2. Timing diagram of the proposed 16-channel SiPM readout ASIC |
| Fig. 3.3. Simulated DNL and INL comparison of the current buffer w/o and w/ cascode current mirror structure in 200 pC input charge range |
| Fig. 3.4. Serialization of ADC output and channel position bits |
| Fig. 3.5. Initial TDC design with 400 MHz coarse clock, single register, and latch stage |
| Fig. 3.6. Initial TDC code density results with independent components of nonlinearity |
| Fig. 3.7. Synchronized-enable pulse detector |
| Fig. 3.8. (a) Xilinx representation of global clock path to each register clock input within SLICEL; (b) Average path length from global clock to register clock inputs (post-layout simulation) |
| Fig. 3.9. (a) Four-chain parallel TDL; (b) DNL comparison of 1-chain, 2-chain, and 4-chain TDC |
| Fig. 3.10. RMS resolution comparison of 1-chain versus 4-chain TDC with X axis offset due to significant INL in 1 chain |
| Fig. 3.11. Die photo of the 16-channel SiPM readout ASIC |
| Fig. 3.12. Measured ADC output spectrum at 16 MS/s sampling rate with a near-Nyquist input signal |
| Fig. 3.13. Measured SNDR and SFDR of ADC versus input frequency |
| Fig. 3.14. Measured SNDR and SFDR of ADC versus sampling rate |
| Fig. 3.15. SiPM readout ASIC measurement setup |

| Fig. 3.16. Measured 16-channel averaged ADC output codes versus input charges |
|-------------------------------------------------------------------------------|
| Fig. 3.17. Measured 16-channel averaged DNL and INL in 800 pC input charge range |
| Fig. 3.18. FPGA-based TDC measurement setup |
| Fig. 3.19. Measured DNL and INL of the 4-chain TDC |
| Fig. 4.1. (a) Schematic diagram of the PVDF transducer; (b) photograph of the PVDF transducer and its electrical model |
| Fig. 4.2. Measured output impedance of the PVDF transducer |
| Fig. 4.3. (a) Measurement setup for the PA response of the PVDF transducer; (b) recorded PA signal induced by laser pulse onto black tape |
| Fig. 4.4. Resistive shunt-feedback amplifier enabling high bandwidth and wideband impedance matching |
| Fig. 4.5. Block diagram of feedforward noise-canceling technique |
| Fig. 4.6. Noise-canceling resistive shunt-feedback LNA |
| Fig. 4.7. Complementary resistive shunt-feedback amplifier |
| Fig. 4.8. Simulated $i_{ds}$, $g_m$ and $g'_m$ of a 240-$\mu$m/0.18-$\mu$m PMOS $M_P$ and a 120-$\mu$m/0.18-$\mu$m NMOS $M_N$ for the complementary resistive shunt-feedback amplifier |
| Fig. 4.9. Current-reuse resistive shunt-feedback amplifier |
| Fig. 4.10. Proposed resistive shunt-feedback LNA with the feedforward noise-canceling technique and the complementary topology |
| Fig. 4.11. Simulated $S_{11}$ of the proposed LNA |
| Fig. 4.12. Simulated frequency response of the proposed LNA |
| Fig. 4.13. Simulated input-referred voltage noise density of the proposed LNA |

| Fig. 4.14. Block diagram of the HFUS and PA readout ASIC |
|----------------------------------------------------------|
| Fig. 4.15. Die photo of the HFUS and PA readout ASIC |
| Fig. 4.16. (a) Measurement setup; (b) $S_{11}$ and frequency response measurement; and (c) noise and THD measurement |
| Fig. 4.17. Measured $S_{11}$ of the front-end |
| Fig. 4.18. Measured frequency response of the front-end |
| Fig. 4.19. Measured input-referred voltage noise density of the front-end |
| Fig. 4.20. Measured THD of the front-end |

### **CHAPTER I**

### INTRODUCTION

Complementary metal–oxide–semiconductor (CMOS) readout application-specific integrated circuits (ASICs) are widely used in signal conditioning to receive signals from detectors and to amplify and digitize them. CMOS readout ASICs offer a high level of integration, low power consumption, and high reliability. Depending on the application, readout ASICs must be designed with different architectures and techniques to measure and interpret the signal properly while balancing the trade-offs among noise, power, linearity, and bandwidth.

In this dissertation, two readout ASIC designs for silicon photomultiplier (SiPM) detectors and one for ultrasound transducers are presented. Specifically, the SiPM readout ASICs are designed for high energy physics (HEP) and positron emission tomography (PET) imaging, and the ultrasound readout ASIC is designed for high-frequency ultrasound (HFUS) and photoacoustic (PA) imaging.

#### **1.1 SiPM Readout**

In recent years, the SiPM, a high-sensitivity photon detector, has been widely adopted in many important applications, such as HEP, PET imaging, and light detection and ranging (LIDAR) [1, 2, 3, 4]. The block diagram of an *N*-channel SiPM readout system is shown in Fig. 1.1. In response to the absorption of a single photon, an SiPM can produce a current pulse with a duration of several tens of nanoseconds containing $10^5$ to $10^6$ electrons [5]. To identify the arrival time and energy level of the incident photon or photons, the SiPM readout needs to measure the timing and energy of the current pulse generated by the SiPM detector and then transmit the measured data to a back-end data acquisition (DAQ) system for further processing.

Fig. 1.1. Block diagram of SiPM readout system.

#### 1.1.1 SiPM Background

Historically, the most widely used photosensors in PET and HEP were photomultiplier tubes (PMTs) [6], which feature high gain, low noise, and fast response. However, because they are built from vacuum tubes, PMTs are intrinsically bulky and fragile. In addition, PMTs require bias voltages of up to thousands of volts and are sensitive to magnetic fields. In contrast, SiPMs, formed by arrays of single-photon avalanche diodes (SPADs) in series with quenching resistors, are much more compact and robust, require a much lower bias voltage (~ 30 V) [7], and are immune to magnetic fields. For these reasons, SiPMs have been proposed as a highly attractive alternative to PMTs.

As shown in Fig. 1.1, an SiPM consists of an array of parallel-connected microcells, and a typical SiPM has a microcell density of between 100 and several thousand per mm$^2$, depending on the size of the microcell [7]. Each microcell is composed of a SPAD in series with a quenching resistor. All the SPADs are operated in Geiger mode, i.e., reverse-biased slightly above their breakdown point, so that a high electric field (> $5 \times 10^5$ V/cm) is generated within the depletion region across the p-n junction of each SPAD. When a photon is absorbed by the SPAD, an electron-hole pair is generated within the depletion region and subsequently accelerated towards the anode and cathode by the strong electric field. When the kinetic energy of the electron or hole is sufficient to create secondary electron-hole pairs, an impact-ionization avalanche is triggered in this SPAD, and a self-perpetuating ionization cascade spreads throughout the depletion region. The SPAD breaks down and becomes conductive, and a current flows. Once the current flows, the series-connected quenching resistor starts to quench the avalanche. During the conductive period, the quenching resistor not only limits the current drawn by the diode but also develops a large voltage drop, which lowers the effective voltage across the diode to a value below its breakdown voltage, thus halting the avalanche. The SPAD then recharges back to the bias voltage and is available to detect subsequent photons. Fig. 1.2 [8] illustrates the cycle of avalanche, quench, and recovery of a Geiger-mode operated SPAD. A simple description of the phenomena can be found in [9].

Fig. 1.2. Avalanche, quench, and recovery of a Geiger-mode operated SPAD.

It should be noted that the electron-hole pair triggering the avalanche can be generated not only by a photon but also by thermal agitation. The first case corresponds to the detection of a photon, whereas the second is a false trigger known as a dark-count event.

#### **1.1.2 SiPM Readout Design Challenges**

The timing and energy information of the original photon that triggers the impact-ionization avalanche is important. In HEP applications [2], the energy information helps to confirm that the photon was generated directly by the high energy particle collision, and the timing information helps to locate the point where the collision occurred. In PET applications [3], the energy information helps to verify that the photon was generated directly by the annihilation of a positron-electron pair, and the timing information helps to estimate where the annihilation took place and thus to reveal the location of the tumor. In practice, the original photons, generated from annihilation or particle collision, are absorbed by scintillators, which emit a burst of light (> 30 k light photons) [10]. SiPMs or PMTs are then used to detect these light photons.

#### I. Timing Measurement

In a SiPM readout system, the readout ASIC needs to measure the current signals of SiPM to extract the timing and energy information of the photons. The moment when the current signal crosses a predetermined threshold will be stamped as the arrival time, and the time resolution/jitter  $\sigma_t$  is defined as [11]:

$$\sigma_{\rm t} = \frac{\sigma_{\rm i}}{{\rm d}i/{\rm d}t},\tag{1.1}$$

where  $\sigma_i$  is the input RMS noise and di/dt is the signal slope at the discrimination timing point.
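As a minimal numerical sketch of Eq. (1.1), the Python snippet below evaluates the jitter for two edge slopes at the same noise level. The function name and all numerical values (2 µA RMS noise, slopes of 10 µA/ns and 40 µA/ns) are assumptions chosen only for illustration; they are not measured values from this work.

```python
def timing_jitter(sigma_i_amps, slope_amps_per_s):
    """Eq. (1.1): sigma_t = sigma_i / (di/dt), returned in seconds."""
    if slope_amps_per_s <= 0:
        raise ValueError("signal slope must be positive")
    return sigma_i_amps / slope_amps_per_s


# Hypothetical operating points (assumed, not taken from the dissertation).
sigma_i = 2e-6                      # input RMS noise: 2 uA
slow_slope = 10e-6 / 1e-9           # 10 uA per ns, in A/s
fast_slope = 40e-6 / 1e-9           # 40 uA per ns, in A/s

jitter_slow = timing_jitter(sigma_i, slow_slope)
jitter_fast = timing_jitter(sigma_i, fast_slope)

print(f"slow edge: {jitter_slow * 1e12:.0f} ps")
print(f"fast edge: {jitter_fast * 1e12:.0f} ps")
```

With these assumed numbers, an edge that is four times steeper gives a jitter that is four times smaller, which is the behavior Eq. (1.1) predicts.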

SiPMs have three major drawbacks in terms of timing performance. First, the microcells of an SiPM have a long-tailed single-photoelectron (SPE) response due to the quenching RC time constant. The SiPM output signal is the convolution of the SPE response with the time distribution of the scintillation photons arriving at the detector. The long tail of the SPE response effectively reduces the rising-edge slope (di/dt) of the SiPM output signal [7, 12], which makes the timing resolution $\sigma_t$ sensitive to noise and degrades the timing accuracy. Second, the intrinsic capacitance of an SiPM is exceptionally large; for example, the capacitance of a 3 mm × 3 mm SiPM could be higher than 300 pF [13, 14]. Because of this large capacitance, the input impedance of the front-end should be kept very low so that the input bandwidth is high enough to cover the bandwidth of the signal; otherwise, both the timing and energy resolutions will be degraded by the low-pass-filtered signal. Third, SiPMs have a high dark-count rate, which inevitably leads to large dark-count noise and contributes a non-negligible component to the total noise $\sigma_i$. In an SiPM readout system, the dark-count noise from the SiPM detector usually dominates the overall noise performance. At a typical rate, dark-count noise can cause a charge variance of a few pC in the measured signal, while the thermal noise of the circuits contributes only a few fC [15].
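The first drawback can be illustrated with a rough sketch that convolves a double-exponential SPE response with an assumed scintillation time distribution and compares the peak-normalized rising slopes. Every number here is an assumption made for illustration: the 1 ns and 200 ns SPE time constants, the single-exponential scintillation profile with a 40 ns decay, and the coarse 1 ns sampling step. The helper names are likewise hypothetical.

```python
import math

DT_NS = 1.0        # sample spacing in ns (assumed, deliberately coarse)
N_SAMPLES = 600    # length of the simulated time window


def double_exp(t_ns, tau_rise_ns, tau_fall_ns):
    """Unnormalized double-exponential pulse for t >= 0."""
    return math.exp(-t_ns / tau_fall_ns) - math.exp(-t_ns / tau_rise_ns)


def convolve_truncated(a, b):
    """Direct discrete convolution, truncated to len(a) samples."""
    n = len(a)
    out = [0.0] * n
    for i in range(n):
        for j in range(n - i):
            out[i + j] += a[i] * b[j]
    return out


def peak_normalize(x):
    """Scale a waveform so that its maximum equals 1."""
    peak = max(x)
    return [v / peak for v in x]


def max_rising_slope(x):
    """Largest sample-to-sample increase, per ns."""
    return max(x[k + 1] - x[k] for k in range(len(x) - 1)) / DT_NS


t = [k * DT_NS for k in range(N_SAMPLES)]
spe = peak_normalize([double_exp(tk, 1.0, 200.0) for tk in t])
# Assumed scintillation photon arrival profile: exponential decay, 40 ns.
scint = [math.exp(-tk / 40.0) for tk in t]
sipm = peak_normalize(convolve_truncated(scint, spe))

print(f"SPE  max slope: {max_rising_slope(spe):.3f} per ns")
print(f"SiPM max slope: {max_rising_slope(sipm):.3f} per ns")
```

Under these assumptions the convolved waveform rises much more slowly than a single SPE response, which is the slope reduction described above.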

To further evaluate the SiPM output current pulse, an electrical model [8, 16] of an SiPM microcell is shown in Fig. 1.3. $C_d$ represents the capacitance of the depletion region of the diode, which depends on the size of the microcell and the applied voltage, with a typical value ranging from 10 fF to 100 fF. $R_q$ is the integrated quenching resistor, usually between 100 k$\Omega$ and 500 k$\Omega$. $R_S$ is the series resistance of the diode when it is conducting, with a value of around 1 k$\Omega$. $V_{BD}$ stands for the breakdown voltage of the SiPM and is typically in the 30 V range. The microcell is biased with a voltage $V_{bias}$, which is slightly higher than $V_{BD}$ for Geiger-mode operation. The switch models the occurrence of an avalanche.



Fig. 1.3. Electrical model of SiPM microcell.

When no avalanche takes place, the switch is open and $V_{\text{bias}}$ appears directly across the depletion-region capacitor $C_d$. Once an avalanche is triggered by a photon or a dark-count event, the switch closes, and $C_d$ rapidly discharges to a voltage of (noting that $R_S \ll R_q$)

$$V_{\rm BD} + (V_{\rm bias} - V_{\rm BD}) \frac{R_{\rm S}}{R_{\rm S} + R_{\rm q}} \approx V_{\rm BD}, \tag{1.2}$$

with a time constant of

$$\tau_{\rm rise} = C_{\rm d} \cdot \left( R_{\rm S} \parallel R_{\rm q} \right). \tag{1.3}$$

Then the current flowing in the microcell will be equal to

$$I_{\rm OUT} = \frac{V_{\rm bias} - V_{\rm BD}}{R_{\rm S} + R_{\rm q}} = \frac{V_{\rm OV}}{R_{\rm S} + R_{\rm q}}, \tag{1.4}$$

where the overvoltage $V_{\text{OV}}$ is defined as the excess voltage applied to a microcell with reference to its breakdown voltage, i.e., $V_{\text{bias}} - V_{\text{BD}}$. Typically, $V_{\text{OV}}$ ranges from 2 to 6 V and results in a current ($I_{\text{OUT}}$) on the order of 10 $\mu$A, which is low enough for statistical fluctuations of the charge carriers to quench the avalanche and return the circuit to the idle state with the switch open. With the switch open, the depletion capacitance is recharged to $V_{\text{bias}}$ with a time constant equal to

$$\tau_{\rm fall} = C_{\rm d} \cdot R_{\rm q},\tag{1.5}$$

which is much larger than $\tau_{rise}$ because $R_S \ll R_q$. Thus, the $I_{\text{OUT}}$ of the microcell is shaped as a double-exponential pulse with $\tau_{rise}$ as the time constant of the rising edge and $\tau_{fall}$ as the time constant of the falling edge. Typically, $\tau_{rise}$ is 1 ns and $\tau_{fall}$ is around 200 ns. Consequently, as the superposition of multiple SPE responses from triggered microcells, the SiPM output current signal is also shaped as a double-exponential pulse, but with a larger rising time constant and a falling time constant, as depicted in Fig. 1.4 [12].

Fig. 1.4. Illustration of (a) SPE response; and (b) SiPM output current signal.
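The relations in Eqs. (1.2) to (1.5) can be checked numerically with a short sketch. The function name is hypothetical, and the component values are assumptions picked from within the ranges quoted above (100 fF, 300 kΩ, 1 kΩ, 3 V of overvoltage on a 30 V breakdown); they do not describe a specific device.

```python
def microcell_pulse_params(c_d, r_q, r_s, v_bias, v_bd):
    """Evaluate Eqs. (1.2)-(1.5) for the simple microcell model (SI units)."""
    v_ov = v_bias - v_bd                      # overvoltage
    r_parallel = r_s * r_q / (r_s + r_q)      # R_S || R_q
    return {
        "v_settle": v_bd + v_ov * r_s / (r_s + r_q),  # Eq. (1.2)
        "tau_rise": c_d * r_parallel,                 # Eq. (1.3)
        "i_out": v_ov / (r_s + r_q),                  # Eq. (1.4)
        "tau_fall": c_d * r_q,                        # Eq. (1.5)
    }


# Assumed example values inside the typical ranges given in the text.
params = microcell_pulse_params(
    c_d=100e-15, r_q=300e3, r_s=1e3, v_bias=33.0, v_bd=30.0
)

print(f"settled diode voltage: {params['v_settle']:.3f} V")
print(f"tau_rise: {params['tau_rise'] * 1e12:.1f} ps")
print(f"I_OUT:    {params['i_out'] * 1e6:.2f} uA")
print(f"tau_fall: {params['tau_fall'] * 1e9:.1f} ns")
```

For these assumed values the current comes out close to 10 µA and the settled voltage stays within about 10 mV of the breakdown voltage, in line with the approximations above. The time constants from this simple model are smaller than the typical figures quoted in the text, so the sketch should be read as an illustration of the formulas rather than a fit to a particular SiPM.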

As Eq. (1.1) shows, the timing resolution $\sigma_t$ is proportional to the noise $\sigma_i$ and inversely proportional to the signal slope di/dt. From a circuit design perspective, because the noise $\sigma_i$ is dominated by the dark-count noise of the SiPM detector and the circuit noise is negligible, efforts to improve the timing resolution should focus on sharpening the signal slope di/dt.

#### **II. Energy Measurement**

Energy information is used to determine whether a detected triggering event is caused by primary photons or by Compton-scattered photons. Primary photons are those generated directly by annihilation or particle collision, while Compton-scattered photons, generated by interactions between the primary photons and matter, have less energy and different directions of travel. As only the primary photons can help to locate the points where annihilation or particle collision took place, SiPM readout systems are designed to discriminate primary photons from Compton-scattered photons based on the energy of the detected photons.

In the field of radiation detection, energy resolution is defined as the ability of the detector to accurately determine the energy of the incoming particles and is expressed as a percentage of the energy of the incoming photons [17]. For example, if the energy resolution of a detector is 10% and only 500 keV photons are striking the SiPM, the readout system will "see" photons ranging from 475 keV to 525 keV.
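The 500 keV example can be reproduced with a few lines of Python. This sketch assumes that the quoted percentage is the full width of the apparent energy window, centered on the true energy, which is the interpretation the example in the text implies. The function name is hypothetical.

```python
def energy_window(energy_kev, resolution):
    """Apparent energy range for a fractional, full-width energy resolution."""
    half_width = 0.5 * resolution * energy_kev
    return energy_kev - half_width, energy_kev + half_width


low_kev, high_kev = energy_window(500.0, 0.10)
print(f"{low_kev:.0f} keV to {high_kev:.0f} keV")
```

A narrower window makes it easier to separate primary photons from Compton-scattered photons, which arrive with less energy.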

For SiPM readout systems, the energy resolution can be written as [18]:

$$(\Delta E / E)^{2} = (\delta_{\rm sc})^{2} + (\delta_{\rm p})^{2} + (\delta_{\rm st})^{2} + (\delta_{\rm n})^{2}, \tag{1.6}$$

where  $\delta_{sc}$  is the intrinsic energy resolution of scintillator,  $\delta_{p}$  is the scintillating transfer resolution,  $\delta_{st}$  is the statistical contribution of photodetector,  $\delta_{n}$  is the noise contributed by photodetector and circuits.

The energy resolution of scintillator,  $\delta_{sc}$ , which highly depends on the materials and structure, is typically around 8% [19] in PET and HEP applications.

The scintillating transfer resolution,  $\delta_p$ , is defined as the variance of the probability that a photon from the scintillator successfully arrives at the photodetector. It depends strongly on the quality of the optical coupling between the scintillator and the SiPM, and can be neglected in comparison with the other components of energy resolution [18].

The statistical contribution of SiPM,  $\delta_{st}$ , can be described as

$$\delta_{\rm st} = 2.355 \times (ENF / PHE)^{1/2}, \qquad (1.7)$$

where *ENF* is the excess noise factor and is at least 2 or larger for SiPMs, and *PHE* is the number of photoelectrons generated from the fired SiPM with a typical number of 5000 [20]. With calculation, the typical value of  $\delta_{st}$  for SiPM is higher than 4.7%.

Dark-count noise of SiPM can reach up to a few pC charges, and contributes the major part of the noise  $\delta_n$ , while the noise from circuits is negligible. In [15],  $\delta_n$  presents a value of about 4% in the worst case.

Based on the analysis above, the energy resolution of SiPM readout systems can be > 10% in theory. With a better intrinsic energy resolution of the scintillator, the energy resolution of the entire system can be even better. However, the linearity of the readout front-end for charge integration and the resolution of the ADC for integrated voltage digitization also affect the measurement accuracy of energy resolution. In practice, the measured energy resolutions are higher than 11% as shown in [15, 21, 22]. In addition, as the dark-count noise of SiPM depends strongly on the working temperature [7], a readout system with high power consumption inevitably suffers degradation of timing and energy measurement accuracy. Therefore, a low-power highly-linear SiPM front-end with a high-resolution ADC is essential for achieving good energy resolution.
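The estimate above can be reproduced numerically from Eqs. (1.6) and (1.7). The sketch below uses the typical values quoted in the text ($\delta_{sc}$ = 8%, *ENF* = 2, *PHE* = 5000, $\delta_n$ = 4%) and sets $\delta_p$ to zero, since it is stated to be negligible. The function names are hypothetical.

```python
import math

def statistical_term(enf, phe):
    """Eq. (1.7): delta_st = 2.355 * sqrt(ENF / PHE), returned as a fraction."""
    return 2.355 * math.sqrt(enf / phe)

def total_energy_resolution(delta_sc, delta_p, delta_st, delta_n):
    """Eq. (1.6): quadrature sum of the four contributions (all as fractions)."""
    return math.sqrt(delta_sc**2 + delta_p**2 + delta_st**2 + delta_n**2)

# Typical values quoted in the text; delta_p is taken as 0 (stated as negligible).
delta_st = statistical_term(enf=2.0, phe=5000.0)
total = total_energy_resolution(delta_sc=0.08, delta_p=0.0,
                                delta_st=delta_st, delta_n=0.04)
print(f"delta_st = {delta_st:.2%}, total = {total:.2%}")
```

With these inputs the statistical term evaluates to about 4.7% and the quadrature sum to slightly above 10%, dominated by the scintillator term.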

#### 1.1.3 State-of-Arts of SiPM Readout

To improve the accuracy of timing measurement, both a fast SiPM output signal and high-bandwidth front-end are required [23, 24, 25]. In [12], SensL has proposed a proprietary SiPM structure to improve the timing resolution, as shown in Fig. 1.5. By adding a small capacitor paralleled with the quenching resistor in each microcell of the



Fig. 1.5. SensL's proprietary SiPM with fast output.

SiPM sensor, a fast output is generated by high-passing the standard SiPM output. The fast output carries only a small portion of the signal charges, while it has a much shorter SPE response and a much sharper slope of output current pulse di/dt. With the proprietary SiPM, a readout system [22] achieves a timing resolution of 363 ps, a significant improvement over the designs in [15, 21], where the timing resolutions are no better than 660 ps when only the standard SiPM output signals are utilized. Nevertheless, the drawback of SensL's approach is obvious. As both the fast and standard outputs are required, the I/O pin counts of the SiPM detectors and the readout ASICs are almost doubled, which results in costly and cumbersome designs for large photon detection systems, which are often composed of thousands of SiPM detectors and readout circuits.

In conventional SiPM/PMT readout designs, power-hungry charge sensitive amplifiers (CSAs) and high-linear but low-speed Wilkinson ADCs are widely adopted for conducting the energy measurements [21], as shown in Fig. 1.6. CSAs are used for charge integration of the SiPM currents and Wilkinson analog-to-digital converters



Fig. 1.6. Conventional SiPM readout with CSA and Wilkinson ADC.



Fig. 1.7. QTC-based SiPM readout.

(ADCs) are used to digitize the integrated voltages. Such an approach unfortunately suffers from high power consumption and a long conversion period. For example, a power consumption of 15 mW per channel is reported in [21] with a conversion period of 7.4 $\mu$s, where each channel contains a dedicated CSA and a Wilkinson ADC. Besides, the high power dissipation also inevitably leads to a higher temperature, causing SiPMs to generate more dark-count noise, which can severely deteriorate the timing and energy measurement accuracy [7], especially for some HEP applications which require liquid argon temperature (~ -180 °C) [26]. In [15], a low-power current-mode charge-to-time converter (QTC) method has been developed, as shown in Fig. 1.7, which employs cascaded current mirrors for direct charge integration without a CSA and utilizes a highly linear QTC to convert the integrated voltage into another timing pulse (different from the timing pulse containing the timing information of photon triggering). This approach can extract both the timing and energy information from timing pulses with TDC-only circuits, which helps to simplify the circuit design to some extent, and measurements show it achieves a power consumption of 10 mW per channel and a conversion period of 4 $\mu$s (mainly limited by the highly linear QTC process).

#### 1.2 High-Frequency Ultrasound and Photoacoustic Readout

With the newly developed high-frequency (30 to 100 MHz) ultrasound transducers [27], such as polyvinylidene fluoride (PVDF) piezoelectric transducers, HFUS and PA imaging has been widely investigated in many clinical applications [28, 29, 30], and regarded as the next frontier in ultrasound imaging. Conventional ultrasound imaging has a limited spatial resolution due to the low operation frequencies (2 to 15 MHz), while HFUS and PA imaging, with the improved microscopic resolution, has opened new medical applications in the fields of ophthalmology, dermatology, and intravascular imaging (IVUS) and systemic sclerosis (SSC).



Fig. 1.8. Block diagram of ultrasound imaging readout system

In an ultrasound imaging readout system [31], as shown in Fig. 1.8, the reflected ultrasound echoes are first converted into analog electrical signals by the ultrasound transducer, then properly amplified and digitized by the receiving readout, and finally processed at the back-end for image reconstruction. As the bridge between the "real-world" signal and digital processing, the receiving readout consists of a low-noise amplifier (LNA) for pre-amplifying the analog electrical signals, a variable-gain amplifier (VGA) for compensating the dynamic range (DR) difference between the LNA and the ADC, an anti-aliasing filter (AAF) for removing out-of-band signals, and an ADC for digitizing the signals.

#### **1.2.1 Piezoelectric Ultrasound Transducer Background**

A piezoelectric transducer, as one of the most typical types of ultrasound sensor, is a device that can measure changes in pressure, strain, or force by converting them into electrical signals with its intrinsic piezoelectricity. The basic theory of piezoelectricity is based on the electrical dipoles of dielectric crystals [32]. At the molecular level, a piezoelectric crystal typically has an ionic bonded structure. In rest mode, the electrical dipoles formed by the positive and negative ions cancel each other due to the symmetry of the crystal structure, and no electric field is formed. When stressed, the piezoelectric crystal is slightly deformed and the symmetry is lost, leading to the generation of a net dipole moment and an electric field across the crystal. Consequently, electrical charges are generated on the surfaces of the crystal and are proportional to the pressure applied. With a reciprocating force applied onto the transducer, an AC voltage can be measured across the terminals of the device.

Since the late 1940s, ultrasound imaging with piezoelectric transducers has been widely used as a diagnostic imaging technique for anatomic visualization and assessment of cardiac function [33]. The high strain sensitivity and low triggering threshold make piezoelectric materials a much better choice for sensing low-energy ultrasonic waves compared with resistive and capacitive materials. It should be noted that although the electrical charges/voltages are generated due to compression, piezoelectric sensors show almost zero deformation. Benefiting from their ruggedness, piezoelectric sensors typically have excellent linearity over a wide dynamic range. Besides, piezoelectric materials are insensitive to electromagnetic fields and radiation, enabling ultrasound imaging measurements under harsh conditions.

Conventionally, ultrasound imaging works under 20 MHz and has a relatively low spatial resolution, and is mainly used for surface organ scanning, such as liver and thyroid. With the exploration of higher spatial resolution, research for high-frequency



Fig. 1.9. Electrical model of piezoelectric sensor.

ultrasound imaging has been actively conducted in the applications ranging from imaging the eye and skin to small animal imaging [27].

To meet the demand of high-frequency ultrasound transducers, multiple piezoelectric materials have been investigated, such as the widely used ceramic material lead zirconate titanate (PZT) and the well-known polymeric material PVDF [27]. PZT transducers can provide higher sensitivity and higher sonic velocity, while PVDF transducers have unique advantages over other materials including broader bandwidth, lower cost, and better acoustic impedance matching to tissue, which make PVDF transducers a better choice for high-frequency ultrasound imaging.

The electrical model of a piezoelectric sensor is shown in Fig. 1.9 [34, 35]. The voltage source V is directly proportional to the applied pressure/strain. The capacitor  $C_e$  is inversely proportional to the transducer elasticity, while  $L_m$  represents the seismic mass and inertia of the sensor itself. The capacitor  $C_0$  is the static capacitance of the transducer, which results from an inertial mass of infinite size.  $R_i$  is the insulation resistance (resistance between signal and ground) of the transducer element itself, and the value of  $R_i$  in dry and clean conditions is typically exceptionally large (>  $10^{12} \Omega$ ) and can be neglected in the equivalent circuit. Thus the output impedance of the piezoelectric sensor is typically determined by  $C_e$ ,  $L_m$ , and  $C_0$ , and varies with the working frequency.

For a thin-film PVDF transducer, benefiting from its good mechanical flexibility and low acoustic impedance [27], the output impedance can be easily controlled by shaping the size and thickness. In Chapter IV, a PVDF transducer with 50  $\Omega$  output impedance over 30 MHz to 120 MHz is adopted for the high-frequency ultrasound and photoacoustic readout design.

#### **1.2.2 HFUS and PA Readout Design Challenges**

As the very first block of the front-end, the LNA dominates the performance of impedance matching, noise, and bandwidth, and is regarded as the most critical block in the readout ASIC. Therefore, great focus has been put on the design challenges of wideband low-noise LNAs in this work. To support HFUS and PA imaging, the LNA needs to achieve low noise and high bandwidth simultaneously. Conventionally, the LNA is designed as either a charge-sensitive amplifier (CSA) or a voltage amplifier. Although the CSA provides low-noise performance [36, 37], the feedback loop formed by the bleeding resistor and the feedback capacitor significantly limits the achievable bandwidth. The voltage amplifier in [38] achieves large bandwidth and wideband impedance matching; however, the noise performance is poor due to the fixed transconductance  $(g_{m1})$  of the input transistor for impedance matching. To improve noise performance of

voltage-mode amplifiers, noise-canceling (NC) techniques [39, 40, 41] have been recently explored. Reference [40] proposes an LNA utilizing a combination of a common-gate (CG) amplifier and a common-source (CS) amplifier for noise cancellation. Nonetheless, the pseudo-differential structure is prone to gain mismatch between the CG and CS gain stages. Based on a resistive shunt-feedback amplifier structure, our recent work [41] also demonstrates a noise-canceling wideband LNA, where a common-source auxiliary amplifier is employed to generate an in-phase signal and an out-of-phase noise with respect to those of the main amplifier for noise cancellation.

As ultrasound signals are normally single-ended, the LNAs are limited to single-ended structures, which inherently suffer from even-order nonlinearity compared with their fully differential counterparts. A bulky transformer or a noisy single-to-differential conversion circuit implemented before the LNA can help to improve the linearity but significantly degrades the NF and sensitivity [42]. The approach to LNA linearity improvement for HFUS and PA imaging should be simple and without noise penalty.

In addition, LNAs in conventional ultrasound receivers normally have high power consumption for suppressing the thermal noise, which is not suitable for the trend toward portable designs. Moreover, in beamforming applications where multiple channels of LNAs are integrated into a single chip, the heat generated by the LNAs is a non-negligible factor affecting the chip reliability. Therefore, low-power LNAs are highly desirable for ultrasound imaging systems.

#### 1.2.3 State-of-Arts of LNA for HFUS and PA Imaging Readout

In [39, 40, 41], open-loop voltage mode amplifiers employing noise canceling techniques have been developed to achieve both wideband input matching and good noise performance. Reference [40] proposes an LNA structure utilizing a combination of a common gate amplifier (CG) with a common source amplifier (CS) for noise canceling (NC), as shown in Fig. 1.10, and achieves 2.98 dB NF. Nonetheless, this structure is prone to gain mismatch between the CG and CS stages. In [41], a shunt-feedback feedforward NC LNA is presented. As depicted in Fig. 1.11, the LNA utilizes a shunt-feedback main amplifier for wideband impedance matching and a common-source auxiliary amplifier that generates an in-phase signal and an out-of-phase noise with respect to that of the main amplifier for noise cancellation. The LNA achieves a NF of less than 1.9 dB and an  $S_{11}$  better than -15 dB from 30 MHz to 150 MHz with a power consumption of 18 mW.



Fig. 1.10. CG-CS NC LNA.



Fig. 1.11. Resistive shunt-feedback feedforward NC LNA.

To suppress the even-order harmonics of single-ended LNAs, several approaches have been proposed. Reference [43] proposes to use a bulky LC resonator as a second-order bandpass filter to reject the second-order harmonic. Reference [44] presented a Bipolar-CMOS (BiCMOS) LNA utilizing a constant  $g_m$  structure for the second-order intermodulation distortion (IM<sub>2</sub>) cancellation. The constant  $g_m$  structure, however, requires precisely matched emitters. Reference [42] proposes a simple yet robust complementary CMOS structure to cancel the even-order harmonics and is more suitable for CMOS LNA implementations.

To reduce the power consumption of NC LNAs, a step-up balun with two secondary turns to reduce the transconductance requirement for impedance matching is developed in [45]. Combining the balun with a current reuse structure, the LNA only consumes 475  $\mu$ W of power from a 0.7 V supply. The balun, however, is not suitable for wideband ultrasound receiver applications and occupies a large amount of die area. The current reuse structure with low-supply voltage [46], on the other hand, can be a more applicable and compact solution for high-frequency wideband ultrasound imaging applications.

#### **1.3 Main Contributions**

Three readout ASICs are presented in this dissertation. The first prototype is a low-power single-channel SiPM readout ASIC featuring an on-chip high-pass filter-based fast signal generator with a current-mode front-end and a customized SAR ADC [47]. The on-chip high-pass filter sharpens the rising edge of the standard SiPM signal, facilitating fast timing measurement and reducing the noise-induced timing jitter while avoiding the area, cost, and power consumption penalties of an off-chip fast signal generator approach. Measurement results show that a 15 ps improvement in timing resolution is achieved by the on-chip HPF. Employing the SiPM charge integrator as the ADC track-and-hold sampler, the customized SAR ADC reduces the power consumption by 9% and achieves an SNDR of 53.08 dB and an SFDR of 62.74 dB at 1 MS/s. Additionally, a current-mode readout front-end with a current-feedback low-input impedance current buffer is developed. The readout ASIC achieves a timing resolution of 151 ps and a maximum gain nonlinearity of 3.3% over an input charge range of 800 pC, while dissipating 4.02 mW of power from a 1.8 V supply.

The second prototype is a low-power highly-linear multi-channel data acquisition (DAQ) system for SiPM readout [48]. A low-input impedance current-mode front-end with a programmable current gain is developed to achieve high-precision charge readout over a large dynamic range. An integrated 10-bit successive-approximation-register (SAR) analog-to-digital converter (ADC) shared by 16 readout channels in a time-multiplexed manner is designed to reduce the overall chip area and power consumption of the SiPM readout. Implementation challenges of field-programmable gate array (FPGA)-based time-to-digital converters (TDCs), including bubble error, zero length bins, inter-clock region nonlinearity, and chain overflow, are addressed for high accuracy timing measurement. Multi-chain averaging is developed to improve the TDC linearity. Fabricated in a 180 nm CMOS process, the current-mode 16-channel readout application-specific integrated circuit (ASIC) achieves 3.6% maximum gain nonlinearity over an input charge range of 800 pC while dissipating 3.89 mW of power per readout channel, and the ADC achieves an SFDR of 58.34 dB and an SNDR of 51.37 dB at 16 MS/s. Implemented in a Xilinx 28 nm Kintex-7 FPGA, the 32-channel TDC achieves a 15 ps root mean square (RMS) resolution, a differential nonlinearity (DNL) of less than 4 ps, and an integral nonlinearity (INL) of less than 10 ps.

The third prototype is a wideband low-noise amplifier (LNA) front-end [49] with noise and distortion cancellation for high-frequency ultrasound (HFUS) and photoacoustic (PA) imaging applications. The LNA employs a resistive shunt-feedback structure with a feedforward noise-canceling technique to accomplish both wideband impedance matching and low-noise performance. A complementary CMOS topology is also developed to cancel the second-order harmonic distortion, enhancing the amplifier linearity. A HFUS and PA imaging front-end including the proposed LNA and a variable gain amplifier (VGA) is designed and fabricated in a 180 nm CMOS process. At 80 MHz, the front-end achieves an input-referred voltage noise density of 1.36 nV/$\sqrt{\text{Hz}}$, an input return loss ( $S_{11}$ ) better than -16 dB, a voltage gain of 37 dB, and a total-harmonic distortion (THD) of -55 dBc while dissipating a power of 37 mW, leading to a noise efficiency factor (NEF) of 2.66.

#### **1.4 Dissertation Organization**

This dissertation is organized as follows:

Chapter II elaborates on the implementation of the low-power single-channel SiPM front-end with on-chip fast pulse generation and customized 10-bit SAR ADC. The architecture of the readout ASIC with working principle is firstly presented. Further, the detailed implementation of the proposed on-chip fast pulse generation, customized 10-bit SAR ADC, and other key building blocks are shown. The measured readout performance is given at the end of the chapter.

Chapter III presents a low-power and highly-linear DAQ system for SiPM readout featuring a 16-channel readout ASIC and a 32-channel FPGA-based TDC. The architecture of the DAQ system is introduced first. Then the design of the 16 readout channels sharing a single SAR ADC and the FPGA-based TDC are demonstrated. Finally, the measured results of the readout ASIC and the FPGA-based TDC are presented and compared with previously published works.

Chapter IV details the design and implementation of the wideband low-power low-noise front-end for high-frequency ultrasound imaging. First, the concepts of the shunt-feedback LNA with feedforward noise canceling, the complementary structure, and the current reuse technique are described. Then, the overall architecture of the front-end along with detailed implementations of the key building blocks are presented. Finally, measurement results are shown and compared with state-of-the-art publications.

Chapter V concludes this dissertation and discusses future directions.

## **CHAPTER II**

# SINGLE-CHANNEL SIPM READOUT ASIC WITH ON-CHIP FAST PULSE GENERATION AND CUSTOMIZED SAR ADC

This chapter presents an on-chip fast signal generation approach and a customized successive-approximation-register (SAR) analog-to-digital converter (ADC) for silicon photomultiplier (SiPM) readout [47]. The fast-signal generator utilizes the equivalent resistance of the current mirror and the capacitor between the current mirror and the current discriminator to form a high-pass filter (HPF). It sharpens the rising edge of the standard SiPM signal, allowing fast timing measurement and reducing dark count noise-induced timing jitter. The on-chip approach also eliminates the packaging pins needed for carrying the fast signals, facilitating compact and low-cost readout solutions for SiPM array-based applications. The customized ADC reuses the SiPM charge integrator as the ADC track-and-hold (T/H) circuit to lower the ADC power consumption by 9%. A current-mode readout application-specific integrated circuit (ASIC) with the proposed fast signal generator and the customized ADC is designed in a 180 nm CMOS process. Measurement results show that the on-chip HPF provides a 15 ps improvement in timing resolution, and the readout ASIC achieves a gain non-linearity of less than 3.3% over 800 pC input charge range and a timing resolution of 151 ps, while dissipating 4.02 mW of power from a 1.8 V supply.

## 2.1 Top Level Architecture

Fig. 2.1 shows the block and timing diagrams of the SiPM readout ASIC with the proposed fast signal generation and the customized SAR ADC. The readout ASIC is mainly composed of a current-mode input stage, an on-chip high-pass filter (HPF) for





Fig. 2.1. (a) Simplified block diagram of the readout front-end with the on-chip fast signal generation and customized SAR ADC; and (b) its timing diagram.

fast signal generation, a charge integrator, a control logic with a current discriminator and a voltage comparator, and the customized SAR ADC. The input stage consists of a low-impedance current buffer with current feedback to provide a large input bandwidth and a multi-branch cascode current mirror with programmable currents to accommodate a large input dynamic range. The HPF (detailed in Fig. 2.2) is formed by the equivalent resistance of one branch of the current mirror transistor and the capacitor between the current-mirror branch and the current discriminator. The SAR ADC reuses the SiPM charge integrator as its T/H sampling circuit.

In the timing diagram, a standard SiPM current signal (blue) happens at time  $T_0$ . It is duplicated by the current buffer and sent to the HPF for fast signal generation. The generated fast signal (red) is then used by the current discriminator for event detection and  $T_1$  indicates the event arrival time. The output of the current discriminator generates a timing stamp pulse which is used for timing measurement and activates the switch  $S_1$ to start the charge integration process onto the integration capacitor  $C_Q$ . After completing the charge integration with a time period of  $\Delta T$ , the switch  $S_1$  is turned off. Meanwhile, the voltage comparator is enabled at time  $T_2$ . The voltage comparator compares the integrated voltage on  $C_Q$  with a predefined threshold voltage  $V_{TH}$  to decide whether the event is a true event or a false event caused by dark-count noise. If it is determined as a false event,  $C_Q$  will be discharged immediately. Otherwise, a ready flag signal at  $T_3$  is generated which triggers the switch  $S_2$  to pass the integrated voltage on  $C_Q$  to the customized SAR ADC to start the digitization process.

## 2.2 Circuit Design of Key Building Blocks

#### 2.2.1 Input Stage

As a SiPM sensor consists of thousands of parallel-connected microcells, the output impedance of the SiPM detector is finite, while the output capacitance is typically several tens to hundreds of pF [13, 14]. To prevent the loss of the SiPM current pulse and, most importantly, to avoid further slowing down the rising edge of the pulse and worsening the timing measurement, a low input impedance buffer is essential for the front-end design. Compared with voltage-mode buffers, current-mode circuits can provide lower input impedance and higher bandwidth [15].

In Fig. 2.2, the schematics of the current buffer with HPF is depicted. By inserting a common gate NMOS between the input node and the current mirror, and



Fig. 2.2. Input current buffer with on-chip HPF.

adopting current feedback [50] through  $M_{P2}$ , the input impedance  $(1/g_{m, N1})$  of the current buffer can be scaled down by a factor of *N*. In the design, *N* is selected as 40, and helps to achieve an impedance reduction of 40 times. As shown in Fig. 2.3, with the current feedback the input impedance  $Z_{IN}$  is reduced from 913  $\Omega$  to 23  $\Omega$ . The  $f_{-3 \text{ dB}}$  bandwidth of the input stage is improved from 2.9 MHz to 115.4 MHz when having an input capacitance of 60 pF from the SiPM device [13].
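The reported impedance and bandwidth figures are consistent with a simple single-pole estimate, sketched below. It assumes that the input bandwidth is set only by the buffer input resistance and the 60 pF SiPM capacitance, i.e. $f_{-3\,\mathrm{dB}} = 1/(2\pi Z_{IN} C)$, which is a first-order approximation and not the simulated response of Fig. 2.3. The function and constant names are hypothetical.

```python
import math

def rc_bandwidth_hz(r_ohm, c_farad):
    """-3 dB frequency of a single-pole RC network: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

C_SIPM = 60e-12      # SiPM capacitance quoted in the text
Z_NO_FB = 913.0      # input impedance without current feedback (ohms)
N = 40               # current feedback scaling factor

z_with_fb = Z_NO_FB / N   # ideal 1/N scaling, close to the reported 23 ohms

for label, z in (("without feedback", Z_NO_FB), ("with feedback", z_with_fb)):
    print(f"{label}: Z_IN = {z:.1f} ohm, f_-3dB = {rc_bandwidth_hz(z, C_SIPM) / 1e6:.1f} MHz")
```

The ideal 1/N scaling gives an input impedance slightly below the reported 23 Ω, so the estimated bandwidth lands slightly above the reported 115.4 MHz.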

With a 1:1 scaling factor, a copy of the input signal is then output to the on-chip HPF to generate the fast signal pulse  $I_{OUT1}$ . Depending on the amount of the input charge, a scaled current pulse  $I_{OUT2}$  will be selected from branch  $K_1$  or  $K_2$  for charge integration once the current discriminator is triggered. In the design, the target input charge range is from 20 pC to 800 pC, and  $K_1 = N/50$  for 20 pC to 200 pC range, and  $K_2 = N/200$  for 200 pC to 800 pC range. It should be noted that both the bandwidths of



Fig. 2.3. Simulated input impedance reduction and bandwidth improvement of the input stage.

the interface between SiPM and current buffer and the output node of the current buffer should be high enough to prevent signal loss. In this design, the cutoff frequencies at the two nodes are set to > 10 MHz to suit PET applications, and the corresponding AC responses are also depicted in Fig. 2.3.

## 2.2.2 On-Chip HPF

The fast signal is generated by the on-chip high-pass filter implemented by utilizing the equivalent resistance of one current mirror branch (: N) and the capacitor between the mirror branch and the current discriminator, as shown in Fig. 2.2. As shown in Fig. 2.4, the current mirror branch (: N) has an averaged equivalent resistance dV/dI



Fig. 2.4. The simulated equivalent impedance of the on-chip HPF.

of 8.2 k$\Omega$  during the triggering process. The capacitor between the current mirror branch and the current discriminator is implemented with a 3-bit binary-weighted Metal-Insulator-Metal (MIM) capacitor bank. The LSB capacitance of the capacitor bank is 1 pF, so the capacitor bank has a maximum capacitance of 7 pF and occupies a total area of 29.6  $\mu$m × 296  $\mu$m. With a first-order cutoff frequency of $1/(2\pi RC)$, the HPF cutoff frequency is thus programmable from 2.8 MHz to 19.4 MHz.
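The programmable cutoff range can be checked with a short calculation. The sketch assumes a first-order HPF with $f_c = 1/(2\pi R C)$, uses the averaged 8.2 kΩ equivalent resistance and the 1 pF LSB from the text, and treats the 3-bit control code as a direct multiplier of the LSB capacitance. The code-to-capacitance mapping and the exclusion of code 0 are assumptions made for illustration.

```python
import math

R_EQ_OHM = 8.2e3   # averaged equivalent resistance of the mirror branch
C_LSB_F = 1e-12    # LSB capacitance of the 3-bit MIM capacitor bank

def hpf_cutoff_hz(code):
    """First-order HPF cutoff for a 3-bit capacitor code (1..7, assumed mapping)."""
    if not 1 <= code <= 7:
        raise ValueError("code must be in 1..7 (code 0 would leave no capacitance)")
    capacitance = code * C_LSB_F
    return 1.0 / (2.0 * math.pi * R_EQ_OHM * capacitance)

for code in range(1, 8):
    print(f"code {code}: C = {code} pF, f_c = {hpf_cutoff_hz(code) / 1e6:.1f} MHz")
```

The two end points of the loop (7 pF and 1 pF) correspond to the 2.8 MHz and 19.4 MHz limits quoted in the text.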

Although a smaller capacitance can further increase the cutoff frequency of the HPF, it causes the generated fast signal to have a smaller amplitude, demanding a higher sensitivity of the current discriminator. Besides, although a common-gate current buffer could be applied to provide more flexibility in the resistor design of the HPF, it comes with extra power consumption. The proposed HPF design provides a good tradeoff between the timing performance improvement and the power consumption of the readout ASIC.

#### 2.2.3 Current Discriminator

A current discriminator with enhanced stability [22] is developed, as shown in Fig. 2.5. When the input current signal from the HPF exceeds the current threshold  $I_{TH}$ , a positive net current is injected into node *X*. Due to the positive feedback introduced by the latch implemented by the cross-coupled CMOS transistors, node *Y* will be pulled down to the ground level. Consequently, the output of the discriminator will be flipped. To improve the stability of the current discriminator, a hysteresis current  $I_{HYS}$  is introduced. When the discriminator output is pulled up to the power supply level, the NMOS switch  $M_{NS}$  will be turned on to source the hysteresis current  $I_{HYS}$ , forming positive feedback to improve the transient stability. With a similar analysis, as the input



Fig. 2.5. Current discriminator with enhanced stability.

current falls below the threshold value, a negative net current is injected into the circuit, discharging node *X* to the ground level. Consequently, the discriminator output becomes low and turns off the NMOS switch  $M_{\rm NS}$ . The  $I_{\rm TH}$  is implemented with a 3-bit binary weighted current bank with the unit current being 4  $\mu$ A and the corresponding transistor size is 1  $\mu$ m / 0.5  $\mu$ m. The value of  $I_{\rm HYS}$  is half of the threshold current  $I_{\rm TH}$ .
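The effect of the hysteresis current can be illustrated with a purely behavioral model. The sketch below assumes that, once the output is high, the sourced $I_{\rm HYS}$ lowers the effective falling trip point to $I_{\rm TH} - I_{\rm HYS}$; this Schmitt-trigger-like reading is an interpretation of the description above, not a transistor-level result. The function names and the sample currents are hypothetical.

```python
UNIT_CURRENT_A = 4e-6  # unit current of the 3-bit binary-weighted threshold bank

def threshold_current(code):
    """Threshold current for a 3-bit code (0..7), in amperes."""
    if not 0 <= code <= 7:
        raise ValueError("code must be a 3-bit value (0..7)")
    return code * UNIT_CURRENT_A

def discriminate(samples_a, i_th_a):
    """Behavioral discriminator with hysteresis I_HYS = I_TH / 2.

    Assumed behavior: the output goes high when the input exceeds I_TH and
    returns low only after the input drops below I_TH - I_HYS.
    """
    i_hys_a = i_th_a / 2.0
    output_high = False
    outputs = []
    for i_in in samples_a:
        trip_point = i_th_a - i_hys_a if output_high else i_th_a
        output_high = i_in > trip_point
        outputs.append(output_high)
    return outputs

# Hypothetical input samples around an 8 uA threshold (code 2).
i_th = threshold_current(2)
print(discriminate([0.0, 9e-6, 6e-6, 3e-6, 6e-6], i_th))
```

In this model a 6 µA sample keeps the output high when it follows a trigger but does not trigger the discriminator on its own, which is the intended suppression of chattering near the threshold.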

#### 2.2.4 Detection and Integration Control Logic

A specific control logic is designed to control the integration process and judge the triggering event as true photon events or dark-count events, and the simplified block diagram of the control logic is shown in Fig. 2.6.

Once the current discriminator detects the SiPM current pulse and its output ( $CD_{OUT}$ ) flips from ground to  $V_{DD}$, a rising-edge-triggered DFF will generate an enable signal ( $EN_{INT}$ ) to start the charge integration process, which also triggers the



Fig. 2.6. Simplified block diagram of event detection and integration control logic.

generation of the timing stamp pulse. To control the charge integration time period, a programmable integration time control is implemented. In the idle state, the capacitor  $C_{\rm ITC}$ will be charged to  $V_{\rm DD}$  through current source  $I_{\rm CH}$, and the output  $INT_{\rm RST}$  will be at ground. When the enable signal  $EN_{\rm INT}$  changes to high to start the integration process,  $C_{\rm ITC}$  will be discharged by the tunable current source  $I_{\rm DIS}$. Once the decreasing voltage on  $C_{\rm ITC}$  is lower than the tripping point ( $V_{\rm DD}/2$ ) of the inverter, the output  $INT_{\rm RST}$  will turn to  $V_{\rm DD}$  to reset the enable signal  $EN_{\rm INT}$  and stop the integration process; thus the discharging time of  $C_{\rm ITC}$  decides the integration time of SiPM charges. To accommodate different widths of SiPM signals, the amount of the charge integration time is set by a programmable integration time control (ITC) circuit, where the charge integration time is

$$T_{\rm INT} = (C_{\rm ITC} \times V_{\rm DD}/2) / I_{\rm DIS}, \qquad (2.1)$$

where  $I_{\text{DIS}}$  is tunable from 2  $\mu$A to 10  $\mu$A,  $C_{\text{ITC}}$  is designed as 3.2 pF, and  $V_{\text{DD}}/2$  is 0.9 V. The integration time  $T_{\text{INT}}$  is thus tunable from 288 ns to 1.44  $\mu$s.
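The tuning range follows directly from Eq. (2.1), as the short calculation below shows. It uses the component values given in the text; the function name and default arguments are hypothetical.

```python
def integration_time_s(i_dis_a, c_itc_f=3.2e-12, v_dd_v=1.8):
    """Eq. (2.1): T_INT = (C_ITC * V_DD / 2) / I_DIS, in seconds."""
    if i_dis_a <= 0:
        raise ValueError("discharge current must be positive")
    return c_itc_f * (v_dd_v / 2.0) / i_dis_a

for i_dis_ua in (2, 4, 6, 8, 10):
    t_int_ns = integration_time_s(i_dis_ua * 1e-6) * 1e9
    print(f"I_DIS = {i_dis_ua} uA -> T_INT = {t_int_ns:.0f} ns")
```

The smallest and largest discharge currents reproduce the 1.44 µs and 288 ns end points stated above.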

With the falling edge of  $EN_{INT}$ , the voltage comparator will be enabled by  $CMP_{RDY}$  to start the comparison between the integrated voltage  $V_{INT}$  and the pre-set voltage threshold  $V_{TH}$ . If  $V_{INT}$  is lower than  $V_{TH}$ , the detected event will be judged as a false event and the integration capacitor  $C_Q$  will be discharged immediately (for simplicity, this part is not shown in Fig. 2.6). Otherwise, the ready flag signal ( $EN_{SAR}$ ), being synchronized with the SAR clock, will enable the SAR ADC to start the digitization of  $V_{INT}$ , and reset the corresponding DFF to flip down the timing stamp pulse to enable the off-chip FPGA-based TDC to start the timing measurement of pulse arriving time. The adopted voltage comparator shares the same design as in the SAR ADC and is detailed in section 2.3.4.

### 2.3 Customized SAR ADC for SiPM Readout

#### 2.3.1 Introduction of SAR ADC

Benefiting from their simple architecture, successive-approximation-register (SAR) analog-to-digital converters (ADCs) feature medium resolution (8-12 bits), moderately high sampling rates (up to 100 MS/s), and relatively low power consumption. As no op-amp is needed for residue amplification, SAR ADCs consist mainly of digital circuits, which makes them advantageous in CMOS technology scaling. However, due to the binary-search algorithm, it is hard for single-channel SAR ADCs to achieve sample rates > 500 MS/s with a good figure of merit (FOM). Given these



Fig. 2.7. Basic architecture of an N-bit SAR ADC.

properties, SAR ADCs are often the best choice for battery-powered mobile applications which need only medium resolution (8-12 bits) and medium speed (1-100 MS/s) but require low power consumption and small interleaving factors. SAR ADCs are widely employed in low-energy radios (Bluetooth for body-area networks), in autonomous portable sensor systems, and in many biomedical applications [51, 52], such as electrocardiogram (ECG) and electroencephalogram (EEG) recording and pacemakers. Fig. 2.7 shows the basic architecture of an *N*-bit SAR ADC, which consists mainly of four parts: a sample-and-hold (S/H) circuit, a voltage comparator, an *N*-bit capacitive digital-to-analog converter (CDAC), and the digital control logic block. (The S/H, comparator, and CDAC are differential structures and are depicted as single-ended for simplicity.)

The working principle of a conventional SAR ADC is depicted in Fig. 2.8. First, the S/H circuit samples and holds the input analog signal,  $V_{IN}$ , at the comparator input node. In the meantime, to implement the binary search process, the *N*-bit CDAC output  $(V_{\text{CDAC}})$  is pre-set to half of the reference voltage  $(V_{\text{REF}})$  by setting an *N*-bit code register in the SAR control logic block to midscale, *N*'b100...00, where the MSB is logic 1, as the startup state. Then the comparator performs the comparison between its inputs,  $V_{\text{INP}}$  and  $V_{\text{INN}}$ . If  $V_{\text{INP}}$  is greater than  $V_{\text{INN}}$ , the comparator output is a logic high and the MSB of the *N*-bit code register remains at logic 1; if  $V_{\text{INP}}$  is less than  $V_{\text{INN}}$ , the comparator output is a logic low and the MSB of the register is cleared to logic 0. Based on the updated state of the register, the SAR logic then moves to the next bit cycle, forces that bit high, and does another comparison. The sequence continues all the way down to the LSB. Once the



Fig. 2.8. Flow graph of conventional SAR ADC.



Fig. 2.9. Proposed customized 10-bit SAR ADC for SiPM readout.

*N* comparisons are done, the input analog signal has been converted into an *N*-bit digital word, which is available in the code register.
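The binary search described above can be sketched behaviorally as follows. This is an idealized single-ended model (perfect CDAC, noiseless and offset-free comparator), not the circuit implementation; the function name and the convention that the CDAC output equals `v_ref * code / 2**n_bits` are assumptions made for illustration.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Idealized N-bit successive-approximation conversion of v_in."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):       # MSB first, down to LSB
        trial = code | (1 << bit)               # force the current bit high
        v_cdac = v_ref * trial / (1 << n_bits)  # ideal CDAC output
        if v_in >= v_cdac:                      # comparator high: keep the bit
            code = trial
        # comparator low: the bit stays cleared
    return code


# Example: a 10-bit conversion takes exactly 10 comparisons.
print(sar_convert(0.6, 1.0, 10))
```

The first iteration tests the midscale code (MSB only), matching the startup state described in the text, and each later iteration halves the remaining search interval.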

#### 2.3.2 Proposed Customized SAR ADC

Fig. 2.9 shows the architecture of the 10-bit customized SAR ADC, which consists of a voltage comparator, a capacitive DAC (CDAC), and synchronous SAR control logic. The ADC is designed to directly utilize the SiPM charge integrator as its T/H sampler, eliminating the need for a dedicated T/H circuit. This avoids the T/H-induced kT/C thermal noise and nonlinear distortion, which helps improve the ADC effective resolution and reduces the power consumption by ~9%.

As shown in Fig. 2.9, the integrated voltage  $V_{INT}$  is applied to the positive input of the comparator, while the CDAC voltage  $V_{CDAC}$  is applied to the negative input node.  $V_{CDAC}$  will be gradually approaching  $V_{INT}$  through the 10-bit binary search cycles. The conversion mechanism of the SAR ADC is depicted in Fig. 2.10. As  $V_{INT}$  can vary with different photon events, a reset phase is thus implemented to reset the comparator inputs to  $V_{CM}$  before the start of the quantization. The reset phase is conducted in parallel with the charge integration process. Thus, it will not occupy the SAR-conversion timing budget.

In a conventional SAR ADC design,  $V_{\text{CDAC}}$  (the output voltage of the CDAC) and  $V_{\text{INT}}$  (the integrated voltage from  $C_{\text{Q}}$ ) should approach the common mode of the input signal until their difference is less than one LSB at the end of the conversion. Thus, only the sampling phase and the switching phase are needed in SAR ADC conversions. However, due to the single-ended structure, only  $V_{\text{CDAC}}$  can be switched in this design, and  $V_{\text{CDAC}}$  gradually approaches  $V_{\text{INT}}$  over the switching phase. Therefore, both  $V_{\text{CDAC}}$  and  $V_{\text{INT}}$  will equal the integrated voltage of the present channel at the end of the conversion. To mitigate the residual influence between



Fig. 2.10. Conversion mechanism of the customized SAR ADC.

measurements, an extra reset phase is needed before the sampling phase for  $V_{INT}$ . Noticing that  $V_{CDAC}$  will not be connected to  $C_Q$  for sampling, the sampling phase could also be used as the reset time for  $V_{CDAC}$  to relax the settling requirement for resetting the CDAC.

#### 2.3.3 CDAC

Nonlinearity of the CDAC directly affects the spurious-free dynamic range (SFDR) of the SAR ADC, and mainly arises from three sources: capacitor mismatch (gradients and random variations), capacitor nonlinearity (capacitor voltage dependence), and the nonlinearity of the junction capacitance of a switch connected to a capacitor. Random variations in the capacitor array arise from process-dependent dimension (W, L) and oxide thickness ( $t_{ox}$ ) mismatch. It should be noted that integral linearity improves as the number of capacitors in the array increases because random errors tend to average out.



Fig. 2.11. Monte Carlo Simulation results for the unit MIM capacitor of CDAC.

To improve the matching between large capacitors, many techniques have been developed, including common-centroid layout geometries and multi-stage capacitor networks (to reduce the array size).

The unit capacitor size is chosen to meet the linearity specification. The expected worst-case linearity error occurs at the MSB transition, and the unit capacitance of a *B*-bit single-ended SAR ADC should meet the following requirement [53] to keep the error below half an LSB in the 3-sigma case:

$$3 \times \frac{\sqrt{2^{\mathrm{B}} \times \delta^2}}{C_{\mathrm{unit}}} < \frac{1}{2}, \tag{2.2}$$

where  $C_{\text{unit}}$  is the unit capacitance of the CDAC, and  $\delta$  is the standard deviation of  $C_{\text{unit}}$ .

For the proposed 10-bit single-ended SAR ADC, the required  $\delta/C_{unit}$  should be less than 0.52%. In the adopted TSMC 180 nm PDK, the minimum size of the applicable metal-insulator-metal (MIM) capacitor is 4 µm × 4 µm (33.2 fF), with a  $\delta/C_{unit}$  of 0.22%, as shown in Fig. 2.11. This is well below the required 0.52% and thus meets the 10-bit linearity specification.
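The 0.52% bound follows directly from rearranging Eq. (2.2) for $\delta/C_{unit}$. A minimal sketch of that arithmetic, with illustrative function and parameter names, is given below.

```python
import math


def max_unit_mismatch(n_bits, sigma_count=3.0, error_budget_lsb=0.5):
    """Largest delta/C_unit satisfying Eq. (2.2) for a B-bit single-ended CDAC."""
    return error_budget_lsb / (sigma_count * math.sqrt(2 ** n_bits))


limit = max_unit_mismatch(10)   # 0.5 / (3 * 32)
simulated = 0.0022              # 0.22% from the Monte Carlo result in Fig. 2.11

print(f"required delta/C_unit < {limit * 100:.2f}%")
print("unit capacitor meets the bound:", simulated < limit)
```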

#### 2.3.4 Two-Stage Dynamic Comparator

Strong-Arm based dynamic comparators are widely used in low-power ADC designs [54]. Fig. 2.12 illustrates a simple example of the Strong-Arm based dynamic comparator. In the reset phase, when the clock is low, the reset transistors ( $M_7$ ,  $M_8$ ) are on, and the output nodes are charged to the supply voltage. The tail transistor  $M_9$  is off, so no supply current flows through the differential amplifier during the reset phase. In the regeneration phase, when the clock is high, the reset transistors are off, and the tail transistor is turned on.



Fig. 2.12. Strong-Arm based dynamic comparator.

The cross-coupled inverters receive different amounts of current depending on the input voltage and start to regenerate the output, and the drain voltages of the input transistors ( $M_1$  and  $M_2$ ) are discharged (from  $V_{DD}$  to GND) with different slew rates depending on their gate voltages. Once the drain voltage (of either  $M_1$  or  $M_2$ ) drops below ( $V_{DD} - V_{TH}$ ), the NMOS of the corresponding inverter is turned on and the appropriate output node starts to discharge. Since the output node is shorted to the input of the other inverter, the PMOS of that inverter is turned on. Consequently, the output nodes are regenerated with one node at  $V_{DD}$  and the other at GND. By the end of the regeneration phase, both drains of the differential pair approach 0 V and the transistors are in the triode region. With no static power consumption, the dynamic structure offers a clear low-power benefit.

However, the disadvantages of the structure cannot be ignored. The large kickback noise caused by the rail-to-rail drain voltage variation of the input pair and the



Fig. 2.13. Two-stage dynamic latched comparator.

inverter heavily affects the noise performance of the comparator. Another drawback of the architecture is that it has a single tail transistor. The size of the tail transistor limits the total current through the differential input pair; thus, it should be increased to enhance the comparator speed. However, depending on the common-mode voltage, a larger tail transistor shortens the time during which the input transistors are in saturation, which lowers the comparator gain and, in turn, makes the input-referred offset more significant. In other words, the speed and offset of this design depend strongly on the common-mode input voltage. The headroom consumed by the four stacked transistors is another issue for low-supply-voltage designs.

To mitigate the above issues, a two-stage dynamic latched comparator is adopted in the design, as shown in Fig. 2.13. As the name implies, the comparator consists of two stages, input-gain stage, and output-latch stage. The two-stage comparator has a similar working principle to the Strong-Arm comparator. The latch stage following the input stage senses the voltage difference between node X and node Y during the discharging process, and non-linearly amplifies the difference to make the decision.

This two-stage separation gives the comparator a lower and more stable offset voltage over a wide range of input common-mode voltage  $V_{CM}$ . In addition, the two-stage structure relaxes the headroom requirement and allows operation at a lower supply voltage to further reduce power consumption. Kickback noise also exists in the two-stage structure and can be suppressed by increasing the size of the input pair and the gain of the input stage.

To achieve a readout nonlinearity of less than 4%, the maximum charge integration voltage is designed as 500 mV, ranging from 0.75 V to 1.25 V. This ensures the linearity of the current mirror, the dominant source of the nonlinearity of the front end. With a 500 mV maximum charge integration voltage, the least-significant bit (LSB) of the 10-bit SAR ADC is 0.49 mV, requiring both the comparator noise and the comparator offset voltage variation to be less than 0.245 mV (or 0.5 LSB). The two-stage dynamic comparator is developed to achieve these desired noise and input-referred offset levels. With the size of the preamp input transistors being 20  $\mu$m / 0.18  $\mu$m, simulation results

Table 2.1. Comparator Noise and Offset Performance

| V<sub>INT</sub>      | @ 0.75 V | @ 1.0 V  | @ 1.25 V |
|----------------------|----------|----------|----------|
| Noise                | 0.43 LSB | 0.23 LSB | 0.21 LSB |
| Max Offset Variation | 0.35 LSB |          |          |
| Power                | 52 µW    | 57 µW    | 60 µW    |

show that the comparator noises are 0.21 mV, 0.11 mV, and 0.1 mV, and the comparator offset voltages are 8.13 mV, 7.96 mV, 7.91 mV for the input voltages being 0.75 V, 1.0 V, 1.25 V, respectively. Thus, the maximum comparator noise is 0.43 LSB and the maximum comparator offset voltage variation is 0.35 LSB, as summarized in Table 2.1.
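The conversion from millivolts to LSB used in Table 2.1 can be reproduced with a few lines. This sketch uses the exact LSB (500 mV / 1024) and the rounded millivolt noise values quoted in the text, so the last digit can differ slightly from the table (the 1.25 V entry comes out near 0.20 LSB from the rounded 0.1 mV figure). Variable names are illustrative.

```python
FULL_SCALE_MV = 500.0   # maximum charge integration voltage swing
N_BITS = 10

lsb_mv = FULL_SCALE_MV / 2 ** N_BITS   # about 0.49 mV

# Simulated comparator noise (mV) at each V_INT (V), as quoted in the text
noise_mv = {0.75: 0.21, 1.0: 0.11, 1.25: 0.10}
noise_lsb = {v_int: n / lsb_mv for v_int, n in noise_mv.items()}

for v_int, n in noise_lsb.items():
    print(f"V_INT = {v_int:.2f} V: noise = {n:.2f} LSB")

print("all below 0.5 LSB:", all(n < 0.5 for n in noise_lsb.values()))
```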

#### 2.3.5 Synchronous SAR Logic

Fig. 2.14 shows the adopted 10-bit synchronous SAR logic. The upper flip-flop (FF) row is a shift register, called the sequencer in this architecture. The lower flip-flop row is called the code register. Before the conversion begins, the sequencer is reset to the value 10'b10\_0000\_0000 by a reset signal. Over the next 10 clock cycles, the sequencer shifts the 1 through the register. An output from the sequencer sets an FF in the code register through its "set" input. The output from the FF that is being set is used as a clock for the previous FF in the code register. At the rising edge of this clock, the FF loads the data signal CMP, which is the comparator result. The output of the last flip-flop in the sequencer



Fig. 2.14. Synchronous SAR logic.



Fig. 2.15. PCB and die photo of the SiPM readout ASIC.

is the flag signal standing for the end of conversion and could be used to designate the valid digital code of the ADC.

### 2.4 Measurement Results

The fast signal generator, the customized ADC, and the current-mode readout front-end are designed in a one-poly six-metal (1P6M) 180 nm bulk CMOS process. The core areas of the front-end and the SAR ADC are 380  $\mu$m × 350  $\mu$m and 470  $\mu$m × 340  $\mu$m, respectively, as shown in Fig. 2.15.

The SAR ADC performance is characterized by connecting an external signal source directly to the input of the ADC. The measured static performance of the SAR ADC is shown in Fig. 2.16. The differential nonlinearity (DNL) and integral nonlinearity (INL) are -0.37/+0.41 LSB and -0.96/+0.99 LSB, respectively. Clocked at 1 MS/s, the SAR ADC consumes 147  $\mu$ W of power from a 1.8 V supply. The output



Fig. 2.16. Measured DNL and INL of the SAR ADC.

spectrum of the ADC with a near-Nyquist input is shown in Fig. 2.17, where an SNDR of 53.08 dB and an SFDR of 62.74 dB are achieved. The ADC dynamic performance versus input frequency and versus sampling frequency is plotted in Figs. 2.18 and 2.19, respectively. The SNDR stays above 50 dB with input frequency up to 1 MHz @ 1 MS/s and remains above 51 dB up to 8 MS/s. Table 2.2 compares the ADC with recently published ADCs for SiPM readout applications. It should be mentioned that although this work focuses on reusing the SiPM charge integrator as the ADC T/H sampler to reduce the power consumption, the ADC,



Fig. 2.17. Measured ADC output spectrum at 1 MS/s with a near-Nyquist input signal.



Fig. 2.18. Measured SNDR and SFDR versus input frequency at 1 MS/s sampling rate.

however, can be designed to achieve a much higher sampling rate, thus potentially allowing it to be shared by multiple SiPM readout channels in a time-multiplexed manner to further reduce the overall power consumption.



Fig. 2.19. Measured SNDR and SFDR versus sampling frequency at Nyquist rate.

|                      | This Work <sup>a</sup> | [55] <sup>b</sup> | [56] <sup>a</sup> | [57] <sup>b</sup> |
|----------------------|------------------------|-------------------|-------------------|-------------------|
| Technology [nm]      | 180                    | 350               | 350               | 350               |
| Power Supply [V]     | 1.8                    | 3.3               | 3.3               | 3.3               |
| Resolution [bit]     | 10                     | 9                 | 12                | 12                |
| Sampling Rate [MS/s] | 1                      | 4                 | 0.75              | 2.5               |
| Power [mW]           | 0.147                  | 0.95              | 0.5               | 3                 |
| DNL [LSB]            | -0.37/+0.41            | NA                | ±0.2              | NA                |
| INL [LSB]            | -0.96/+0.99            | ±1.6              | ±1                | NA                |
| ENOB [bit]           | 8.52                   | NA                | NA                | NA                |

Table 2.2. ADC Performance Comparison for SiPM Readout Applications

<sup>a</sup> Measurement results; <sup>b</sup> Simulation results.
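The ENOB entry for this work in Table 2.2 is consistent with the measured SNDR through the standard relation ENOB = (SNDR − 1.76 dB) / 6.02 dB. A minimal check, with an illustrative function name, is:

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


enob = enob_from_sndr(53.08)   # measured SNDR at 1 MS/s, near-Nyquist input
print(f"ENOB = {enob:.2f} bits")
```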

Fig. 2.20 shows the measurement setup for SiPM readout. The ultrashort laser pulser, Hamamatsu PLP-10, is used as the light source. It periodically feeds laser pulses



Fig. 2.20. Measurement setup (A: Digital Signal Analyzer; B: Clock Source; C: Laser Pulser; D: Dark Box with SensL SiPM evaluation board MICROFC-SMA-30035-GEVB; E: Power Supply; F: Readout ASIC).

with a 60 ps pulse width to the SensL SiPM evaluation board MICROFC-SMA-30035-GEVB, which is placed in the dark box. The Tektronix digital signal analyzer DSA73304D, which has a high timing resolution of 1 ps, is used to capture the rising edges of the timing stamp pulses. The ultrashort laser pulse (yellow) and the corresponding standard SiPM output pulse (blue) are shown in Fig. 2.21. By analyzing the rising edges of the timing stamp pulses, the timing resolution of the readout ASIC is obtained. Fig. 2.22 shows the measured timing resolution with different capacitance settings of the HPF corresponding to a 300 pC SiPM pulse. The measured timing resolutions are 151 ps, 156 ps, and 166 ps with the HPF capacitances being 1 pF, 4 pF and 7 pF, corresponding to the cutoff frequency of the HPF being 19.4 MHz, 4.9 MHz, and 2.8 MHz, respectively. The best timing performance is 151 ps and is achieved under the 19.4 MHz HPF cut-off frequency setting. Compared with current discrimination using the standard SiPM signal (or equivalently the capacitance is infinite in the HPF), the fast signal improves the timing resolution by at least 15 ps.
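The three quoted cutoff frequencies scale inversely with the HPF capacitance, as expected for a first-order RC high-pass filter. The sketch below assumes such a first-order response with a fixed equivalent resistance; that resistance is not given in the text and is inferred here from the 1 pF / 19.4 MHz setting purely for illustration.

```python
import math


def rc_cutoff_hz(r_ohms, c_farads):
    """Cutoff frequency of a first-order RC filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


# Assumption: equivalent resistance inferred from the 1 pF setting (about 8.2 kOhm).
R_EQ = 1.0 / (2.0 * math.pi * 19.4e6 * 1e-12)

cutoffs_mhz = {c_pf: rc_cutoff_hz(R_EQ, c_pf * 1e-12) / 1e6 for c_pf in (1, 4, 7)}

for c_pf, f_mhz in cutoffs_mhz.items():
    print(f"C = {c_pf} pF -> f_c = {f_mhz:.2f} MHz")
```

Under this assumption the 4 pF and 7 pF settings land close to the quoted 4.9 MHz and 2.8 MHz values.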

The linearity of the energy measurement is also characterized. The measured ADC output codes corresponding to different input charge levels are plotted in Fig. 2.23, and a gain nonlinearity of less than 3.3% is achieved over the full input charge range. Table 2.3 summarizes the performance of the proposed on-chip fast signal generator, the



Fig. 2.21. Measured laser pulse and the corresponding standard SiPM output current signal.

customized SAR ADC, and the readout ASIC. The performance comparison of the designed SiPM readout ASIC with recently published SiPM readout ASICs is shown in Table 2.4.





Fig. 2.22. Measured timing resolution of the readout ASIC with the HPF capacitance settings being (a) 1 pF; (b) 4 pF; and (c) 7 pF.



Fig. 2.23. Measured ADC output codes versus different input charges.

| Parameters                               | Results              |
|------------------------------------------|----------------------|
| Technology                               | 180 nm               |
| Current-mode Front-end Power Consumption | 3.7 mW               |
| ADC Power Consumption                    | 147 µW               |
| ADC Resolution                           | 10 bits              |
| ADC Sampling Rate                        | 1 MS/s               |
| ADC DNL                                  | -0.37/+0.41 LSB      |
| ADC INL                                  | -0.96/+0.99 LSB      |
| ADC SFDR @ 1 MS/s                        | 62.74 dB             |
| ADC SNDR @ 1 MS/s                        | 53.08 dB             |
| ADC ENOB @ 1 MS/s                        | 8.52 bits            |
| Overall Core Area                        | $0.293 \text{ mm}^2$ |
| <b>Overall Power Consumption</b>         | 4.02 mW              |
| Conversion Period                        | 1 µs                 |
| Timing Resolution                        | 151 ps               |
| Max Gain Nonlinearity                    | 3.3%                 |

 Table 2.3. Single-channel SiPM Readout Performance Summary

Table 2.4. Performance Comparison of the SiPM Readout

|                            | This Work <sup>*</sup> | [15] <sup>*, a</sup> | [21]* | [22] <sup>*, a</sup> | [58]** | [59]** |
|----------------------------|------------------------|----------------------|-------|----------------------|--------|--------|
| Technology [nm]            | 180                    | 350                  | 150   | 180                  | 350    | 130    |
| Power [mW/ch]              | 4.02                   | 10 <sup>b</sup>      | 15    | 3 <sup>b</sup>       | 18.75  | 11     |
| Max Gain Nonlinearity      | 3.3%                   | N/A                  | N/A   | N/A                  | 8%     | 5%     |
| Timing Resolution [ps]     | 151                    | 1000                 | 660   | 363                  | N/A    | N/A    |

\* Measured results; \*\* Simulated results.

<sup>a.</sup> With off-chip ADC and TDC; <sup>b.</sup> Not including ADC and CSA power consumption.

## **CHAPTER III**

## DAQ WITH MULTI-CHANNEL SIPM READOUT ASIC AND HIGHLY-LINEAR FPGA-BASED TDC

Highly-linear and low-power multi-channel silicon photomultiplier (SiPM) readouts are required for data acquisition (DAQ) systems [48]. In Chapter II, a dedicated SAR ADC designed for the single-channel SiPM readout to conduct energy measurement is presented [47]. However, the large size of the SAR ADC, mainly due to the capacitive DAC (CDAC), substantially increases the chip area and limits its application in multi-channel SiPM readouts. Therefore, a single high-speed SAR ADC that can be shared among multiple SiPM readout channels, maintaining the per-channel conversion rate while offering a low overall area and power consumption, is highly desired.

On the other hand, SiPM readout systems typically rely on time-to-digital converters (TDCs) for the precise timing measurement of the triggering timing stamps. Early TDC development required the use of application-specific integrated circuits (ASICs) to deliver a fine resolution. ASIC-based TDCs are still in development today, achieving resolutions of 0.5 ps to 30 ps with both digital and analog methods [60, 61]. The recent advancements in field-programmable gate arrays (FPGAs), however, have continued to narrow the performance gap between ASIC-based and FPGA-based TDCs, while providing lower cost and shorter development time. Many methods have been investigated, including Vernier tapped delay line (TDL) [62, 63], pure carry chain

TDL [64, 65, 66], matrix of counters [67], wave union [68, 69], and ring oscillators [70, 71]. Each method has its own advantages and disadvantages regarding resolution, linearity, speed, dead time, complexity, fabric slice count, and power utilization. Although an FPGA is limited to a predefined fabric structure, linearization improvement techniques such as wave union and multi-chain averaging are more easily implemented, benefiting from the abundance of carry chains and digital signal processing blocks. Common implementation challenges in FPGA-based TDCs include bubble errors within the thermometer-to-binary encoder, inter-clock-region nonlinearity, and calibration [62]. Modern FPGAs in advanced CMOS technology nodes also exhibit new challenges because advancements in fabric speed narrow the difference between gate delay and path delay, causing clock skew [72].

This chapter presents a low-power highly-linear multi-channel DAQ system for SiPM detectors. For energy measurement, a low input impedance current-mode 16-



Fig. 3.1. Top-level architecture of the proposed multi-channel SiPM DAQ system.

channel readout ASIC [48] with an on-chip SAR ADC that is shared among all the readout channels is developed in a 180 nm CMOS technology, achieving a maximum gain nonlinearity of 3.6% over a charge range up to 800 pC, while only dissipating 3.89 mW of power per channel. For timing measurement, a highly-linear 32-channel, multi-chain averaged TDC using Xilinx Kintex-7 FPGA is developed, achieving 15 ps RMS resolution, 11 ps average bin size, and less than 10 ps integral nonlinearity.

### 3.1 Top Level Architecture

Fig. 3.1 shows the top-level diagram of the proposed DAQ, which includes the 16-channel SiPM readout ASIC for energy measurement and the Xilinx Kintex-7 FPGA-based TDC for timing measurement.

In each channel, the current-mode input stage presented in Section 2.2.1 is utilized for the charge readout. It consists of a current buffer with current feedback, a multi-branch cascode current mirror with programmable currents, a charge integrator, a current discriminator and a voltage comparator for event detection, and a control logic unit. The current-mode front-end [47] allows SiPM readout over a large dynamic charge range and also preserves the slope of the rising edge of the SiPM signal. As shown in Fig. 2.2, a SiPM pulse replicated by the current mirror is sent to the current discriminator for event detection. If the current pulse exceeds a predefined current threshold  $I_{TH}$ , a timing pulse is generated by the control logic unit, which enables the integration of a scaled SiPM current onto the charge integration capacitor  $C_Q$ . After a pre-defined charge integration period, which is programmable from 288 ns to 1.44 µs, the integrated voltage

is compared with a voltage threshold  $V_{\text{TH}}$  to determine if the event is triggered by a true photon event or dark-count noise. If it is a true event, a ready flag signal is generated and sent to the selection logic unit. The 16 SiPM readout channels share a single 10-bit 16 MS/s SAR ADC for energy digitization. The selection logic unit sequentially checks the status of the ready flag signals of the 16 channels. If the ready flag signal of the channel being checked is active, then the switch between that channel and the ADC will be turned on, passing the charge stored on the charge integration capacitor to the SAR ADC for energy digitization. After energy digitization, the ADC output and the channel position bits are serialized and sent to the energy register in the FPGA. When the channel is selected, the timing pulse of the channel is then ended by the channel selection logic. By digitizing the pulse length using the FPGA-based TDC, the event



Fig. 3.2. Timing diagram of the proposed 16-channel SiPM readout ASIC.

timing can be obtained and stored in the timing register in the FPGA. Using the data stored in the timing and energy registers, coincidence processing can be further conducted.

### 3.2 Key Building Blocks of the Readout ASIC

#### 3.2.1 Input Stage

The input stage of each readout channel utilizes the same design as in Section 2.2.1, thus has the benefits of low input impedance and wide dynamic range for input signal. It should be noted that the cascode structure is adopted for the current mirror, as



Fig. 3.3. Simulated DNL and INL comparison of the current buffer w/o and w/ cascode current mirror structure in 200 pC input charge range.

shown in Fig. 2.2, to maintain good linearity for charge integration. Fig. 3.3 compares the linearity of the current-mode front-end with and without the cascode structure. With the cascode structure, the differential nonlinearity (DNL) is improved from 2.4% to 0.7%, and the integral nonlinearity (INL) is improved from 5.8% to 1.3%.

#### 3.2.2 SAR ADC and Serializer

To address the large chip area and high power consumption of multi-channel SiPM readout ASICs, a shared-ADC approach is developed. As shown in Fig. 3.4, a single 16 MS/s 10-bit SAR ADC is shared by the 16 front-ends in a time-multiplexed manner. This allows each channel to have an equivalent conversion rate of 1 MS/s (or 1  $\mu$s equivalent conversion time per channel), suitable for PET imaging and many physics experiment applications [21, 56, 57], while at the same time substantially reducing the overall chip area and lowering the chip power consumption as compared to the conventional approach where a dedicated ADC is employed in each readout channel [21, 73].



Fig. 3.4. Serialization of ADC output and channel position bits.

The area of a SAR ADC is dominated by the capacitive DAC (CDAC). The size of the CDAC is determined by the ADC resolution and the capacitor mismatch errors of the technology. Thus, the 16 MS/s 10-bit SAR ADC has a die area similar to that of the 1 MS/s 10-bit SAR ADC. This allows the shared ADC to greatly reduce the overall chip area. Also, benefiting from the reduced capacitive load from the clock and signal paths, the equivalent power consumption (dominated by the dynamic power consumption of  $C \times V_{DD}^2 \times$  frequency) per readout channel is also reduced.

To manage the shared-ADC operation sequentially, a channel selection circuit is developed. The channel selection circuit sequentially checks the charge integration status of the 16 channels with an interval of 62.5 ns. If the charge integration process of the channel being checked is fully completed, the channel selection circuit will turn on the switch between the channel and the shared-SAR ADC as shown in Fig. 3.4, passing the charge stored on the charge integration capacitor to the ADC for energy digitization. As the energy digitization happens only when the charge integration process is fully completed, the voltage across the integration capacitor is constant. As a result, the data transfer of such a DC value to the ADC by the connecting switch essentially takes no time. The channel selection circuit is implemented using shift registers and incurs negligible area and power consumption overhead.
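The sequential polling described above can be illustrated with a small behavioral model. It is a sketch under simplifying assumptions (zero-indexed channels, one 62.5 ns slot per channel in fixed round-robin order, and a conversion that occupies exactly one slot); the function and variable names are hypothetical and do not describe the shift-register implementation on chip.

```python
SLOT_NS = 62.5      # polling interval per channel (one 16 MS/s ADC cycle)
N_CHANNELS = 16


def schedule(ready_times_ns, n_slots):
    """Return (slot start time, channel) for each conversion that is granted.

    ready_times_ns maps a channel index to the time at which its charge
    integration completes. A channel is digitized in the first of its own
    slots that starts at or after that time.
    """
    pending = dict(ready_times_ns)
    conversions = []
    for slot in range(n_slots):
        t = slot * SLOT_NS
        channel = slot % N_CHANNELS          # fixed round-robin order
        if channel in pending and pending[channel] <= t:
            conversions.append((t, channel))
            del pending[channel]
    return conversions


# Channel 5 finishes integrating after its first slot, so it waits one full round.
print(schedule({0: 0.0, 3: 100.0, 5: 400.0}, n_slots=32))
print("round-trip period (ns):", N_CHANNELS * SLOT_NS)
```

Each channel is revisited every 16 × 62.5 ns = 1 µs, which is the equivalent per-channel conversion time stated in the text.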

To match the ADC output and the corresponding channel number, a 4-bit channel position register (where 0000 represents channel #1, 0001 represents channel #2, and 1111 represents channel #16) and a 2-bit frame header are designed. As depicted in Fig. 3.4, the 10-bit ADC output along with the 4-bit channel position code and the 2-bit frame header is serialized with an on-chip 16:1 serializer implemented in static

CMOS logic. The serializer is synchronized with the ADC at 256 Mb/s. The serialized data is finally sent to the energy registers in the FPGA.
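The 16-bit frame can be illustrated with a simple pack/unpack pair. The text fixes the field widths (2 + 4 + 10 bits) and the channel encoding (0000 for channel #1 through 1111 for channel #16), but not the header value or the field order on the wire; both are assumptions in this sketch, and all names are hypothetical.

```python
HEADER = 0b10   # hypothetical 2-bit frame header value (not given in the text)


def pack_frame(adc_code, channel_number):
    """Build one 16-bit frame: [header (2) | channel position (4) | ADC code (10)]."""
    if not 0 <= adc_code < 1024:
        raise ValueError("ADC code must fit in 10 bits")
    if not 1 <= channel_number <= 16:
        raise ValueError("channel number must be 1..16")
    position = channel_number - 1            # 0000 represents channel #1
    return (HEADER << 14) | (position << 10) | adc_code


def unpack_frame(frame):
    """Recover (header, channel number, ADC code) from a 16-bit frame."""
    header = (frame >> 14) & 0b11
    channel_number = ((frame >> 10) & 0xF) + 1
    adc_code = frame & 0x3FF
    return header, channel_number, adc_code


frame = pack_frame(614, 2)
print(f"{frame:016b}", unpack_frame(frame))

# 16 bits per conversion at 16 MS/s gives the serializer line rate.
line_rate_bps = 16 * 16_000_000
print(line_rate_bps / 1e6, "Mb/s")
```

The line-rate calculation reproduces the 256 Mb/s figure, which is why a 16:1 serializer synchronized with the ADC suffices.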

### 3.3 FPGA-based TDC and Linearity Improvement

FPGA-based TDC offers design-time minimization and revision flexibility. For particle physics experiments, FPGA-based TDCs often meet or exceed the relatively relaxed resolution requirement and can meet the linearity requirement if care is taken to remove, cancel, or average out the imperfect qualities of FPGA fabric. TDC linearity is a primary concern for particle physics time-of-flight (TOF) and time-over-threshold



Fig. 3.5. Initial TDC design with 400 MHz coarse clock, single register, and latch stage.

(TOT) measurements. Any nonlinearity could skew results causing timing measurement errors. For this reason, an initial TDC design using the Xilinx 28 nm Kintex-7 FPGA is developed to identify each source of nonlinearity. Fig. 3.5 details the initial TDC design with a "pure carry" tapped delay line and a single register-and-latch stage. The "pure carry" TDL is adopted due to its simplicity and improved linearity versus the Vernier TDL. For implementing the TDL, Xilinx offers a "Carry 4" look-up table (LUT) primitive which is portable across all Xilinx families and is typically utilized for subcycle arithmetic. As a pulse signal feeds in, the on-board clock acts as a stop pulse. A 400 MHz clock is adopted for state-machine operation and TDC coarse clock counting of unit increments (2.5 ns) between pulses. A sync signal is employed for resetting the coarse counter. Fig. 3.6 represents the resulting code-density test results with bin (or



Fig. 3.6. Initial TDC code density results with independent components of nonlinearity.

tapped delay) number on the *X* axis and bin count (or delay length) on the *Y* axis. A code-density test feeds an independent clock into the pulse input. Each bin should theoretically fill up equally. Any difference from equal bins indicates nonlinearity. We noticed four primary sources of nonlinearity: detector design which causes zero-length bins at the front end, clock skew which causes evenly distributed zero bins throughout, encoder error from thermometer to binary conversion, and clock region crossing error when the tapped delay line crosses a clock region boundary with a large additional delay versus a typical tapped delay.
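A code-density analysis of this kind can be sketched as follows. The sketch assumes that the listed bins together span exactly one coarse clock period and that hits are uniformly distributed in time, so each bin's width is proportional to its hit count. DNL and INL are expressed relative to the ideal (equal) bin width; the function name and the example counts are illustrative, not measured data.

```python
def code_density(bin_counts, clock_period_ps):
    """Estimate bin widths, DNL and INL (in ideal-bin units) from hit counts."""
    total = sum(bin_counts)
    widths_ps = [clock_period_ps * count / total for count in bin_counts]
    ideal_ps = clock_period_ps / len(bin_counts)
    dnl = [width / ideal_ps - 1.0 for width in widths_ps]
    inl = []
    running = 0.0
    for value in dnl:                 # INL is the running sum of DNL
        running += value
        inl.append(running)
    return widths_ps, dnl, inl


# Toy example: bin 1 never fires (a zero-length bin), bin 2 absorbs its hits.
widths, dnl, inl = code_density([100, 0, 200, 100], clock_period_ps=40.0)
print(widths)
print(dnl)
print(inl)
```

A zero-length bin shows up as a DNL of −1, and the neighboring bin that absorbs its hits shows a matching positive DNL, which is the signature of the bubble and clock-skew effects discussed in the text.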



Fig. 3.7. Synchronized-enable pulse detector.



Fig. 3.8. (a) Xilinx representation of global clock path to each register clock input within SLICEL; (b) Average path length from global clock to register clock inputs (post-layout simulation).

Each of the identified components of nonlinearity in Fig. 3.6 is addressed with a specific adjustment to the TDC design. Zero-length front-end bins are caused by a lagging enable signal provided to the latching stage of registers. This is improved by shifting the enable or "valid" signal forward in relation to the input pulse. Fig. 3.7 details a synchronized-enable pulse detector with the "valid" signal aligned with the pulse [74]. The measured average bin duration is 11 ps, and each clock region has a maximum tapped delay count of 200. The coarse clock can be increased to 500 MHz, which shortens the coarse period to 2 ns (about 182 bins of 11 ps), so that the TDL avoids crossing the clock region boundary and still has a healthy 10% margin of additional bins for process, voltage, and temperature drifts.
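
The bin budget behind the choice of coarse clock can be checked with a short calculation. The sketch below uses the 11 ps average bin duration and the 200-tap limit quoted above; the helper name `tdl_bin_budget` is an assumption introduced for illustration.

```python
import math


def tdl_bin_budget(coarse_clock_hz, avg_bin_ps, max_taps):
    """Return (bins needed to span one coarse period, fractional spare margin)."""
    period_ps = 1e12 / coarse_clock_hz
    bins_needed = math.ceil(period_ps / avg_bin_ps)
    margin = (max_taps - bins_needed) / bins_needed
    return bins_needed, margin


for f_clk in (400e6, 500e6):
    bins_needed, margin = tdl_bin_budget(f_clk, avg_bin_ps=11.0, max_taps=200)
    print(f"{f_clk / 1e6:.0f} MHz: {bins_needed} bins needed, margin {margin:+.1%}")
```

With a 400 MHz clock the 2.5 ns period needs 228 bins, more than one clock region provides, whereas at 500 MHz only 182 bins are needed, leaving a margin of roughly 10%.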

Clock skew and bubble error cause zero-length bins to be spread over the full TDL bin window, often resulting in an odd-even effect with every other bin at zero length. Clock skew is a common phenomenon in < 40 nm fabric technology due to smaller size delay cells becoming closer in duration (< 40 ps) to the path delay (< 30 ps) from global clock to register clock input [72]. Registers (or bins) that receive the latching clock (or TDC stop signal) early may latch on a "0" prematurely and allow a downstream register with a late-arriving clock to latch on a "1", causing a bubble. Fig. 3.8 (a) shows an implementation view of the clock signal feeding into each Carry4 SLICEL register. This simplified representation may differ from the actual layout, but it illustrates the reason behind non-homogeneous path lengths. Fig. 3.8 (b) shows a post-implementation simulation of the average time from the global clock source to each SLICEL register clock input. Notice that register locations 1 and 3 within a SLICEL will always latch before the registers located at 2 and 4. A closer review of the measured bin lengths in Fig. 3.8 (b) reveals this same odd-even effect with every other odd bin at

zero length. By using a runtime Integrated Logic Analyzer (ILA), the TDL thermometer code is reviewed on the fly to look for the degree of bubble error or number of zeros before the final "1". Only the first-order error exists indicating that a simple "if" statement can be added to the "Highest '1' (native)" encoder to decrement the binary



Fig. 3.9. (a) Four-chain parallel TDL; (b) DNL comparison of 1-chain, 2-chain, and 4-chain TDC.

output by 1 if the highest "1" has a "0" just prior [75]. This allows the missed bins to fill properly as desired. Double buffering the first stage registers is also implemented for metastability improvement.
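
The encoder correction described above can be illustrated in software. The following is a minimal behavioral sketch, not the HDL used on the FPGA; the function name `encode_highest_one` and the list-of-bits representation of the thermometer code are assumptions.

```python
def encode_highest_one(thermo):
    """Thermometer-to-binary encoder with first-order bubble correction.

    thermo -- sequence of 0/1 register outputs, index 0 = first delay tap.
    Returns the number of taps the input edge is taken to have passed.
    """
    highest = -1
    for i, bit in enumerate(thermo):
        if bit:
            highest = i                  # position of the highest '1'
    if highest < 0:
        return 0                         # no tap fired
    code = highest + 1                   # plain "highest '1'" result
    # First-order bubble: a '0' sits directly below the highest '1',
    # so the raw code overshoots by one and is decremented.
    if highest >= 1 and thermo[highest - 1] == 0:
        code -= 1
    return code


clean = encode_highest_one([1, 1, 1, 1, 0, 0, 0, 0])
bubbled = encode_highest_one([1, 1, 1, 0, 1, 0, 0, 0])
```

Both inputs contain four ones; without the correction the bubbled code would be encoded as 5, which would leave bin 4 empty in the code-density histogram.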

Multi-chain averaging, the last but most important technique for linearity improvement, is utilized to average the outputs of paralleled TDLs to produce the final output of the TDC [48]. Fig. 3.9 (a) illustrates the chain of 4 TDLs, and Fig. 3.9 (b) shows the DNL improvement from 1 chain to 4. The RMS resolution comparison of 1-chain and 4-chain TDC is presented in Fig. 3.10 and demonstrates an improvement of 38% by the 4-chain averaging method.
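
The benefit of averaging several delay lines can be illustrated with a small Monte Carlo model. This sketch is not the measured design and is not expected to reproduce the 38% figure: the uniformly distributed bin widths (2 ps to 20 ps), the assumption of statistically independent chains, the calibrated bin-center estimate, and all function names are assumptions chosen only to show the trend.

```python
import bisect
import random


def make_chain(rng, n_bins, lo_ps=2.0, hi_ps=20.0):
    """Bin edges (ps) of one delay line with randomly varying bin widths."""
    edges = [0.0]
    for _ in range(n_bins):
        edges.append(edges[-1] + rng.uniform(lo_ps, hi_ps))
    return edges


def bin_centre(edges, t_ps):
    """Calibrated time estimate: centre of the bin that contains t_ps."""
    i = bisect.bisect_right(edges, t_ps) - 1
    return 0.5 * (edges[i] + edges[i + 1])


def rms_error(chains, rng, span_ps, n_hits=20000):
    """RMS error of the chain-averaged estimate for uniformly random hits."""
    acc = 0.0
    for _ in range(n_hits):
        t = rng.uniform(0.0, span_ps)
        est = sum(bin_centre(c, t) for c in chains) / len(chains)
        acc += (est - t) ** 2
    return (acc / n_hits) ** 0.5


rng = random.Random(1)                          # fixed seed for repeatability
chains = [make_chain(rng, n_bins=200) for _ in range(4)]
span = 0.9 * min(c[-1] for c in chains)         # stay inside every chain
rms_1 = rms_error(chains[:1], rng, span)        # single chain
rms_4 = rms_error(chains, rng, span)            # four chains averaged
```

Because the bin boundaries of the chains fall at different times, their quantization errors are largely uncorrelated, and the averaged estimate has a smaller RMS error than any single chain.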



Fig. 3.10. RMS resolution comparison of 1-chain versus 4-chain TDC with X axis offset due to significant INL in 1 chain.

# **3.4 Measurement Results**

The 16-channel SiPM readout ASIC is designed and fabricated in a standard 180 nm 1P6M CMOS process. Fig. 3.11 shows the die photo of the ASIC, which occupies an area of 2860 $\mu$m × 1200 $\mu$m. Each channel has an average core area of 0.156 mm<sup>2</sup>, an area reduction of 47% compared with the single-channel design.



Fig. 3.11. Die photo of the 16-channel SiPM readout ASIC.



Fig. 3.12. Measured ADC output spectrum at 16 MS/s sampling rate with a near-Nyquist input signal.

Measured at 16 MS/s, the SAR ADC consumes 895  $\mu$ W of power from a 1.8 V supply. The output spectrum of the SAR ADC with a near-Nyquist input is shown in Fig. 3.12, which shows an SNDR of 51.37 dB and an SFDR of 58.34 dB. The ADC dynamic performances are summarized in Figs. 3.13 and 3.14. The SNDR stays above



Fig. 3.13. Measured SNDR and SFDR of ADC versus input frequency.



Fig. 3.14. Measured SNDR and SFDR of ADC versus sampling rate.

50 dB for input frequencies up to 10 MHz at 16 MS/s and remains higher than 50 dB up to Nyquist sampling at 20 MS/s, allowing the ADC to be shared among all 16 readout channels with a per-channel conversion period of 1 $\mu$s.
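
The time-multiplexing arithmetic can be made explicit as follows. The per-channel period follows directly from the numbers above; the effective-number-of-bits (ENOB) conversion is the standard relation ENOB = (SNDR - 1.76)/6.02 and is added here only as a convenience, since the text does not quote an ENOB. The function names are assumptions.

```python
def per_channel_period_us(sample_rate_sps, n_channels):
    """Time between two conversions of the same channel for a shared ADC."""
    return n_channels / sample_rate_sps * 1e6


def enob_bits(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


period_us = per_channel_period_us(16e6, 16)   # 16 channels share a 16 MS/s ADC
enob = enob_bits(51.37)                       # measured near-Nyquist SNDR
```

Each of the 16 channels is therefore converted once every microsecond, and the measured SNDR corresponds to slightly more than 8 effective bits.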

The set-up for energy measurement is shown in Fig. 3.15. The Keysight M8190A AWG is used to generate the double exponential pulses to imitate the SiPM output pulses, and the Keysight MSO9404A is used to receive and store the ADC output data. The measured ADC output codes corresponding to different input charge levels



Fig. 3.15. SiPM readout ASIC measurement setup.



Fig. 3.16. Measured 16-channel averaged ADC output codes versus input charges.



Fig. 3.17. Measured 16-channel averaged DNL and INL in 800 pC input charge range.

among the 16 channels are shown in Fig. 3.16. With curve fitting for the 16-channel averaged output codes and the corresponding input charges, a maximum gain non-linearity (DNL) of 3.6% is achieved over the full input dynamic range with INL  $\leq$  4.6% as depicted in Fig. 3.17.

Table 3.1 summarizes the performance of the readout ASIC and compares it with other works. The proposed 16-channel readout ASIC demonstrates excellent linearity with the lowest power consumption and the highest conversion rate.

|                               | This<br>Work <sup>*</sup> | [15]** | [21]* | [22]** | [59]** | [73]* |
|-------------------------------|---------------------------|--------|-------|--------|--------|-------|
| Tech. [nm]                    | 180                       | 350    | 150   | 180    | 130    | 180   |
| No. Channels                  | 16                        | 8      | 16    | 64     | 1      | 16    |
| Area<br>[mm <sup>2</sup> /ch] | 0.156                     | 0.12   | 0.066 | 0.042  | 0.6    | 0.66  |
| Power<br>[mW/ch]              | 3.89                      | < 10   | < 15  | < 5    | 11     | 86    |
| Max Gain<br>Nonlinearity      | 3.6%                      | N/A    | N/A   | N/A    | 5%     | N/A   |
| Conversion<br>Period [µs]     | 1                         | 4      | 6.4   | N/A    | N/A    | 3.4   |

Table 3.1. Multi-channel SiPM Readout Performance Summary and Comparison

\* With on-chip ADC; \*\* Without on-chip ADC

The TDC is designed and synthesized on the Xilinx Kintex-7 device (XC7K325T-2FFG900C) in the KC705 evaluation kit, and the TDC measurement setup is shown in Fig. 3.18. The Tektronix AFG 3022C is adopted as the pulse generator to feed asynchronous clock pulses into the TDCs for timing measurement. The DNL and INL of the proposed TDC with post calibration including the above-mentioned linearity optimization approaches are -3.18/+3.46 ps and -4.65/+9.59 ps, respectively, as shown in Fig. 3.19. The timing performance of the proposed TDC is summarized and compared in Table 3.2, in which this work presents a superior DNL and INL as compared to all previous designs with a good average RMS resolution of 15 ps.



Fig. 3.18. FPGA-based TDC measurement setup.



Fig. 3.19. Measured DNL and INL of the 4-chain TDC.

|           | Tech<br>[nm] | Method                     | RMS<br>Resolution<br>[ps] | INL [ps]         | DNL [ps]        |
|-----------|--------------|----------------------------|---------------------------|------------------|-----------------|
| This Work | 28           | Multi-chain Averaging      | 15                        | -4.65,<br>+9.59  | -3.18,<br>+3.46 |
| [62]      | 650          | Vernier TDL                | 200                       | < 200            | -94, +88        |
| [63]      | 650          | Vernier TDL                | 129                       | N/A              | -144,<br>+214   |
| [64]      | 65           | Multi-phase Clocks         | 625                       | N/A              | 31.2            |
| [70]      | 90           | Ring Oscillators           | 40                        | N/A              | < 40            |
| [71]      | 350          | Ring Oscillators           | 50                        | ±65              | ±35.75          |
| [72]      | 20           | Dual Sampling              | 3.9                       | N/A              | N/A             |
| [77]      | 40           | Multi-chain Averaging      | 4.2                       | -28.7,<br>+18.2  | -2.9,<br>+11.72 |
| [78]      | 65           | Pure Carry TDL             | 17                        | -51,<br>+43.86   | -17,<br>+60.35  |
| [79]      | 65           | Pure Carry TDL             | 15                        | ±60              | -15, +45        |
| [80]      | 28           | Wave Union TDL             | 10                        | N/A              | N/A             |
| [81]      | 28           | Two-stage<br>Interpolation | 4.5                       | -22.04,<br>+7.64 | -0.93,<br>+7.44 |

Table 3.2. FPGA-based TDC Performance Summary and Comparison

# **CHAPTER IV**

# WIDEBAND NOISE AND HARMONIC DISTORTION CANCELING READOUT FOR HIGH-FREQUENCY ULTRASOUND AND PHOTOACOUSTIC IMAGING

In this chapter, a complementary resistive shunt-feedback low-noise amplifier (LNA) with both noise and distortion cancellations [49] is developed to simultaneously achieve wideband impedance matching, low noise, and high linearity for high-frequency ultrasound (HFUS) and photoacoustic (PA) imaging applications. An electrical model of polyvinylidene fluoride (PVDF) ultrasound transducer is also developed to guide the LNA design. Designed in a 180 nm CMOS technology, the LNA achieves a less than 0.8 nV/sqrt(Hz) input-referred voltage noise density, a 19 dB voltage gain, and a -59.3 dBc total harmonic distortion (THD) at 80 MHz. An ultrasound front-end with the proposed LNA and a pseudo-differential variable gain amplifier (VGA) is also designed and fabricated. At 80 MHz, measurements show the front-end achieves an input-referred voltage noise density of 1.36 nV/sqrt(Hz), a better than -16 dB input return loss, a voltage gain of 37 dB, and a THD of -55 dBc. The front-end consumes 37 mW of power from a 1.8 V supply and achieves a noise efficiency factor (NEF) of 2.66.

## 4.1 PVDF Ultrasound Transducer Model

To simulate the LNA circuit together with the ultrasound element, an electrical model of PVDF transducer for HFUS and PA imaging is developed. Fig. 4.1(a) depicts the schematic of the ultrasound transducer, which is made of a 9-µm-thick piezoelectric material PVDF film, whose resonance frequency and bandwidth are around 50~80 MHz and 75~140 MHz, respectively [82, 83, 84]. The materials of the transducer electrodes are Indium Tin Oxide (ITO) and Aluminum (Al). The fabrication process is conducted as follows: 1) A 9-µm-thick polarized PVDF film was cut into pieces of suitable sizes; 2) With RF sputtering, the 200-nm-thick ITO electrodes were formed on the two



Fig. 4.1. (a) Schematic diagram of the PVDF transducer; (b) photograph of the PVDF transducer and its electrical model.



Fig. 4.2. Measured output impedance of the PVDF transducer.

surfaces of the PVDF film; 3) With DC sputtering, the aluminum electrodes were formed on the surfaces of the PVDF film. The effective sensing region is the  $2 \times 2 \text{ mm}^2$  transparent area at the center. Fig. 4.1(b) shows the photograph of the PVDF transducer and its electrical model. Measured with an impedance analyzer, Fig. 4.2 presents the electrical output impedance of the transducer.

As shown in Fig. 4.3(a), to measure the photoacoustic response of the transducer, an optically absorptive target made of black tape was used. The target was placed under the effective sensing region of the transducer. During the test, a small amount of water was added between the transducer and the target surface to improve the acoustic coupling efficiency. The 905-nm laser pulses (pulse width: 8 ns, pulse energy: 150 nJ/pulse, repetition rate: 1 kHz) were shot through the transducer onto the target. The PA signal excited from the surface of the target was detected by the transducer. A representative PA signal recorded by the transducer is depicted in Fig. 4.3(b).



Fig. 4.3. (a) Measurement setup for the PA response of the PVDF transducer; (b) recorded PA signal induced by laser pulse onto black tape.

Given the typical resonance frequency of the PVDF transducer being around 50~80 MHz, the input impedance of the LNA should be designed to match at this frequency. The corresponding  $R_s$  as shown in Fig. 4.2 is about 50 ± 10  $\Omega$ . Also, considering the typical bandwidth of the PVDF transducer being 75~140 MHz around the resonant frequency, it is preferred that the LNA has a flat gain response from 30 MHz to 120 MHz. With a typical transducer-received PA signal of 1 mV as shown in Fig. 4.3(b), the LNA along with its post variable gain amplifier is designed to have about 40 dB voltage gain and a better than -50 dBc THD within the transducer resonant frequency range.

## 4.2 LNA Design and Analysis

#### 4.2.1 Resistive Shunt Feedback

The resistive shunt-feedback amplifier, as depicted in Fig. 4.4 [41], is chosen as the basic building block for the proposed high-frequency LNA. The resistive shunt-



Fig. 4.4. Resistive shunt-feedback amplifier enabling high bandwidth and wideband impedance matching.

feedback structure offers a high  $f_{-3dB}$  bandwidth and wideband impedance matching simultaneously.

Ignoring the effect of parasitic capacitance  $C_{gs}$  of the input transistor  $M_1$  and assuming that  $R_L \gg R_F$ , the input impedance of the shunt-feedback amplifier can be derived as

$$Z_{\rm IN} = \frac{R_{\rm F} + R_{\rm L}}{1 + g_{\rm m1} R_{\rm L}} \approx \frac{1}{g_{\rm m1}},\tag{4.1}$$

where  $g_{m1}$  is the transconductance of transistor  $M_1$ . The input impedance matching condition is then obtained as

$$R_{\rm S} = Z_{\rm IN} = \frac{1}{g_{\rm m1}},\tag{4.2}$$

where  $R_{\rm S}$  is the source impedance.

Next, we derive the signal gain of the amplifier  $\frac{V_{Y,S}}{V_S}$ . The gain from node *X* to node *Y* is derived as

$$\frac{V_{\rm Y,S}}{V_{\rm X,S}} = \frac{\left(1 - g_{\rm m1} R_{\rm F}\right) R_{\rm L}}{R_{\rm F} + R_{\rm L}} \approx 1 - g_{\rm m1} R_{\rm F}, \tag{4.3}$$

where  $V_{Y,S}$  and  $V_{X,S}$  are the signals at the amplifier output node and the gate node of transistor  $M_1$ , respectively. Then, under the input impedance matched condition the signal gain  $\frac{V_{Y,S}}{V_S}$  is derived as

$$\frac{V_{\rm Y,S}}{V_{\rm S}} = \frac{V_{\rm Y,S}}{V_{\rm X,S}} \times \frac{V_{\rm X,S}}{V_{\rm S}} = \left(1 - g_{\rm m1}R_{\rm F}\right) \frac{R_{\rm S}}{R_{\rm S} + Z_{\rm IN}} = \frac{1}{2} \left(1 - \frac{R_{\rm F}}{R_{\rm S}}\right). \tag{4.4}$$
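
Eqs. (4.1)-(4.4) can be evaluated numerically to see how close the approximation is. In the sketch below, the 500 Ω feedback resistor and the 50 kΩ load are hypothetical values picked only to satisfy $R_L \gg R_F$; they are not the values used in the design, and the function names are assumptions.

```python
import math


def z_in_exact(g_m1, r_f, r_l):
    """Input impedance of the shunt-feedback stage, exact form of Eq. (4.1)."""
    return (r_f + r_l) / (1.0 + g_m1 * r_l)


def matched_signal_gain(r_s, r_f):
    """Signal gain V_Y/V_S under input matching, Eq. (4.4)."""
    return 0.5 * (1.0 - r_f / r_s)


r_s = 50.0           # source impedance (ohm)
g_m1 = 1.0 / r_s     # matching condition, Eq. (4.2)
r_f = 500.0          # hypothetical feedback resistor (ohm)
r_l = 50e3           # hypothetical load resistance (ohm), R_L >> R_F

z_in = z_in_exact(g_m1, r_f, r_l)
gain = matched_signal_gain(r_s, r_f)
gain_db = 20.0 * math.log10(abs(gain))
```

For these values the exact input impedance is within about 1% of $1/g_{m1}$, and the gain is negative, confirming that the stage is inverting whenever $R_F > R_S$.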

The major noise components of the amplifier are the thermal noise of $R_{\rm S}$ and that of the amplifier input transistor $M_1$. The noise factor of the amplifier can be derived as [39],

$$F > 1 + \frac{4kT\gamma \times Z_{\rm IN}}{4kTR_{\rm S}} + \chi = 1 + \gamma + \chi > 2, \tag{4.5}$$

where *k* is Boltzmann's constant, *T* is the absolute temperature in kelvins, $\gamma$ is the channel thermal noise coefficient with $1 < \gamma < 2$ for submicron n-channel MOSFETs [85], and $\chi$ represents the flicker noise and the thermal noise induced by other parts of the circuit (e.g., $R_F$ and $R_L$). Since Eq. (4.5) gives $F > 2$, the achievable noise figure (NF) is larger than 3 dB and in practice is often larger than 5 dB. The noise performance of the wideband resistive shunt-feedback amplifier clearly needs to be improved.
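
The bound in Eq. (4.5) translates into a noise figure as follows. The value $\chi = 0.2$ used in the last line is a hypothetical contribution from the other noise sources, chosen only to illustrate how the 5 dB level is reached; the function name is an assumption.

```python
import math


def nf_lower_bound_db(gamma, chi=0.0):
    """Noise figure implied by F = 1 + gamma + chi, the bound of Eq. (4.5)."""
    return 10.0 * math.log10(1.0 + gamma + chi)


nf_best = nf_lower_bound_db(1.0)               # gamma = 1, no other noise
nf_gamma2 = nf_lower_bound_db(2.0)             # gamma = 2, no other noise
nf_practical = nf_lower_bound_db(2.0, chi=0.2) # hypothetical extra noise
```

Even in the most favourable case the matched stage cannot go below about 3 dB, and modest additional noise pushes the figure past 5 dB.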

#### 4.2.2 Feedforward Noise Cancellation

To improve the noise performance, feedforward noise cancellation is exploited. Fig. 4.5 [41] depicts the noise-canceling technique where an auxiliary amplifier is used to generate an in-phase signal and an out-of-phase noise with respect to those of the main amplifier  $M_1$ . The main amplifier  $M_1$  has an inverting signal gain as shown in Eq. (4.4). The noise gain of the main amplifier  $M_1$  can be derived as

$$\frac{V_{\rm Y,N}}{V_{\rm X,N}} = 1 + \frac{R_{\rm F}}{R_{\rm S}},\tag{4.6}$$

where $V_{X,N}$ denotes the input-referred thermal noise of $M_1$, and $V_{Y,N}$ is the corresponding output noise. From Eq. (4.6), the main amplifier $M_1$ has a non-inverting gain for noise. To cancel the noise of $M_1$, the auxiliary amplifier, as shown in Fig. 4.5, is designed to have an inverting amplification for both the signal and the noise at node *X*. With the main amplifier and the auxiliary amplifier exhibiting opposite noise gains, it is thus possible to achieve noise cancellation.



Fig. 4.5. Block diagram of feedforward noise-canceling technique.



Fig. 4.6. Noise-canceling resistive shunt-feedback LNA.

Fig. 4.6 [41] shows the transistor-level implementation of the noise-canceling resistive shunt-feedback LNA, where transistor $M_1$ works as the main amplifier, $M_2$ with a cascode structure works as the auxiliary amplifier, and $M_4$ works as a source follower combining the outputs of both amplifiers. The main amplifier and the auxiliary amplifier are connected in parallel with respect to node *X*. The input impedance matching condition of the LNA can be derived to be the same as Eq. (4.2).

Ignoring the small gain reduction due to the source follower, the signal gain  $\frac{V_{Z,S,M}}{V_S}$  and the noise gain  $\frac{V_{Z,N,M}}{V_{X,N}}$  at node Z contributed by the main amplifier can be obtained as

$$\frac{V_{\rm Z,S,M}}{V_{\rm S}} = \frac{V_{\rm Y,S,M}}{V_{\rm S}} = \frac{1}{2} \left( 1 - \frac{R_{\rm F}}{R_{\rm S}} \right) \tag{4.7}$$

and

$$\frac{V_{Z,N,M}}{V_{X,N}} = \frac{V_{Y,N,M}}{V_{X,N}} = 1 + \frac{R_F}{R_S}. \tag{4.8}$$

The signal gain $\frac{V_{Z,S,A}}{V_S}$ and the noise gain $\frac{V_{Z,N,A}}{V_{X,N}}$ of the auxiliary amplifier are derived as

$$\frac{V_{Z,S,A}}{V_S} = \frac{V_{Z,S,A}}{V_{X,S,A}} \times \frac{V_{X,S,A}}{V_S} = -\frac{1}{2} \frac{g_{m2}}{g_{m4}} \tag{4.9}$$

and

$$\frac{V_{Z,N,A}}{V_{X,N}} = -\frac{g_{m2}}{g_{m4}}, \tag{4.10}$$

where  $g_{m2}$  and  $g_{m4}$  are the transconductance of transistors  $M_2$  and  $M_4$ , respectively.

With Eqs. (4.7)-(4.10), the noise-canceling condition at node Z can thus be obtained as

$$\frac{V_{Z,N,M}}{V_{X,N}} + \frac{V_{Z,N,A}}{V_{X,N}} = \left(1 + \frac{R_F}{R_S}\right) - \frac{g_{m2}}{g_{m4}} = 0 \implies 1 + \frac{R_F}{R_S} = \frac{g_{m2}}{g_{m4}}, \tag{4.11}$$

and the total signal gain  $A_{S,TOT}$  is given by

$$A_{\rm S,TOT} = \frac{V_{\rm Z,S,M}}{V_{\rm S}} + \frac{V_{\rm Z,S,A}}{V_{\rm S}} = -\frac{R_{\rm F}}{R_{\rm S}}. \tag{4.12}$$

Under the noise-canceling condition Eq. (4.11), the noise of the main amplifier  $M_1$  is canceled by the auxiliary amplifier  $M_2$ . The input-referred noise of the overall amplifier is only determined by the auxiliary amplifier  $M_2$  and can be made small with a large  $g_{m2}$  without impairing the impedance matching condition which is only determined by  $g_{m1}$  as shown in Eq. (4.2). The LNA thus can achieve both wideband impedance matching and low-noise performance.
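
The cancellation expressed by Eqs. (4.7)-(4.12) can be verified numerically. In this sketch the 400 Ω feedback resistor is a hypothetical value, and the function name `noise_cancel_lna` is an assumption; only the ratio $g_{m2}/g_{m4}$ matters, so the individual transconductances are not specified.

```python
def noise_cancel_lna(r_s, r_f, gm2_over_gm4):
    """Return (total signal gain, residual M1 noise gain) at node Z."""
    main_signal = 0.5 * (1.0 - r_f / r_s)    # Eq. (4.7)
    main_noise = 1.0 + r_f / r_s             # Eq. (4.8)
    aux_signal = -0.5 * gm2_over_gm4         # Eq. (4.9)
    aux_noise = -gm2_over_gm4                # Eq. (4.10)
    return main_signal + aux_signal, main_noise + aux_noise


r_s, r_f = 50.0, 400.0                       # r_f is a hypothetical value
ratio = 1.0 + r_f / r_s                      # noise-canceling condition, Eq. (4.11)

signal_gain, residual_noise = noise_cancel_lna(r_s, r_f, ratio)
# A 10% error in gm2/gm4 leaves part of the M1 noise uncancelled.
_, residual_mismatch = noise_cancel_lna(r_s, r_f, 0.9 * ratio)
```

When the condition of Eq. (4.11) holds, the residual noise gain is zero and the signal gain equals $-R_F/R_S$, in agreement with Eq. (4.12); a mismatch in $g_{m2}/g_{m4}$ leaves a proportional residue.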

#### 4.2.3 Complementary CMOS Topology

To suppress the second-order harmonics of the single-ended ultrasound LNA structure, complementary CMOS topology is investigated [49]. As depicted in Fig. 4.7, the complementary CMOS amplifier consists of a PMOS-based resistive shunt-feedback amplifier  $M_P$  in parallel with an NMOS-based resistive shunt-feedback amplifier  $M_N$ . The drain current in each sub-amplifier as a function of the input signal is respectively obtained as [42]:

$$i_{\rm ds,P} = I_{\rm DS,P} + g_{\rm m,P}(-v_{\rm gs}) + \frac{1}{2!}g'_{\rm m,P}(-v_{\rm gs})^2 + \frac{1}{3!}g''_{\rm m,P}(-v_{\rm gs})^3 + \dots \tag{4.13}$$

and

$$i_{\rm ds,N} = I_{\rm DS,N} + g_{\rm m,N}(v_{\rm gs}) + \frac{1}{2!}g'_{\rm m,N}(v_{\rm gs})^2 + \frac{1}{3!}g''_{\rm m,N}(v_{\rm gs})^3 + \dots, \tag{4.14}$$

where  $g'_{m,P}$ ,  $g'_{m,N}$ ,  $g''_{m,P}$  and  $g''_{m,N}$  are the first-order and the second-order derivatives of the transconductances  $g_{m,P}$  and  $g_{m,N}$  with respect to the gate-to-source voltage  $v_{gs}$ , respectively.

Combining the two currents at the output node, the AC output current of the complementary amplifier is obtained as

$$i_{\text{out}} = i_{\text{ds},\text{N}} - i_{\text{ds},\text{P}} = (g_{\text{m},\text{N}} + g_{\text{m},\text{P}})(v_{\text{gs}}) + \frac{1}{2!} (g'_{\text{m},\text{N}} - g'_{\text{m},\text{P}})(v_{\text{gs}})^2 + \frac{1}{3!} (g''_{\text{m},\text{N}} + g''_{\text{m},\text{P}})(v_{\text{gs}})^3 + \dots \tag{4.15}$$



Fig. 4.7. Complementary resistive shunt-feedback amplifier.

From Eq. (4.15), it can be observed that the second-order harmonic can be largely canceled if $g'_{m,P} \approx g'_{m,N}$. The simulated drain currents ($i_{ds}$) and their derivative characteristics ($g_m$ and $g'_m$) of a 240-µm/0.18-µm PMOS $M_P$ and a 120-µm/0.18-µm NMOS $M_N$ are shown in Fig. 4.8. With proper biasing, $g_{m,P}$ and $g_{m,N}$ can be well defined to achieve $g'_{m,P} \approx g'_{m,N}$. As the second-order harmonic dominates the nonlinearity of the single-ended LNA, the proposed complementary CMOS amplifier structure thus provides an extremely low-cost solution to attain good linearity while avoiding the noise and power consumption penalties of a dedicated single-ended to differential conversion circuit.
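
The effect of the $g'_m$ matching on the second harmonic can be estimated directly from the quadratic term of Eq. (4.15). The sketch below is a rough open-loop estimate of the drain-current second harmonic only; it ignores feedback, the third-order term, and the auxiliary path, so it should not be expected to reproduce the simulated THD. The transconductance values are those quoted for $M_{1P}$ and $M_{1N}$ in Section 4.3, while the 0.5 mV gate amplitude and the function name are assumptions.

```python
import math


def hd2_dbc(g_m, g_m_prime, v_amp):
    """Second-harmonic level (dBc) of i = g_m*v + (g_m'/2)*v^2, v = v_amp*cos(wt)."""
    fundamental = g_m * v_amp
    # (g_m'/2) * v_amp^2 * cos^2(wt) contributes (g_m'/4) * v_amp^2 at 2w.
    second = abs(g_m_prime) * v_amp ** 2 / 4.0
    return 20.0 * math.log10(second / fundamental)


G_M_P, G_M_N = 9.81e-3, 9.57e-3     # A/V, from Section 4.3
G_MP_P, G_MP_N = 67e-3, 61e-3       # A/V^2, from Section 4.3
V_AMP = 0.5e-3                      # hypothetical gate-voltage amplitude (V)

hd2_nmos_only = hd2_dbc(G_M_N, G_MP_N, V_AMP)
# Complementary pair: linear terms add, quadratic terms subtract (Eq. 4.15).
hd2_complementary = hd2_dbc(G_M_N + G_M_P, G_MP_N - G_MP_P, V_AMP)
improvement_db = hd2_nmos_only - hd2_complementary
```

The improvement is independent of the assumed amplitude because both harmonic levels scale with it in the same way; it is set by how closely $g'_{m,P}$ and $g'_{m,N}$ match and by the doubling of the linear transconductance.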

With the complementary CMOS topology, the input impedance matching condition is jointly determined by both  $M_P$  and  $M_N$ , and can be obtained as

$$R_{\rm S} = Z_{\rm IN} = \frac{1}{g_{\rm m,P} + g_{\rm m,N}}. \tag{4.16}$$

The signal gain of the complementary resistive shunt-feedback amplifier is derived as

$$\begin{aligned}\frac{V_{Z,S}}{V_S} &= \left(\frac{V_{YP,S}}{V_{X,S}} + \frac{V_{YN,S}}{V_{X,S}}\right) \times \frac{V_{X,S}}{V_S} \\ &= \left[\left(1 - g_{m,P}R_F\right) + \left(1 - g_{m,N}R_F\right)\right] \frac{R_S}{R_S + Z_{IN}} \\ &= \frac{1}{2}\left(2 - \frac{R_F}{R_S}\right).\end{aligned} \tag{4.17}$$



Fig. 4.8. Simulated  $i_{ds}$ ,  $g_m$  and  $g'_m$  of a 240- $\mu$ m/0.18- $\mu$ m PMOS  $M_P$  and a 120- $\mu$ m/0.18- $\mu$ m NMOS  $M_N$  for the complementary resistive shunt-feedback amplifier.

#### 4.2.4 Current-Reuse Technique

To reduce the power consumption of the LNA, the current-reuse technique developed for radio-frequency amplifiers [46] is investigated. Modifying the complementary amplifier in Fig. 4.7 by stacking the *P*-path amplifier with the *N*-path amplifier and removing the biasing current sources, the current-reuse resistive shunt-feedback amplifier is constructed and is shown in Fig. 4.9. The input impedance matching condition of the current-reuse amplifier is similar to Eq. (4.16) and is obtained as

$$R_{\rm S} = Z_{\rm IN} = \frac{R_{\rm F} + \left(r_{\rm out,P,CR} // r_{\rm out,N,CR}\right)}{1 + \left(g_{\rm m,P,CR} + g_{\rm m,N,CR}\right) \times \left(r_{\rm out,P,CR} // r_{\rm out,N,CR}\right)} \approx \frac{1}{g_{\rm m,P,CR} + g_{\rm m,N,CR}}, \tag{4.18}$$

where  $g_{m,P,CR}$  and  $g_{m,N,CR}$  are the transconductance of transistors  $M_{P,CR}$  and  $M_{N,CR}$ , respectively, assuming  $r_{out,P,CR}//r_{out,N,CR} \gg R_F$ . Under the input impedance matched condition, the signal gain is derived with the superposition principle as

$$\begin{aligned}\frac{V_{Z,S}}{V_S} &= \frac{V_{Z,S}}{V_{X,S}} \times \frac{V_{X,S}}{V_S} = \frac{1}{2} \left[ \left( 1 - g_{m,P,CR} R_F \right) + \left( 1 - g_{m,N,CR} R_F \right) \right] \\ &= \frac{1}{2} \left( 2 - \frac{R_F}{R_S} \right),\end{aligned} \tag{4.19}$$

which is the same as Eq. (4.17). In the current-reuse structure,  $M_{P,CR}$  and  $M_{N,CR}$  are carefully sized so that the output common-mode voltage is close to half  $V_{DD}$ . The input



Fig. 4.9. Current-reuse resistive shunt-feedback amplifier.

common-mode voltage is set by the output common-mode voltage through the feedback resistor $R_{\rm F}$. Compared with the complementary resistive shunt-feedback amplifier in Fig. 4.7, the current-reuse structure only requires half of the DC current to maintain the impedance matching condition, thus reducing the amplifier power consumption by half. Removing the biasing current sources also allows a low supply voltage to be used as long as $M_{\rm P,CR}$ and $M_{\rm N,CR}$ are in saturation, which helps to further reduce the amplifier power consumption.

Simulation results [86] show that the path stacking of the complementary resistive shunt-feedback amplifier brings a 1 mA DC current reduction and the use of a 1.3 V supply further reduces the power consumption by 9 mW, while the LNA can still maintain an $f_{-3dB}$ bandwidth larger than 150 MHz. The use of a low $V_{DD}$, however, mandates large-sized transistors, especially in the auxiliary amplifiers, to maintain low-noise performance. This leads to a large parasitic capacitance at the amplifier input node, slightly degrading the amplifier bandwidth and $S_{11}$ at high frequencies. Therefore, in designing the low-$V_{DD}$ current-reuse amplifier with feedforward noise canceling, the $V_{DD}$ needs to be carefully selected to trade off noise, power saving, bandwidth, and high-frequency $S_{11}$.

## 4.3 LNA Design with Simulation Results

A resistive shunt-feedback LNA [49] with the feedforward noise-canceling technique and the complementary topology is designed in a 180 nm CMOS technology. The current-reuse structure is not included in the design to maintain the $S_{11}$ better than -15 dB without using inductors to absorb the large capacitance at the amplifier input node. As shown in Fig. 4.10, the complementary topology is formed by both the *N*-path and the *P*-path amplifiers, where each path employs resistive shunt-feedback with feedforward noise cancellation. In the figure, $M_{1P}$ and $M_{1N}$ are the main amplifiers, $M_{2P}$ and $M_{2N}$ are the auxiliary amplifiers, and $M_{4P}$ and $M_{4N}$ are the analog combiners, respectively. The sizes of the $M_{1P}$ and $M_{1N}$ transistors are 240-µm/0.18-µm and 120-µm/0.18-µm, respectively. Both transistors are biased with 1 mA current. The corresponding $g_{m,1P}$ and $g_{m,1N}$ are 9.81 mA/V and 9.57 mA/V, respectively, leading to an input impedance $Z_{\rm IN}$ of about 51.6 $\Omega$. The corresponding $g'_{m,1P}$ is 67 mA/V<sup>2</sup> and $g'_{m,1N}$ is 61 mA/V<sup>2</sup>, which helps to achieve the cancellation of the second-order



Fig. 4.10. Proposed resistive shunt-feedback LNA with the feedforward noise-canceling technique and the complementary topology.

harmonic distortion. The $M_{2P}$ and $M_{2N}$ transistors are sized as 960-µm/0.18-µm and 480-µm/0.18-µm, respectively, with the corresponding $g_{m,2P}$ being 36.3 mA/V and $g_{m,2N}$ being 38.3 mA/V to achieve low-noise performance.
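
The quoted input impedance follows from Eq. (4.16) and the two main-amplifier transconductances. The sketch below also converts the result into a return loss against a 50 Ω source; this is a purely resistive, low-frequency estimate that ignores the input parasitic capacitance, so it is more optimistic than the simulated $S_{11}$. The function names are assumptions.

```python
import math


def matched_input_impedance(gm_p, gm_n):
    """Input impedance of the complementary stage, Eq. (4.16)."""
    return 1.0 / (gm_p + gm_n)


def return_loss_db(z_in, z_0=50.0):
    """|S11| in dB for a purely resistive input impedance z_in."""
    gamma = abs((z_in - z_0) / (z_in + z_0))
    return 20.0 * math.log10(gamma)


z_in = matched_input_impedance(9.81e-3, 9.57e-3)   # g_m,1P and g_m,1N in A/V
s11_lf = return_loss_db(z_in)                      # low-frequency estimate
```

The result is about 51.6 Ω, as stated in the text, and the small residual mismatch to 50 Ω alone would correspond to an $S_{11}$ well below -30 dB.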



Fig. 4.11. Simulated  $S_{11}$  of the proposed LNA.



Fig. 4.12. Simulated frequency response of the proposed LNA.

Figs. 4.11 and 4.12 show the simulated  $S_{11}$  and frequency response of the LNA, respectively. The  $S_{11}$  is better than -17 dB over the desired ultrasound transducer operation frequency range of 30-120 MHz. The  $S_{11}$  at higher frequencies is slightly degraded by the main amplifier input parasitic capacitance. The frequency response shows that the LNA has a gain of 19 dB up to 120 MHz in the typical corner and a  $f_{-3dB}$  bandwidth of 770 MHz. The gain variation over process, voltage, and temperatures (PVTs) is less than 2 dB.

As shown in Fig. 4.13, with feedforward noise cancellation the amplifier input-referred voltage noise density is reduced by about $3 \times$ over PVTs. The input-referred voltage noise density is $\leq 0.8$ nV/sqrt(Hz) over the 30-120 MHz frequency range. The THD



Fig. 4.13. Simulated input-referred voltage noise density of the proposed LNA.

| $V_{\rm IN,PP}$ (mV)                      | 0.5   | 1     | 2     |
|-------------------------------------------|-------|-------|-------|
| $P_{\rm IN}$ (dBm)                        | -62   | -56   | -50   |
| THD (dBc) @ 80 MHz, NC only (NMOS)        | -55.1 | -49.1 | -42.7 |
| THD (dBc) @ 80 MHz, NC only (PMOS)        | -55.7 | -49.6 | -43.5 |
| THD (dBc) @ 80 MHz, NC with Complementary | -65.3 | -59.3 | -53.7 |
Table 4.1. Simulated THD of the LNA

simulation is also performed, and the simulation results are summarized in Table 4.1. The signal generated by the transducer is in the range of 0.5-2 mV peak-to-peak, corresponding to an input signal power of -62 dBm to -50 dBm in a 50  $\Omega$  terminated system. As shown in Table 4.1, the complementary CMOS topology provides a THD improvement of larger than 9 dB across the input signal range.
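
The input power levels and the THD improvement quoted above can be reproduced from Table 4.1. In this sketch the conversion assumes a sinusoidal input across a 50 Ω termination, and the improvement is taken relative to the better of the two single-polarity designs; the function and variable names are assumptions.

```python
import math


def vpp_to_dbm(v_pp, r_ohm=50.0):
    """Power (dBm) of a sine wave with peak-to-peak voltage v_pp across r_ohm."""
    v_rms = v_pp / (2.0 * math.sqrt(2.0))
    return 10.0 * math.log10(v_rms ** 2 / r_ohm / 1e-3)


# Simulated THD (dBc) at 80 MHz from Table 4.1, keyed by V_IN,PP in mV.
THD_DBC = {
    0.5: {"nmos": -55.1, "pmos": -55.7, "complementary": -65.3},
    1.0: {"nmos": -49.1, "pmos": -49.6, "complementary": -59.3},
    2.0: {"nmos": -42.7, "pmos": -43.5, "complementary": -53.7},
}

improvements = {}
for v_pp_mv, thd in THD_DBC.items():
    p_in_dbm = vpp_to_dbm(v_pp_mv * 1e-3)
    # Compare against the better (more negative) single-polarity result.
    improvements[v_pp_mv] = min(thd["nmos"], thd["pmos"]) - thd["complementary"]
    print(f"{v_pp_mv} mVpp = {p_in_dbm:.1f} dBm, "
          f"THD improvement {improvements[v_pp_mv]:.1f} dB")
```

The three input levels round to -62 dBm, -56 dBm, and -50 dBm, and the improvement stays above 9 dB in every column.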

## 4.4 Measurement Results

A high-frequency ultrasound and photoacoustic imaging readout ASIC as shown in Fig. 4.14, including the proposed LNA and a pseudo-differential VGA, has been



Fig. 4.14. Block diagram of the HFUS and PA readout ASIC.



Fig. 4.15. Die photo of the HFUS and PA readout ASIC.

developed and fabricated in a one-poly six-metal (1P6M) 180 nm bulk CMOS process with a core area of 380 $\mu$m × 350 $\mu$m. The gain of the VGA is programmable from 20 to 32 dB with a 6-dB gain step. Fig. 4.15 shows the die photo of the front-end. Fig. 4.16 shows the measurement setup. The $S_{11}$ and frequency response are measured with the Keysight N5247A network analyzer. The noise and the THD are measured with the Keysight E4438C signal generator and the Keysight N9040B signal analyzer.

The Gain Method [87] has been applied to measure the NF. The NF in the Gain Method can be expressed as

$$NF = P_{NOUTD} + 174 \text{ dBm/Hz} - A_V, \qquad (4.20)$$

where $P_{\text{NOUTD}}$ is the measured output noise power density in dBm/Hz, -174 dBm/Hz is the thermal noise power density at the 290 K reference temperature, and $A_{\text{V}}$ is the measured front-end gain in dB. Based on






Fig. 4.16. (a) Measurement setup; (b) $S_{11}$ and frequency response measurement; and (c) noise and THD measurement.

the measured NF, the corresponding input-referred voltage noise density, $e_{\rm NI}$, in a 50 $\Omega$ terminated system is obtained as

$$e_{\rm NI} = \sqrt{4kT \times R_{50} \times \left[ \left( 10^{\rm NF/10} \right)^2 - 1 \right]}. \tag{4.21}$$

The measured  $S_{11}$  is better than -16 dB over the frequency range of 30-120 MHz, as shown in Fig. 4.17. Setting the VGA with a 20 dB gain, the measured frequency response of the front-end is shown in Fig. 4.18. Due to the bandwidth limitation of the VGA, the measured  $f_{-3dB}$  bandwidth of the front-end is 89 MHz, which is close to the simulation result. The noise performance of the front-end over 30-120 MHz is shown in



Fig. 4.17. Measured  $S_{11}$  of the front-end.

Fig. 4.19. The measured input-referred voltage noise density of the front-end is 1.36 nV/sqrt(Hz) at 80 MHz, which closely matches the simulated result of 1.26 nV/sqrt(Hz). Fig. 4.20 shows the measured THD of the front-end. With $V_{IN, PP} = 1 \text{ mV}$, the measured THD is better than -55 dBc over 30-80 MHz and degrades to -51 dBc at 120 MHz.

The noise efficiency factor [88] which considers the overall trade-off among noise, power consumption, and bandwidth is obtained for the proposed front end. The noise efficiency factor is defined as

$$\mathrm{NEF} = V_{\rm NI,RMS} \cdot \sqrt{\frac{2I_{\rm TOT}}{\pi \cdot V_{\rm T} \cdot 4kT \cdot \rm BW}}, \tag{4.22}$$



Fig. 4.18. Measured frequency response of the front-end.

where $V_{\text{NI,RMS}}$ is the total input-referred voltage noise, $I_{\text{TOT}}$ is the total current drawn by the circuit, $V_{\text{T}}$ is the thermal voltage, and BW is the front-end bandwidth. The input-referred noise of the front-end over 30-120 MHz is 14.2 µV and the $I_{\text{TOT}}$ is 20.56 mA. The NEF of the front-end is thus determined as 2.66.



Fig. 4.19. Measured input-referred voltage noise density of the front-end.



Fig. 4.20. Measured THD of the front-end.

Table 4.2 summarizes the performance of the front-end and compares it with recently published ultrasound amplifiers. The front-end achieves a low input-referred voltage noise density, a low THD, a high $f_{-3dB}$ bandwidth, and competitive power consumption, demonstrating the best NEF.

|                                                  | This<br>Work <sup>*</sup> | [89]* | [90]** | [91]** | [92]* | [93]* | [94]* | [95]* |
|--------------------------------------------------|---------------------------|-------|--------|--------|-------|-------|-------|-------|
| Tech. [nm]                                       | 180                       | 180   | 28     | 180    | 130   | 350   | 350   | 180   |
| Supply<br>Voltage [V]                            | 1.8                       | 1.8   | 1.0    | 1.8    | 3     | 3.3   | ±2.5  | 1.8   |
| Bandwidth<br>[MHz]                               | 90                        | 100   | 100    | 30     | 10    | 75    | 30    | 33    |
| Gain [dB]                                        | 37                        | 17.6  | 20     | 15.2   | 36    | 20    | 12    | 19.1  |
| Input-referred<br>Noise Density<br>[nV/sqrt(Hz)] | 1.36                      | 4.19  | 1.74   | 3.5    | 7.41  | 2.68  | 6.3   | 1.01  |
| Total Input-<br>referred Noise<br>[µV]           | 14.2                      | N/A   | 20.8   | 34.9   | 23.4  | N/A   | 35.6  | 5.8   |
| THD [dBc]                                        | -55                       | N/A   | N/A    | N/A    | N/A   | N/A   | N/A   | -53.5 |
| <i>S</i> <sub>11</sub> [dB]                      | -16                       | N/A   | N/A    | N/A    | N/A   | N/A   | N/A   | N/A   |
| Power<br>Consumption<br>[mW]                     | 37                        | 43    | 2      | 0.27   | 12.6  | N/A   | 20    | 16.2  |
| Noise<br>Efficiency<br>Factor                    | 2.66                      | N/A   | 3.57   | 3.02   | 18.51 | N/A   | 15.36 | 3.69  |

Table 4.2. Ultrasound Front-end Performance Summary and Comparison

\* Measurement results; \*\* Simulation results.
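The NEF definition in (4.22) can be evaluated numerically from the quantities listed in Table 4.2. The sketch below is only an illustration: it assumes a temperature of 300 K and assumes that $I_{\text{TOT}}$ can be estimated as the power consumption divided by the supply voltage, neither of which is stated in the text. The function name `noise_efficiency_factor` and the variable names are hypothetical. As a check of units, the [95] column (5.8 µV, 16.2 mW from 1.8 V, 33 MHz) is used as input.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def noise_efficiency_factor(v_ni_rms, i_tot, bandwidth, temperature=300.0):
    """Evaluate the NEF expression of (4.22).

    v_ni_rms    : total input-referred voltage noise, V (RMS)
    i_tot       : total supply current, A
    bandwidth   : front-end bandwidth, Hz
    temperature : absolute temperature, K (300 K is an assumed value)
    """
    v_t = K_BOLTZMANN * temperature / Q_ELECTRON  # thermal voltage kT/q
    four_kt = 4.0 * K_BOLTZMANN * temperature
    return v_ni_rms * math.sqrt(
        2.0 * i_tot / (math.pi * v_t * four_kt * bandwidth)
    )


# Inputs taken from the [95] column of Table 4.2.
# Assumption: I_TOT = power consumption / supply voltage.
i_tot_ref95 = 16.2e-3 / 1.8
nef_ref95 = noise_efficiency_factor(5.8e-6, i_tot_ref95, 33e6)
print(f"NEF for the [95] column: {nef_ref95:.2f}")
```

Under these assumptions the result agrees with the 3.69 listed for [95] in Table 4.2. Because the NEF scales linearly with the input-referred noise and with the square root of the supply current, the same function can be used to explore how a change in either quantity would shift a design in the comparison.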

# **CHAPTER V**

# **CONCLUSION AND FUTURE WORKS**

#### 5.1 Conclusion

CMOS readout ASICs for medical imaging applications and high-energy physics experiments have been in high demand in recent years. This dissertation presents three high-performance CMOS readout ASICs for SiPM detectors and high-frequency ultrasound imaging applications. A summary of this dissertation follows.

The first readout ASIC implements an on-chip high-pass filter (HPF)-based fast signal generator and a customized successive-approximation-register ADC for SiPM readout. The on-chip filter sharpens the slow rising edge of the standard SiPM signal, improving timing resolution and facilitating fast timing measurements while avoiding the power, area, and cost penalties of the off-chip approach. Measurement results show that the on-chip HPF improves the timing resolution by 15 ps. By employing the SiPM charge integrator as the ADC track-and-hold sampler, which reduces the power consumption by 9%, the customized SAR ADC consumes 147 $\mu$W of power and achieves an SNDR of 53.08 dB and an SFDR of 62.74 dB at 1 MS/s. Additionally, a current-mode readout front-end with a current-feedback, low-input-impedance current buffer is developed. The readout ASIC achieves a timing resolution of 151 ps and a maximum gain nonlinearity of 3.3% over an input charge range of 800 pC, while dissipating 4.02 mW of power from a 1.8 V supply. The on-chip fast signal generator and the customized SAR ADC facilitate SiPM readout with improved timing and power consumption performance without increasing the system cost, area, or design complexity.

The second readout ASIC is presented as part of a low-power, highly linear DAQ system for SiPM detectors that features a 16-channel readout ASIC and a 32-channel FPGA-based TDC. The current-mode front-end of the readout utilizes a low-input-impedance current buffer, a programmable cascode current mirror, a stability-enhanced current discriminator, and a dynamic voltage comparator to achieve accurate charge readout over a large dynamic range without degrading the rising edge of the SiPM signal. A SAR ADC that is shared by the 16 readout channels in a time-multiplexed manner is developed. The shared ADC mechanism significantly reduces the overall chip area and power consumption of the readout ASIC. Linearity optimization of the FPGA-based TDC is addressed through multi-chain averaging, bin realignment, avoidance of clock region crossing, and use of a synchronized-enable pulse detector. The current-mode readout ASIC is implemented in a 180 nm CMOS technology and achieves a maximum gain nonlinearity of less than 3.6% over an input charge range of 800 pC with a 1 µs conversion time, while dissipating only 3.89 mW of power per readout channel. The FPGA-based TDC achieves a 15 ps RMS resolution, a DNL of less than 4 ps, and an INL of less than 10 ps.

The third readout ASIC presents a low-noise amplifier for high-frequency ultrasound and photoacoustic imaging applications. The LNA employs a resistive shunt-feedback configuration to simultaneously achieve a large $f_{-3dB}$ bandwidth and wideband impedance matching. To mitigate the noise in the resistive shunt-feedback amplifier, a feedforward noise-canceling technique is developed. A complementary CMOS topology is also developed to cancel the second-order nonlinear distortion of the amplifier. An ultrasound receiver front-end including the proposed LNA and a pseudo-differential VGA has been fabricated in a standard 180 nm CMOS technology. Measured at 80 MHz, the front-end achieves an input-referred voltage noise density of 1.36 nV/sqrt(Hz), a -16.4 dB input return loss, a 37 dB voltage gain, and a -55 dBc THD, while consuming 37 mW of power. The front-end demonstrates the best NEF among the compared designs, with a large $f_{-3dB}$ bandwidth, wideband impedance matching, low noise, low harmonic distortion, and competitive power consumption, making it suitable for high-frequency ultrasound and photoacoustic imaging applications.

### 5.2 Future Directions

The demand for CMOS readout ASICs in signal conditioning applications is expected to persist. Several topics would be interesting to address in future work.

Highly linear, high-resolution TDCs are needed in the SiPM readout system to maintain high timing resolution. An on-chip high-speed TDC shared among multiple SiPM readout channels, similar to the ADC sharing structure presented in Chapter III, can further reduce the number of output pins of multi-channel SiPM readout ASICs and make the whole readout system more compact. In addition, a CMOS-based on-chip SiPM detector, which requires a special imaging technology for SPAD generation, is another approach to improving the integration of the SiPM readout system. Finally, many design techniques (such as loop-unrolled comparison, asynchronous SAR logic, and low-$V_{DD}$ SAR operation) can be explored to further reduce the power consumption of the customized ADC.

As mentioned in Section 4.4, the $S_{11}$ of the LNA is degraded at higher frequencies due to the large parasitic capacitance at the amplifier input node. A properly sized inductor connected in series with the input node of the LNA can be applied to improve the $S_{11}$ performance at specific high frequencies. In addition, low-power ADCs are necessary in the ultrasound readout system to convert the analog signals into digital signals for subsequent image reconstruction, as shown in Fig. 1.8. Therefore, designing a low-power SAR ADC for ultrasound readout applications can be considered as a further step toward a complete ultrasound readout system.

# REFERENCES

- [1] A. Shukla and U. Kumar, "Positron emission tomography: an overview," *Journal of Medical Physics*, vol. 31, no. 1, pp. 13–21, Jan. 2006.
- [2] E Garutti, "Silicon photomultipliers for high energy physics detectors," *Journal of Instrumentation*, vol. 6, C10003, pp. 1–14, Oct. 2011.
- [3] L. Braga, L. Gasparini, L. Grant, R. Henderson, N. Massari, M. Perenzoni, D. Stoppa, and R. Walker, "A fully digital 8 × 16 SiPM array for PET applications with per-pixel TDCs and real-time energy output," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 1, pp. 301–314, Jan. 2014.
- [4] M. Perenzoni, D. Perenzoni, and D. Stoppa, "A 64 × 64-pixels digital silicon photomultiplier direct TOF sensor with 100-MPhotons/s/pixel background rejection and imaging/altimeter mode with 0.14% precision up to 6 km for spacecraft navigation and landing," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 1, pp. 151–160, Jan. 2017.
- [5] N. Otte, "The silicon photomultiplier: a new device for high energy physics, astroparticle physics, industrial and medical applications," in *Proceeding of International Symposium on the Development of Detectors for Particle, Astro-Particle and Synchrotron Radiation Experiments*, Apr. 2016, pp. 1–9.
- [6] A. Guerra, N. Belcari, M. Bisogni, G. Llosá, S. Marcatili, and S. Moehrs, "Advances in position-sensitive photodetectors for PET applications," *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment*, vol. 604, no. 1–2, pp. 319–322, Jun. 2009.

- [7] Onsemi, "Introduction to the silicon photomultiplier (SiPM)," accessed: Aug. 2021. [Online]. Available: https://www.onsemi.com.
- [8] P. Trigilio, "Development of an ASIC for SiPM readout in SPECT applications," Ph.D. dissertation, Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, Milan, Italy, Jan. 2016.
- [9] B. Aull, A. Loomis, D. Young, R. Heinrichs, B. Felton, P. Daniels, and D. Landers, "Geiger-mode avalanche photodiodes for three-dimensional imaging," *Lincoln Laboratory Journal*, vol. 13, no. 2, pp. 335–349, 2002.
- [10] R. Mao, L. Zhang, and R. Zhu, "Optical and scintillation properties of inorganic scintillators in high energy physics," *IEEE Transactions on Nuclear Science*, vol. 55, no. 4, pp. 2425–2431, Aug. 2008.
- [11] T. H. Wilmshurst, *Signal Recovery from Noise in Electronic Instrumentation*, Bristol, U.K.: Hilger, 1990.
- [12] S. Dolinsky, G. Fu, and A. Ivan, "Timing resolution performance comparison for fast and standard outputs of SensL SiPM," in *Proceeding of IEEE Nuclear Science Symposium and Medical Imaging Conference*, Oct. 2013, pp. 1–6.
- [13] Hamamatsu, "S13360 series multi-pixel photon counters datasheet," accessed: Aug. 2021. [Online]. Available: https://www.hamamatsu.com.
- [14] Onsemi, "C-series low noise, blue-sensitive silicon photomultipliers datasheet," accessed: Aug. 2021. [Online]. Available: https://www.onsemi.com.

- [15] Z. Deng, A. Lan, X. Sun, C. Bircher, Y. Liu, and Y. Shao, "Development of an eight-channel time-based readout ASIC for PET applications," *IEEE Transactions on Nuclear Science*, vol. 58, no. 6, pp. 3212–3218, Dec. 2011.
- [16] C. Piemonte, "A new silicon photomultiplier structure for blue light detection," *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment*, vol. 568, no. 1, pp. 224–232, Nov. 2006.
- [17] Diagnostic Nuclear Medicine, "Energy resolution," accessed: Aug. 2021. [Online]. Available: http://www.medimaging.gr/cd/pages/par3.htm.
- [18] M. Grodzicka, M. Moszyński, T. Szcześniak, M. Kapusta, M. Szawłowski and D. Wolski, "Energy resolution of scintillation detectors with SiPM light readout," in *Proceeding of IEEE Nuclear Science Symposium & Medical Imaging Conference*, Oct. 2010, pp. 1940–1948.
- [19] Saint-Gobain Crystals, "LYSO scintillation material," accessed: Aug. 2021. [Online]. Available: https://www.crystals.saint-gobain.com.
- [20] M. Grodzicka, M. Moszyński, T. Szcześniak, M. Kapusta, M. Szawłowski and D. Wolski, "Energy resolution of small scintillation detectors with SiPM light readout," *Journal of Instrumentation*, vol. 8, no. 2, pp. 1–17, Feb. 2013.
- [21] H. Xu, M. Perenzoni, N. Massari, A. Gola, A. Ferri, and D. Stoppa, "A 30-ns recovery time, 11.5-nC input charge range, 16-channel read-out ASIC for PET application," in *Proceeding of IEEE European Solid-State Circuits Conference*, Sep. 2015, pp. 360–363.

- [22] X. Zhu, Z. Deng, Y. Chen, Y. Liu, and Y. Liu, "Development of a 64-channel readout ASIC for an 8 × 8 SSPM array for PET and TOF-PET applications," *IEEE Transactions on Nuclear Science*, vol. 63, no. 3, pp. 1327–1334, Jun. 2016.
- [23] W. W. Moses, M. Janecek, M. A. Spurrier, P. Szupryczynski, W. S. Choong, C. L. Melcher, and M. Andreaco, "Optimization of an LSO-based detector module for time-of-flight PET," *IEEE Transactions on Nuclear Science*, vol. 57, no. 3, pp. 1570–1576, Jun. 2010.
- [24] T. Harion, K. Briggl, H. Chen, P. Fischer, A. Gil, V. Kiworra, M. Ritzert, H. C. Schultz-Coulon, W. Shen and V. Stankova, "STiC—a mixed mode silicon photomultiplier readout ASIC for time-of-flight applications," *Journal of Instrumentation*, vol. 9, no. 2, pp. 1–8, Feb. 2014.
- [25] J. Y. Yeom, R. Vinke, and C. S. Levin, "Optimizing timing performance of silicon photomultiplier-based scintillation detectors," *Physics in Medicine & Biology*, vol. 58, no. 4, pp. 1207–1220, Feb. 2013.
- [26] M. Biroth, P. Achenbach, W. Lauth and A. Thomas, "Characterization of SiPM properties at liquid nitrogen temperature," in *Proceeding of IEEE Nuclear Science Symposium, Medical Imaging Conference and Room-Temperature Semiconductor Detector Workshop*, Oct. 2016, pp. 1–3.
- [27] K. Shung, "High frequency ultrasonic imaging," *Journal of Medical Ultrasound*, vol. 17, no. 1, pp. 25–30, Mar. 2009.
- [28] C. Chandrana, J. Talman, T. Pan, S. Roy, and A. Fleischman, "Design and analysis of MEMS based PVDF ultrasonic transducers for vascular imaging," *Sensors*, vol. 10, no. 9, pp. 8740–8750, Sep. 2010.

- [29] R. Bitton, R. Zemp, J. Yen, L. Wang, and K. Shung, "A 3-D high-frequency array based 16 channel photoacoustic microscopy system for in vivo micro-vascular imaging," *IEEE Transactions on Medical Imaging*, vol. 28, no. 8, pp. 1190–1197, Aug. 2009.
- [30] K. Daoudi, B. Kersten, C. van den Ende, F. van den Hoogen, M. Vonk, and C. de Korte, "Photoacoustic and high-frequency ultrasound imaging of systemic sclerosis patients," *Arthritis Research & Therapy*, vol. 23, no. 22, pp. 1–8, Jan. 2021.
- [31] J. Scampini, "Optimizing ultrasound receiver VGA output-referred noise and gain," *Maxim's Engineering Journal*, vol. 60, no. 2, pp. 9–13, Jun. 2007.
- [32] B. Mika, "Design and testing of piezoelectric sensors," Master's thesis, Mechanical Engineering, Texas A&M University, College Station, USA, Aug. 2007.
- [33] A. Carovac, F. Smajlovic, and D. Junuzovic, "Application of ultrasound in medicine," *Acta Informatica Medica*, vol. 19, no. 3, pp. 168–171, Sep. 2011.
- [34] A. Carazo, "Novel piezoelectric transducers for high voltage measurements," Ph.D. dissertation, Industrial Engineering, Universitat Politècnica de Catalunya, Barcelona, Spain, Jan. 2000.
- [35] Wikipedia, "Piezoelectric sensor," accessed: Sep. 2021. [Online]. Available: https://en.wikipedia.org/wiki/Piezoelectric\_sensor.
- [36] P. O'Connor and G. Geronimo, "Prospects for charge sensitive amplifiers in scaled CMOS," in *Proceeding of IEEE Nuclear Science Symposium and Medical Imaging Conference*, Oct. 1999, pp. 88–93.

- [37] X. Li, Q. Zhang, and Y. Sun, "A low noise charge sensitive amplifier with adjustable leakage compensation in 0.18 µm CMOS process," in *Proceeding of IEEE International Conference of Electron Devices and Solid-State Circuits*, Dec. 2010, pp. 1–4.
- [38] T. Chang, J. Chen, L. A. Rigge, and J. Lin, "ESD-protected wideband CMOS LNAs using modified resistive feedback techniques with chip-on-board packaging," *IEEE Transactions on Microwave Theory and Techniques*, vol. 56, no. 8, pp. 1817–1826, Aug. 2008.
- [39] F. Bruccoleri, E. A. M. Klumperink, and B. Nauta, "Wide-band CMOS low-noise amplifier exploiting thermal noise canceling," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 2, pp. 275–282, Feb. 2004.
- [40] J. Eriksrod, and T. Ytterdal, "A 65nm CMOS front-end LNA for medical ultrasound imaging with feedback employing noise and distortion cancellation," in *Proceeding of European Conference on Circuit Theory and Design*, Sep. 2013, pp. 1–4.
- [41] Y. Tang, Y. Feng, Z. Zuo, Q. Fan, C. Fang, J. Zou, and J. Chen, "Wideband LNA with 1.9 dB noise figure in 0.18 μm CMOS for high frequency ultrasound imaging applications," in *Proceeding of IEEE New Circuits and Systems Conference*, Jun. 2016, pp. 1–4.
- [42] I. Nam, B. Kim, and K. Lee, "CMOS RF amplifier and mixer circuits utilizing complementary characteristics of parallel combined NMOS and PMOS devices," *IEEE Transactions on Microwave Theory and Techniques*, vol. 53, no. 5, pp. 1662–1671, May 2005.

- [43] X. Zhang, Z. Chen, Y. Gao, F. Ma, J. Hao, G. Zhu, and B. Chi, "An interference-robust reconfigurable receiver with automatic frequency-calibrated LNA in 65-nm CMOS," *IEEE Transactions on Very Large Scale Integration Systems*, vol. 25, no. 11, pp. 3113–3124, Nov. 2017.
- [44] C. Geha, C. Nguyen, and J. Martinez, "A wideband low-power-consumption 22–32.5-GHz 0.18-μm BiCMOS active balun-LNA with IM<sub>2</sub> cancellation using a transformer-coupled cascode-cascade topology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 65, no. 2, pp. 536–547, Feb. 2017.
- [45] M. Rahman, and R. Harjani, "A sub-1V, 2.8 dB NF, 475 μW coupled LNA for internet of things employing dual-path noise and nonlinearity cancellation," in *Proceeding of IEEE Radio Frequency Integrated Circuits Symposium*, Jun. 2017, pp. 236–239.
- [46] T. Taris, J. B. Begueret, and Y. Deval, "A low voltage current reuse LNA in a 130 nm CMOS technology for UWB applications," in *Proceeding of European Microwave Integrated Circuit Conference*, Oct. 2007, pp. 307–310.
- [47] Y. Tang, Q. Fan, Y. Feng, H. Deng, R. Zhang, and J. Chen, "A low-power SiPM readout front-end with fast pulse generation and successive-approximation register ADC in 0.18 μm CMOS," in *Proceeding of IEEE International Symposium on Circuits and Systems*, May 2019, pp. 1–4.
- [48] Y. Tang, T. Townsend, H. Deng, Y. Liu, R. Zhang, and J. Chen, "A highly linear FPGA-based TDC and a low-power multichannel readout ASIC with a shared SAR ADC for SiPM detectors," *IEEE Transactions on Nuclear Science*, vol. 68, no. 8, pp. 2286–2293, Aug. 2021.

- [49] Y. Tang, Y. Feng, Q. Fan, C. Fang, J. Zou, and J. Chen, "A wideband complementary noise and distortion canceling LNA for high-frequency ultrasound imaging applications," in *Proceeding of IEEE Texas Symposium on Wireless and Microwave Circuits and Systems*, Apr. 2018, pp. 1–4.
- [50] F. Corsi, C. Marzocca, M. Foresta, G. Matarrese, A. Del Guerra, S. Marcatili, G. Llosa, G. Collazuol, G. F. Dalla Betta, and C. Piemonte, "Preliminary results from a current mode CMOS front-end circuit for silicon photomultiplier detectors," in *Proceeding of IEEE Nuclear Science Symposium Conference Record*, Oct. 2007, pp. 360–365.
- [51] T. Tsai, J. Hong, L. Wang, and S. Lee, "Low-Power Analog Integrated Circuits for Wireless ECG Acquisition Systems," *IEEE Transactions on Information Technology in Biomedicine*, vol. 16, no. 5, pp. 907–917, Sep. 2012.
- [52] D. Moni and S. Jose, "Design of 10b SAR ADC for biomedical applications," in *Proceeding of International Conference on Electronics and Communication Systems*, Feb. 2015, pp. 276–281.
- [53] D. Zhang, A. Bhide and A. Alvandpour, "A 53-nW 9.1-ENOB 1-kS/s SAR ADC in 0.13 μm CMOS for medical implant devices," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 7, pp. 1585–1593, July 2012.
- [54] B. Razavi, "The StrongARM Latch [A Circuit for All Seasons]," *IEEE Solid-State Circuits Magazine*, vol. 7, no. 2, pp. 12–17, Jun. 2015.
- [55] G. Fernández, D. Gascón, and J. Rosa, "Design of a 9-bit 4MS/s Wilkinson ADC for SiPM-based imaging detectors," in *Proceeding of IEEE International Midwest Symposium on Circuits and Systems*, Oct. 2016, pp. 1–4.

- [56] E. Delagnes, D. Breton, F. Lugiez, and R. Rahmanifard, "A low power multichannel single ramp ADC with up to 3.2 GHz virtual clock," *IEEE Transactions on Nuclear Science*, vol. 54, no. 5, pp. 1735–1742, Oct. 2007.
- [57] W. Gao, C. Guo, T. Wei, D. Gao, and Y. Hu, "A 12-bit 2.5MS/s multi-channel ramp analog-to-digital converter for imaging detectors," in *Proceeding of IEEE International Workshop on Imaging Systems and Techniques*, May 2009, pp. 183–186.
- [58] G. Montagnani, F. Sancandi, G. Cozzi, C. Fiorini, L. Buonanno, and M. Carminati, "GAMMA: an 8-channel high dynamic range ASIC for SiPM-based readout of large scintillators," in *Proceeding of IEEE Nuclear Science Symposium and Medical Imaging Conference*, Oct. 2017, pp. 1–3.
- [59] P. A. P. Calò, S. Petrignani, C. Marzocca, B. Markovic, and A. Dragone, "A CMOS front-end for timing and charge readout of silicon photomultipliers," in *Proceeding of IEEE Nuclear Science Symposium and Medical Imaging Conference*, Oct. 2019, pp. 1–5.
- [60] X. Liu, L. Ma, J. Xiang, N. Yan, H. Xie, and X. Cai, "A low power TDC with 0.5ps resolution for ADPLL in 40nm CMOS," in *Proceeding of IEEE International Conference on ASIC*, Nov. 2015, pp. 1–4.
- [61] D. Schug, V. Nadig, B. Weissler, P. Gebhardt, and V. Schulz, "Initial measurements with the PETsys TOFPET2 ASIC evaluation kit and a characterization of the ASIC TDC," *IEEE Transactions on Radiation and Plasma Medical Sciences*, vol. 3, no. 4, pp. 444–453, Jul. 2019.

- [62] J. Kalisz, R. Szplet, J. Pasierbinski, and A. Poniecki, "Field-programmable-gate-array-based time-to-digital converter with 200-ps resolution," *IEEE Transactions on Instrumentation and Measurement*, vol. 46, no. 1, pp. 51–55, Feb. 1997.
- [63] J. Kalisz, R. Szplet, R. Pelka, and A. Poniecki, "Single-chip interpolating time counter with 200-ps resolution and 43-s range," *IEEE Transactions on Instrumentation and Measurement*, vol. 46, no. 4, pp. 851–856, Aug. 1997.
- [64] A. Balla, M. Beretta, P. Ciambrone, M. Gatta, F. Gonnella, L. Iafolla, M. Mascolo, R. Messi, D. Moricciani, and D. Riondino, "The characterization and application of a low resource FPGA-based time to digital converter," *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment*, vol. 739, pp. 75–82, Jan. 2014.
- [65] S. Henzler, "Time-to-digital converter basics," *Time-to-Digital Converters* (Springer Series in Advanced Microelectronics), Springer, Dordrecht, Netherlands, pp. 5–18, Jan. 2010.
- [66] Y. Cao, P. Leroux, and M. Steyaert, "Background on time-to-digital converters," *Radiation-Tolerant Delta-Sigma Time-to-Digital Converters*, pp. 15–23, Springer, Cham, Switzerland, Feb. 2015.
- [67] M. Zhang, H. Wang, and Y. Liu, "A 7.4 ps FPGA-based TDC with a 1024-unit measurement matrix," *Sensors*, vol. 17, no. 865, pp. 1–18, Apr. 2017.
- [68] J. Wu and Z. Shi, "The 10-ps wave union TDC: improving FPGA TDC resolution beyond its cell delay," in *Proceeding of IEEE Nuclear Science Symposium Conference Record*, Oct. 2008, pp. 3440–3446.

- [69] Q. Shen, L. Zhao, S. Liu, S. Liao, B. Qi, X. Hu, C. Peng, and Q. An, "A fast improved fat tree encoder for wave union," *Chinese Physics C*, vol. 37, no. 10, pp. 102–106, Oct. 2013.
- [70] S. Junnarkar, P. O'Connor, and R. Fontaine, "FPGA based self-calibrating 40 picosecond resolution, wide range time to digital converter," in *Proceeding of IEEE Nuclear Science Symposium Conference Record*, Oct. 2008, pp. 3434–3439.
- [71] A. Muntean, "Design of a fully digital analog SiPM with sub-50ps time conversion," Master's Thesis, Department of Microelectronics, Delft University of Technology, Delft, Netherlands, Nov. 2017.
- [72] Y. Wang, and C. Liu, "A 3.9 ps time-interval RMS precision time-to-digital converter using a dual-sampling method in an ultrascale FPGA," *IEEE Transactions on Nuclear Science*, vol. 63, no. 5, pp. 2617–2621, Jul. 2016.
- [73] P. Fischer, I. Peric, M. Ritzert, and M. Koniczek, "Fast self-triggered multichannel readout ASIC for time- and energy measurement," *IEEE Transactions on Nuclear Science*, vol. 56, no. 3, pp. 1153–1158, Jun. 2009.
- [74] H. Homulle and E. Charbon, "Basic FPGA TDC design," TUDelft, accessed: Aug. 2020. [Online]. Available: https://cas.tudelft.nl.
- [75] Z. Jaworski, "Verilog HDL model based thermometer-to-binary encoder with bubble error correction," in *Proceeding of International Conference Mixed Design of Integrated Circuits and Systems*, Jun. 2016, pp. 249–254.
- [76] T. Townsend, "FPGA-based data acquisition system for SiPM detectors," Master's Thesis, Department of Electrical and Computer Engineering, University of Houston, Houston, Texas, United States, May 2019.

- [77] Q. Shen, S. Liu, B. Qi, Q. An, S. Liao, P. Shang, C. Peng, and W. Liu, "A 1.7 ps equivalent bin size and 4.2 ps RMS FPGA TDC based on multichain measurements averaging method," *IEEE Transactions on Nuclear Science*, vol. 62, no. 3, pp. 947–954, Jun. 2015.
- [78] C. Favi, and E. Charbon, "A 17ps time-to-digital converter implemented in 65nm FPGA technology," in *Proceeding of International Symposium on Field Programmable Gate Arrays*, Jan. 2009, pp. 22–24.
- [79] L. Zhao, X. Hu, S. Liu, J. Wang, Q. Shen, H. Fan, and Q. An, "The design of a 16-channel 15 ps TDC implemented in a 65 nm FPGA," *IEEE Transactions on Nuclear Science*, vol. 60, no. 5, pp. 3532–3536, Oct. 2013.
- [80] C. Liu, and Y. Wang, "A 128-channel, 710 M samples/second, and less than 10 ps RMS resolution time-to-digital converter implemented in a Kintex-7 FPGA," *IEEE Transactions on Nuclear Science*, vol. 62, no. 3, pp. 773–783, Jun. 2015.
- [81] R. Szplet, P. Kwiatkowski, Z. Jachna, and K. Różyc, "An eight-channel 4.5-ps precision timestamps-based time interval counter in FPGA chip," *IEEE Transactions on Instrumentation and Measurement*, vol. 65, no. 9, pp. 2088–2100, Sep. 2016.
- [82] W. Zou, S. Holland, K. Kim, and W. Sachse, "Wideband high-frequency line-focus PVDF transducer for materials characterization," *Ultrasonics*, vol. 41, no. 3, pp. 157–161, May 2003.
- [83] F. Foster, K. Harasiewicz and M. Sherar, "A history of medical and biological imaging with polyvinylidene fluoride (PVDF) transducers," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 47, no. 6, pp. 1363–1371, Nov. 2000.

- [84] S. Smolorz, and W. Grill, "Focusing PVDF transducers for acoustic microscopy," *Research in Nondestructive Evaluation*, vol. 7, no. 4, pp. 195–201, Dec. 1996.
- [85] A. Scholten, H. Tromp, L. Tiemeijer, R. Van Langevelde, R. Havens, P. De Vreede, R. Roes, P. Woerlee, A. Montree, and D. Klaassen, "Accurate thermal noise model for deep-submicron CMOS," in *Proceeding of International Electron Devices Meeting Technical Digest*, Dec. 1999, pp. 155–158.
- [86] Y. Tang, Y. Feng, Q. Fan, R. Zhang, and J. Chen, "A current reuse wideband LNA with complementary noise and distortion cancellation for ultrasound imaging applications," in *Proceeding of IEEE Asia Pacific Conference on Circuits and Systems*, Oct. 2018, pp. 171–174.
- [87] Maxim Integrated, "Noise figure measurement methods and formulas," accessed: Aug. 2021. [Online]. Available: https://www.maximintegrated.com.
- [88] M. Steyaert, W. Sansen, and Z. Chang, "A micropower low-noise monolithic instrumentation amplifier for medical purposes," *IEEE Journal of Solid-State Circuits*, vol. 22, no. 6, pp. 1163–1168, Dec. 1987.
- [89] J. Yoon, S. Lee, J. Kim, N. Song, J. Koh, and J. Choi, "Low-noise amplifier path for ultrasound system applications," in *Proceeding of IEEE Asia Pacific Conference on Circuits and Systems*, Dec. 2010, pp. 244–247.
- [90] L. Luo, Y. Wu, J. Diao, F. Ye and J. Ren, "Low power low noise amplifier with DC offset correction at 1 V supply voltage for ultrasound imaging systems," in *Proceeding of IEEE International Midwest Symposium on Circuits and Systems*, Aug. 2018, pp. 137–140.

- [91] J. Diao, S. Li, Y. Wu, F. Ye, J. Xu and J. Ren, "Energy-efficient analog front-end design for ultrasound imaging applications," in *Proceeding of IEEE International Midwest Symposium on Circuits and Systems*, Dallas, Aug. 2019, pp. 1171–1174.
- [92] Y. Wang, M. Koen, and D. Ma, "Low-noise CMOS TGC amplifier with adaptive gain control for ultrasound imaging receivers," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 58, no. 1, pp. 26–30, Jan. 2011.
- [93] I. Kim, H. Kim, F. Griggio, R. L. Tutwiler, T. N. Jackson, S. Trolier-McKinstry, and K. Choi, "CMOS ultrasound transceiver chip for high resolution ultrasonic imaging systems," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 3, no. 5, pp. 293–303, Oct. 2009.
- [94] L. Lay, S. Carey, and J. Hatfield, "Pre-amplifier arrays for intraoral ultrasound probe receiving electronics," in *Proceeding of IEEE Ultrasonics Symposium*, Aug. 2005, pp. 1753–1756.
- [95] S. Jung, S. Hong, and O. Kwon, "Low-power low-noise amplifier using attenuation-adaptive noise control for ultrasound imaging systems," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 11, no. 1, pp. 108–116, Feb. 2017.