© Copyright by Pendyala, Praveen 2017 All Rights Reserved

# DESIGN AND SIMULATION OF HIGH SPEED LOW-POWER DUAL-MODE (NRZ/PAM4) (12.8Gbps/25.6Gbps) SERIALIZER AND VCSEL DRIVER IN TSMC 65nm TECHNOLOGY

A Thesis Presented to the Faculty of the Department of Electrical and Computer Engineering University of Houston

> In Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Systems Engineering

by Praveen Gayatree Pendyala May 2017

### DESIGN AND SIMULATION OF HIGH SPEED LOW-POWER DUAL-MODE (NRZ/PAM4 12.8Gbps/25.6Gbps) SERIALIZER AND LASER DRIVER IN TSMC 65nm TECHNOLOGY

Praveen Gayatree Pendyala

Approved:

Chair of the Committee Dr. Jinghong Chen, Associate Professor, Electrical and Computer Engineering Department

Committee Members:

Dr. Shin-Shem Steven Pei, Professor, Electrical and Computer Engineering Department

Dr. Jiming Peng, Associate Professor, Industrial Engineering Department

Dr. Suresh K. Khator, Associate Dean, Cullen College of Engineering

Dr. Badri Roysam, Professor and Chair, Electrical and Computer Engineering Department


## ABSTRACT

This thesis presents the schematic-level design and simulation of a low-power (5.6 pJ/b), dual-mode (12.8 Gbps NRZ, 25.6 Gbps PAM4) serializer and driver for a high-speed serial-link transmitter application-specific integrated circuit (ASIC) intended for High-Energy Physics (HEP) experiments.

The serializer and driver are being designed in a 65 nm CMOS technology. The ASIC itself will mainly include an LC-VCO phase-locked-loop (PLL), a 32:2 serializer and a CML driver. The driver also employs FIR pre-emphasis using a 2-bit programmable buffer delay chain. The serializer, driver and pre-emphasis are designed based on a combination of architectures presented in literature.

A VCSEL model based on the literature is built as a Cadence schematic. A Verilog-A module is instantiated to emulate the non-linear optical low-pass filtering response of a VCSEL, and electrical components are added to form the electrical part of the VCSEL model.

The schematic of the PAM-4/NRZ transmitter presented in this thesis is shown to have an energy efficiency of 5.6pJ/b (with serializer) and 3.71pJ/b (without serializer). Substantial improvements in vertical eye openings and jitter were recorded due to pre-emphasis.


# **TABLE OF CONTENTS**

| Section    | Title                                          |
|------------|------------------------------------------------|
|            | ABSTRACT                                       |
|            | TABLE OF CONTENTS                              |
|            | LIST OF FIGURES                                |
|            | LIST OF TABLES                                 |
| 1.         | INTRODUCTION                                   |
| 2.         | BACKGROUND                                     |
| 2.1.       | High Speed Serial Link                         |
| 2.2.       | NRZ/PAM4 Communication                         |
| 2.2.1.     | NRZ (Non-Return-to-Zero)                       |
| 2.2.2.     | PAM-4 (Pulse Amplitude Modulation, 4-level)    |
| 2.3.       | Equalization (DFE, FFE)                        |
| 2.3.1.     | FFE                                            |
| 2.3.2.     | DFE                                            |
| 2.4.       | Clocking                                       |
| 2.5.       | PDK – TSMC 65NM                                |
| 2.5.1.     | Resistors                                      |
| 2.5.2.     | MOSFETs                                        |
| 2.6.       | PACKAGE PARASITICS                             |
| 3.         | TRANSMITTER ARCHITECTURE                       |
| 4.         | SERIALIZER                                     |
| 4.1.       | Introduction                                   |
| 4.2.       | Current Mode Logic                             |
| 4.3.       | Serializer Architecture                        |
| 4.4.       | 16:4 and 4:2 CMOS MUXES                        |
| 4.4.1.     | Architecture                                   |
| 4.4.2.     | Schematic                                      |
| 4.4.3.     | Simulations                                    |
| 4.5.       | CML MUX                                        |
| 4.5.1.     | Architecture                                   |
| 4.6.       | CML Latch                                      |
| 4.6.1.     | Cadence Schematics                             |
| 5.         | CHERRY-HOOPER PRE-DRIVER                       |
| 5.1.       | Cherry Hooper Architecture                     |
| 5.2.       | Modified Cherry Hooper as a CML Pre-driver     |
| 5.3.       | Simulation and Results                         |
| 6.         | VCSEL                                          |
| 6.1.       | Introduction                                   |
| 6.2.       | Static (DC) Response of VCSEL                  |
| 6.3.       | Dynamic (Frequency) Response of VCSEL          |
| 6.4.       | VCSEL Modelling                                |
| 6.4.1.     | Electrical Model                               |
| 6.4.2.     | Optical Model                                  |
| 6.5.       | Other Considerations                           |
| 6.5.1.     | Turn-On Delay                                  |
| 6.5.2.     | Off-State Bounce in VCSEL                      |
| 6.6.       | Cadence Modelling                              |
| 6.6.1.     | Optical Low Pass Filter Design                 |
| 6.6.2.     | VCSEL Design                                   |
| 7.         | DRIVER (PAM-4)                                 |
| 7.1.       | Top-Level Design                               |
| 7.2.       | Delay Tap                                      |
| 7.3.       | The Main Driver unit                           |
| 7.3.1.     | Cadence Schematics                             |
| 7.4.       | Pre-emphasis Driver                            |
| 7.4.1.     | FIR Filter Equalization                        |
| 7.5.       | Driver Simulation                              |
| 8.         | CONCLUSION AND FUTURE WORK                     |
|            | BIBLIOGRAPHY                                   |
| Appendix A | VCSEL Frequency Response Testbench             |

# **LIST OF FIGURES**

| Figure      | Caption                                                                                                                                        |
|-------------|------------------------------------------------------------------------------------------------------------------------------------------------|
| Figure 2-1  | The eye diagram and the transient response of (A) PAM-4 signal and (B) NRZ signal                                                              |
| Figure 2-2  | Fully equalized communication link                                                                                                             |
| Figure 2-3  | Feed Forward Equalizer - Delay tap architecture                                                                                                |
| Figure 2-4  | Classic Decision Feedback Algorithm                                                                                                            |
| Figure 2-5  | Jitter sensitivity on maximum data rate of RX and TX for (a) 3-Tap Pre-emphasis on TX only and (b) RX and TX 16 tap DFE                        |
| Figure 2-6  | Unifying Clock and Data terminologies                                                                                                          |
| Figure 2-7  | The Cross-section view of QFN package                                                                                                          |
| Figure 2-8  | (a) Three-dimensional structures of QFN-48 package and (b) Equivalent circuit model for analytical modeling of QFN packages                    |
| Figure 3-1  | Complete Architecture of Transmitter                                                                                                           |
| Figure 4-1  | Simple 2X1 MUX                                                                                                                                 |
| Figure 4-2  | Simple CML buffer architecture                                                                                                                 |
| Figure 4-3  | Complete Architecture of Serializer                                                                                                            |
| Figure 4-4  | Architecture of CMOS 2X1 half-rate multiplexed MUX                                                                                             |
| Figure 4-5  | D-Flip flop implementation using inverters and switches                                                                                        |
| Figure 4-6  | The architecture of XOR gate using transistors                                                                                                 |
| Figure 4-7  | Cadence schematic of CMOS 2x1 MUX                                                                                                              |
| Figure 4-8  | Cadence Schematic of CMOS D-Flip Flop                                                                                                          |
| Figure 4-9  | Cadence Schematic of a CMOS Inverter                                                                                                           |
| Figure 4-10 | Cadence schematic of a Tri-state inverter                                                                                                      |
| Figure 4-11 | Cadence Schematic of a NAND2 gate                                                                                                              |
| Figure 4-12 | Cadence Schematic of CMOS XOR gate                                                                                                             |
| Figure 4-13 | The output waveforms of the various stages in the CMOS serializer                                                                              |
| Figure 4-14 | 2:1 CML MUX Architecture                                                                                                                       |
| Figure 4-15 | Cadence Schematic of a CML Latch                                                                                                               |
| Figure 4-16 | Cadence Schematic of a CML MUX                                                                                                                 |
| Figure 4-17 | Cadence Schematic of the 2 to 1 CML serializer                                                                                                 |
| Figure 5-1  | Cherry-Hooper Architecture                                                                                                                     |
| Figure 5-2  | Modified Cherry Hooper Architecture for Pre-driver                                                                                             |
| Figure 5-3  | Cadence Schematic of the 2 stage Cherry-Hooper Pre-driver                                                                                      |
| Figure 5-4  | Cadence Simulations of the various CML stages                                                                                                  |
| Figure 6-1  | The physical cross section of the semiconductor VCSEL                                                                                          |
| Figure 6-2  | Static Characteristics of VCSEL P v/s I                                                                                                        |
| Figure 6-3  | Dynamic Characteristics of VCSEL – Frequency Response                                                                                          |
| Figure 6-4  | Electrical Model of 850nm VCSEL for simulations                                                                                                |
| Figure 6-5  | Optical Model of the VCSEL                                                                                                                     |
| Figure 6-6  | Complete VCSEL model to be used in simulations                                                                                                 |
| Figure 6-7  | Optical response of VCSEL                                                                                                                      |
| Figure 6-8  | Cadence Symbol for Optical LPF emulation in the VCSEL                                                                                          |
| Figure 6-9  | Cadence Schematic of the VCSEL Model (both static and dynamic)                                                                                 |
| Figure 6-10 | Cadence symbol for the VCSEL model taking in Current (mA) and emulating output power (mW) as a Voltage                                         |
| Figure 7-1  | The top-level block diagram of the Driver and Cherry Hooper Pre-driver                                                                         |
| Figure 7-2  | Cadence schematic of CML buffer in TSMC65nm Technology                                                                                         |
| Figure 7-3  | Cadence schematic of CML buffer 'Delay-Cell' in TSMC65nm Technology                                                                            |
| Figure 7-4  | Cadence schematic of CML 'Delay-Tap Chain' in TSMC65nm Technology                                                                              |
| Figure 7-5  | Cadence simulation of the transient response of the Delay Tap                                                                                  |
| Figure 7-6  | Main PAM-4 Driver Architecture                                                                                                                 |
| Figure 7-7  | Cadence Schematic of the Main Driver Unit                                                                                                      |
| Figure 7-8  | The current waveform before and after FFE pre-emphasis                                                                                         |
| Figure 7-9  | Change in high pass response of the FIR filter for different fractional delays $n_0$                                                           |
| Figure 7-10 | Driver module with delay blocks for pre-emphasis                                                                                               |
| Figure 7-12 | Cadence schematic of the PAM-4 Driver with pre-emphasis driver                                                                                 |
| Figure 7-13 | Cadence Schematic of Full transmitter test bench with VCSEL                                                                                    |
| Figure 7-13 | Eye diagram of NRZ Driver with (a) and without pre-emphasis (b) $t_{del} = 28$ ps (c) $t_{del} = 36$ ps (d) $t_{del} = 44$ ps                   |
| Figure 7-15 | Eye diagram of PAM-4 Driver with (a) and without pre-emphasis (b) $t_{del} = 20$ ps (c) $t_{del} = 28$ ps (d) $t_{del} = 36$ ps and (e) $t_{del} = 44$ ps |
| Figure A-1  | VCSEL AC Analysis Testbench                                                                                                                    |
| Figure A-2  | Frequency Response of VCSEL Output Power                                                                                                       |

# LIST OF TABLES

| Table     | Caption                                                      |
|-----------|--------------------------------------------------------------|
| Table 2-1 | Diffusion Resistors Available in PDK                         |
| Table 2-2 | Poly resistors Available in the PDK                          |
| Table 2-3 | N-Well Resistors Available with PDK                          |
| Table 2-4 | MOSFETs available with the PDK                               |
| Table 2-5 | QFN Package Parasitic values used in design                  |
| Table 6-1 | Parameters used for describing VCSEL in optical model        |
| Table 7-1 | The delays introduced by the buffer delay tap selection bits |
| Table 7-2 | NRZ Pre-emphasis performance improvements                    |
| Table 7-3 | PAM-4 Pre-emphasis performance improvements                  |
| Table 7-4 | Power consumption of the various components in the design    |

## 1. INTRODUCTION

In modern data centers, one major contributor to the expenses is the energy spent on powering the interconnect networks, which typically span a few hundred meters. In the past, copper wires were the main means of interconnection. However, as the demand for communication speed rose, the power and cooling demand of the interconnect framework became a major concern. As a result, optical fiber networks are gradually taking over the field, providing improved performance at lower energy demands. Therefore, a new generation of laser driver circuits based on modern technology nodes is required to keep up with the increasing demand in speed while maintaining high power efficiency. Migration to smaller feature sizes offers reduced power consumption, faster performance and smaller area at the CMOS level, which in turn can be used to design simpler driver circuits with the same or even better performance.

## 2. BACKGROUND

#### 2.1. High Speed Serial Link

The global demand for data-rate puts increasing demands on computation and information routing in data centers. Since power dissipated in the electronics is a major contributor to the energy consumption of these data centers, the overall energy efficiency of the communication system needs to be investigated. Furthermore, power is directly related to cost; therefore, energy efficiency can be used to assess the cost of the data-rate. The energy efficiency of a system is defined as the energy spent per bit of information transmitted or received. Energy efficiency is measured in pJ per bit $\left(\frac{pJ}{b}\right)$, or equivalently $\left(\frac{mW}{Gb/s}\right)$, and it is typically denoted by the symbol $\eta_{eff}$.
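The equivalence of the two units can be checked with a short calculation. The sketch below is illustrative only: the function name and the 100 mW example value are hypothetical and are not taken from the design reported in this thesis.

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Return the energy efficiency in pJ/b for a power in mW and a rate in Gb/s."""
    if data_rate_gbps <= 0:
        raise ValueError("data rate must be positive")
    # 1 mW / (1 Gb/s) = 1e-3 J/s / 1e9 b/s = 1e-12 J/b = 1 pJ/b,
    # so the ratio of the two numbers is already in pJ/b.
    return power_mw / data_rate_gbps


# Hypothetical example: a link drawing 100 mW while moving 25.6 Gb/s.
example_eff = energy_per_bit_pj(100.0, 25.6)
print(f"{example_eff:.2f} pJ/b")
```

Because mW per Gb/s reduces exactly to pJ/b, no scaling factor is needed in the function.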

In communication links within a data center, optical interconnects are overtaking their copper counterparts, since they are practically loss-less over the short distances of a LAN and offer better performance versus energy trade-offs. According to the ITRS, optical interconnect technology must reach an energy efficiency of 1 pJ/b for the complete link by 2020. With current technologies achieving efficiencies of about 15 to 50 pJ/bit, it is evident that there is a long way to go to meet the predictions of the ITRS (1).

The energy efficiency of an optical link is a rather complex issue to tackle since it reflects an equally complex system, which can fortunately be broken down into separate subproblems or subsystems. Following the flow of information within an optoelectronic link makes that complexity easy to see. The data, initially stored in a digital format, are first encoded, then transferred through on-chip interconnects to the light emitter, where they are transformed into optical pulses. The pulses then travel through optical fiber until they reach the optical receiver, where they are transformed back into electrical pulses to be decoded and stored or forwarded further.

The first distinction that can be made is between the different media through which the information travels. There are three subsystems (the transmitter, the optical medium and the receiver) that need to be optimized individually as well as collectively to achieve the best energy efficiency. Furthermore, on both the transmitter and the receiver there is an optoelectronic conversion: from the encoder to the driver electronics and then to the light emitter, or from the photo-receiver to the electronics and finally to the decoder. The respective efficiencies of those conversions need to be considered and optimized for the best trade-offs between data-rate and energy consumption. Even though substantial research is needed in all three fields, i.e. the transmitter circuit, the optical medium and the receiver circuit, this project is focused on the transmitter only.

### 2.2. NRZ/PAM4 Communication

Ever-increasing demands for high-speed data transmission with low-power technology continue to drive innovation in VCSEL laser driver transmission. Technological advances towards achieving greater speed present two design possibilities, NRZ and PAM-4, each of which comes with a unique set of challenges.



Figure 2-1 The eye diagram and the transient response of (A) PAM-4 signal and (B) NRZ signal

#### 2.2.1. NRZ (Non-Return-to-Zero)

NRZ (Non-Return-to-Zero) is an available technology and will continue a linear evolution from 100G (12.8 Gbps, 8 channels) to 400G (25.6 Gbps, 16 channels). From a time-domain perspective, NRZ consists of 1's and 0's and can be referred to as PAM-2 (pulse amplitude modulation, 2-level) for the two amplitude levels that contain 1 bit of information in every symbol.

The NRZ eye diagram (Figure 2-1(B)), providing timing and voltage used to measure link performance, contains a single eye.

The main challenges for NRZ communication are:

- shorter unit intervals (UI), i.e. smaller horizontal eye openings,
- totally closed eyes,
- tighter jitter requirements, and
- the mandatory use of forward error correction (FEC), in addition to equalization such as FFE.

Higher frequencies have higher channel loss, so high-speed transmission requires enhanced receiver equalization, such as continuous-time linear equalization (CTLE) and decision feedback equalization (DFE), to correct for it.

Standards are requiring increased receiver sensitivity (down to 50 mV). Jitter budgets are even tighter for 400G at 17ps UI and may be below intrinsic jitter of test equipment. FEC, a technique used for controlling errors in data transmission over unreliable or noisy communication channels, becomes a greater challenge with increased noise at the faster data rate. For the most part, channel loss and reflections (noise) are expected to be the biggest NRZ technological challenge as it continues its linear growth path.

#### 2.2.2. PAM-4 (Pulse Amplitude Modulation, 4-level)

PAM-4 (Pulse Amplitude Modulation, 4-level), from a time-domain perspective, has four digital amplitude levels (-3, -1, 1, and 3). From a frequency-domain perspective, PAM-4 requires half the bandwidth of NRZ for the same bit rate.

PAM-4 has an advantage over NRZ in that each level ("symbol") carries 2 bits of information, providing twice the throughput for the same Baud rate (12.8 GBaud PAM-4 = 25.6 Gb/s).
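The relation between Baud rate and bit rate, and the packing of two bits into each symbol, can be sketched as follows. The Gray-coded bit-to-level mapping used here is an assumption for illustration (the text does not specify the mapping), and the function names are hypothetical.

```python
import math

# Assumed Gray-coded mapping of bit pairs onto the four PAM-4 levels;
# adjacent levels differ in exactly one bit.
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def bit_rate_gbps(baud_gbaud: float, levels: int) -> float:
    """Bit rate = symbol rate x bits per symbol, with log2(levels) bits per symbol."""
    return baud_gbaud * math.log2(levels)


def pam4_encode(bits: list[int]) -> list[int]:
    """Group a bit stream into pairs and map each pair to one PAM-4 level."""
    if len(bits) % 2 != 0:
        raise ValueError("PAM-4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


print(bit_rate_gbps(12.8, 2))  # NRZ (PAM-2) at 12.8 GBaud
print(bit_rate_gbps(12.8, 4))  # PAM-4 at 12.8 GBaud
print(pam4_encode([0, 0, 0, 1, 1, 1, 1, 0]))
```

Eight input bits produce only four symbols, which is why PAM-4 doubles the throughput at an unchanged symbol rate.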

In the PAM-4 eye view (Figure 2-1(A)), you can see 3 vertical eyes created by the 4 levels. Unlike NRZ, where the decision level is fixed to 0 V for a differential signal, the three slicer levels used by a PAM-4 receiver can be adaptive, or time varying. The use of multi-level signaling for PAM-4 has entirely changed what has been expected in Ethernet test in the past.

Newly developed technology is required to accomplish implementation of PAM-4 components and serial links with changes to system test as more complex transmit and receive circuit designs are required to address PAM-4 challenges. PAM-4 technical challenges include a shift from saturating output stages to linear IO behavior in order to achieve multiple levels. New chip designs face the challenge of managing the size of the integrated circuits (ICs) supporting PAM-4 which have increased nearly 30%. The larger IC size is due to additional linear drivers and detectors and has resulted in up to 35% increase in power requirements as well. For Ethernet chips with large IO counts, the problems are even greater.

PAM-4 design and test pose several challenges (2):

**Clock Recovery**. Finite rise time acting on different transition amplitudes creates inherent inter symbol interference (ISI) and makes clock recovery much more difficult. Transition time of the PAM-4 data signal can create significant horizontal eye closure due to switching jitter, which is dependent on the rise and fall time of the signal. Transition qualified phase detectors are needed to look at analog levels for clock recovery. Whether direct detection (comparators), which require a lot of power, or digitizing ADCs, which are expensive, are used is still to be determined.

**Decision Feedback Equalization (DFE)**. DFE is used to calculate a correction value that is added to the logical decision threshold and results in the threshold shifting up or down so new logical decisions can be made on the waveform based upon the new equalized threshold level. The technology required to manage DFE and the possible addition or use of other equalizers for PAM-4 multi-levels is still to be determined.

**Loss of Signal to Noise Ratio (SNR)**. Each PAM-4 eye has 1/3 the amplitude of a comparable NRZ eye (an SNR loss of ~9.5 dB) due to level spacing, so the signal is more susceptible to noise. However, it is possible that the lower insertion loss seen by PAM-4 compensates for this 9.5 dB loss in SNR.
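The ~9.5 dB figure follows from the amplitude ratio: with the same peak-to-peak swing, four levels leave three eyes, each one third of the NRZ eye height, and 20·log10(3) ≈ 9.54 dB. A minimal check, assuming equally spaced levels and equal total swing (the function name is hypothetical):

```python
import math


def pam_snr_penalty_db(levels: int) -> float:
    """SNR penalty relative to NRZ for equally spaced levels and equal swing.

    With `levels` amplitude levels there are (levels - 1) stacked eyes, so
    each eye has 1 / (levels - 1) of the NRZ eye amplitude.
    """
    if levels < 2:
        raise ValueError("need at least two levels")
    return 20.0 * math.log10(levels - 1)


print(f"PAM-4 penalty: {pam_snr_penalty_db(4):.2f} dB")
```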



### 2.3. Equalization (DFE, FFE)

Figure 2-2 Fully equalized communication link

Figure 2-2 shows the fully equalized communication link. As can be seen, the link requires equalization both at the transmitter side (TX) and at the receiver side (RX). At the transmitter side, Feed Forward Equalization (FFE) is used to cancel the known distortion of the channel. At the receiver end, a CDR recovers the clock from the signal and samples the received data into a Decision Feedback Equalizer (DFE), which estimates the ISI affecting the next sample and corrects it, thereby reducing jitter.

#### 2.3.1. FFE



Figure 2-3 Feed Forward Equalizer - Delay tap architecture

In a feed forward filter, the input signal is delayed and the delayed samples are linearly combined with different weights to produce a signal which can withstand the channel distortion. The most common feed forward filter is the FFE delay tap shown in Figure 2-3. Although many taps are shown here (and are theoretically required to cancel the effect of the channel distortions), more taps lead to more area and power during implementation. So, a tradeoff is made between circuit speed and circuit complexity. Generally, one or two delay taps are enough to cancel the ISI effects of the channel.
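The delay-and-weight operation described above is an FIR filter, $y[n] = \sum_k c_k\, x[n-k]$. The sketch below is a behavioural illustration only, not the circuit implemented in this thesis; the tap weights (1.0 and -0.25) and the symbol pattern are hypothetical values chosen so that the arithmetic is easy to follow.

```python
def ffe(symbols: list[float], taps: list[float]) -> list[float]:
    """Symbol-spaced FIR feed forward equalizer.

    Computes y[n] = sum_k taps[k] * x[n - k], taking x[n] = 0 for n < 0.
    """
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, weight in enumerate(taps):
            if n - k >= 0:
                acc += weight * symbols[n - k]
        out.append(acc)
    return out


# Hypothetical two-tap pre-emphasis: main cursor 1.0, one delayed tap -0.25.
nrz = [1, 1, 1, -1, -1, 1]
print(ffe(nrz, [1.0, -0.25]))
# -> [1.0, 0.75, 0.75, -1.25, -0.75, 1.25]
```

In the output, samples that follow a transition are boosted (magnitude 1.25) while repeated symbols are attenuated (magnitude 0.75), which is the high-pass behaviour that pre-emphasis relies on.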

An FFE is usually sampled at the symbol rate (tap spacing *T*). Feed forward filters are also often sampled at 2 or 3 times the symbol rate, i.e. fractionally spaced (tap spacing T/2 or T/3). Fractional sampling allows the matched filter to be realized digitally and to adapt to channel variations, which is not possible with symbol-rate sampling. It also allows for simpler timing recovery schemes, since the FFE can take care of phase recovery.

#### 2.3.2. DFE



Figure 2-4 Classic Decision Feedback Algorithm

The Decision Feedback Equalizer is used at the receiver to recover the data correctly in the presence of harmful ISI. The classic structure of the Decision Feedback Equalizer is shown in Figure 2-4.

The Slicer block makes a symbol decision, i.e. quantizes the input. This decision is fed back into a feedback FIR filter. The feedback filter outputs an estimate of the ISI, which is subtracted directly from the incoming signal. As the name suggests, decision feedback decides the current symbol with the help of many previous symbol decisions.
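The slicer and feedback loop can be sketched behaviourally for NRZ symbols (±1) with a decision threshold at 0. This is a minimal illustration under stated assumptions, not a description of any receiver used with this design: the channel is taken to add a single post-cursor of 0.5 times the previous symbol, and the feedback tap is assumed to match it exactly.

```python
def dfe(received: list[float], feedback_taps: list[float]) -> tuple[list[int], list[float]]:
    """Decision feedback equalizer for +/-1 symbols.

    feedback_taps[k] weights the decision made (k + 1) symbols ago.
    Returns the symbol decisions and the ISI-corrected samples.
    """
    decisions: list[int] = []
    corrected_samples: list[float] = []
    for sample in received:
        # Feedback FIR: estimate the ISI caused by earlier symbols.
        isi = 0.0
        for k, weight in enumerate(feedback_taps):
            idx = len(decisions) - 1 - k
            if idx >= 0:
                isi += weight * decisions[idx]
        corrected = sample - isi
        corrected_samples.append(corrected)
        # Slicer: quantize the corrected sample to the nearest symbol.
        decisions.append(1 if corrected >= 0 else -1)
    return decisions, corrected_samples


# Assumed channel: y[n] = x[n] + 0.5 * x[n-1] for x = [1, -1, -1, 1, 1, -1].
rx = [1.0, -0.5, -1.5, 0.5, 1.5, -0.5]
bits, eq = dfe(rx, [0.5])
print(bits)  # -> [1, -1, -1, 1, 1, -1]
print(eq)    # -> [1.0, -1.0, -1.0, 1.0, 1.0, -1.0]
```

The received samples vary between 0.5 and 1.5 in magnitude, whereas the corrected samples all return to ±1, which shows the feedback path removing the post-cursor ISI before the slicer.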

### 2.4. Clocking

Many breakthroughs in circuit design and technology have resulted in the widespread adoption of multi-gigabit data links. At higher speeds, improved clocking methods play a significant role.


Past methods of data link clocking such as common clock or synchronous architectures relied on a central clocking source as the synchronization signal to retime the outbound data at the transmitter (TX) as well as sample the input data at the receiver (RX).

The maximum data rate of these legacy topologies was severely limited by clock interconnect bandwidth, uncompensated clock skew, channel latency, and jitter. The maximum data rates have increased beyond 10Gbps by the introduction of clocking methods such as (3):

- clock multiplication,
- forwarded clock recovery,
- embedded clock recovery,
- per-pin de-skew,
- jitter filtering, and
- duty-cycle error correction.

Well-designed multi-Gb/s data link architectures employ a combination of advanced equalization and precise clocking to balance the goals of performance, power efficiency, and cost. Architectures overly focused on equalization of the channel and not sufficiently optimized for clock quality may suffer from suboptimum tradeoffs between complexity, power efficiency and data rate.



Figure 2-5 Jitter sensitivity on maximum data rate of RX and TX for (a) 3-Tap Pre-emphasis on TX only and (b) RX and TX 16 tap DFE

Figure 2-5 demonstrates the high sensitivity of the maximum data rate to jitter when using TX-only equalization (3-tap pre-emphasis) or TX and RX equalization (16-tap DFE). This example demonstrates that jitter at the TX or RX has a significant impact on maximum achievable data rates. In this case, the maximum data rate is as sensitive to high-frequency TX jitter as it is to equalization choice or RX sampling jitter. Practically, RX jitter also impacts I/O performance since long-term RX sampling jitter magnitudes usually exceed that of the TX high-frequency jitter. Therefore, it is essential that a proportional amount of design effort as well as power budget is focused on methods to reduce both TX and RX jitter magnitude.

Over the last few decades, process technology scaling has benefitted high-speed data link interfaces by providing increased transistor bandwidth, greater density and enhanced functionality. However, microprocessor interface design is facing significant challenges, partly due to the need for integration on process technology optimized for digital functionality. Since a substantial portion of microprocessor silicon area is dominated by digital functionality and is usually optimized for cost, there is a tendency to limit the process feature set used for analog and mixed-signal functionality.

In many cases, modern data link interfaces have to be designed with poor quality resistors, capacitors, and inductors while the transistors have suboptimum analog characteristics in terms of output impedance or matching. (3)

In summary, historical scaling of transistor density has benefitted I/O bandwidth by enabling wider interfaces, more advanced equalization and higher data rates; however, increased integration and bandwidth demands make it ever more challenging to optimize clock quality. The intent of this section is to introduce the fundamentals required for high-speed data link clocking design and analysis.

The most commonly implemented clock architectures required for synthesis, distribution and recovery of I/O clocks are forwarded clock, embedded clock, and their corresponding variants. Figure 2-6 gives the clock half cycle and data unit interval definitions.




Figure 2-6 Unifying Clock and Data terminologies

### 2.5. PDK – TSMC 65NM

This design uses the TSMC 65NM CMOS Mixed Signal RF SALICIDE Low-K IMD 1P6M-1P9M PDK (CRN65LP). The supply voltages available in the PDK are 1.2V / 2.5V / 2.5V under-drive 1.8V / 2.5V over-drive 3.3V. For our design, we use the CORE 1.2V power supply, with $V_{SS}$ at 0V (ground).

#### 2.5.1. Resistors

There are a variety of resistors in the PDK. The common ones are Diffusion, Poly and Well Resistors. These are shown in Table 2-1, Table 2-2 and Table 2-3. Diffusion resistors are made with n- or p- diffusions in the substrate. Poly resistors are made with the poly layer. Poly resistors tend to have much higher sheet resistivity than diffusion resistors. Also, addition of salicide reduces the sheet resistance of materials.

Diffusion Resistors Available:

| MODEL  | DESCRIPTION                            |  |
|--------|----------------------------------------|--|
| rnod   | N+ diffusion resistor with salicide    |  |
| rnodwo | N+ diffusion resistor without salicide |  |
| rpod   | P+ diffusion resistor with salicide    |  |
| rpodwo | P+ diffusion resistor without salicide |  |

#### Table 2-1 Diffusion Resistors Available in PDK

Poly Resistors Available:

### Table 2-2 Poly resistors Available in the PDK

| MODEL    | DESCRIPTION                       |  |
|----------|-----------------------------------|--|
| rnpoly   | N+ poly resistor with salicide    |  |
| rnpolywo | N+ poly resistor without salicide |  |
| rppoly   | P+ poly resistor with salicide    |  |
| rppolywo | P+ poly resistor without salicide |  |

N-Well Resistors Available:

Table 2-3 N-Well Resistors Available with PDK

| MODEL  | DESCRIPTION               |  |
|--------|---------------------------|--|
| rnwod  | N-Well resistor under OD  |  |
| rnwsti | N-Well resistor under STI |  |

For the relatively low resistances used at high speed operation, we choose *rppolywo* as the resistor for the CML drivers and *rppoly* as the resistor for matching.

## 2.5.2. **MOSFETs**

There are a variety of MOSFETS available with the PDK. These are shown in Table 2-4.

| MODEL                              | DESCRIPTION                           |
|------------------------------------|---------------------------------------|
| nch, pch                           | CORE standard $V_T$ devices           |
| nch_dnw, pch_dnw                   | CORE standard $V_T$ in DNW transistor |
| nch_lvt, nch_hvt, pch_lvt, pch_hvt | Low $V_T$ and High $V_T$ versions     |
| nch_25, pch_25, nch_33, pch_33     | 2.5V and 3.3V nominal $V_T$ versions  |
| nch_25od33, pch_25od33             | 2.5V over drive 3.3V versions         |
| nch_25ud18, pch_25ud18             | 2.5V under drive 1.8V versions        |

Table 2-4 MOSFETs available with the PDK

We use the nch – Core standard  $V_T$  device for all our schematics.

### 2.6. PACKAGE PARASITICS



Figure 2-7 The Cross-section view of QFN package

The quad flat no-lead (QFN) packages are not only chip scale packages (CSPs) but also plastic encapsulated lead-frame packages. Figure 2-7 shows the cross-sectional view of a QFN package. The leads of the QFN package are on the bottom side of the package. The bottom-lead package structure can provide good electrical interconnection to the printed circuit board (PCB). In addition, the chip-scale QFN packages can have much smaller size than conventional packages such as thin quad flat packages (TQFPs) and thin shrink small outline packages (TSSOPs). Both the compact package size and encapsulated lead frames lead to reduction in the parasitic characteristics of the packages. The advantages of the QFN packages include miniaturized size and good electrical performance as well as excellent thermal characteristics (4). The QFN package 3D view is shown in Figure 2-8 (a).



Figure 2-8 (a) Three-dimensional structures of QFN-48 package and (b) Equivalent circuit model for analytical modeling of QFN packages

The QFN package parasitic values are estimated in (4) based on EM characterizations. We use the model shown in Figure 2-8 (b) to represent the QFN port of our laser driver and use the values provided in Table 2-5 for simulations.

| PARAMETER | VALUE    |
|-----------|----------|
| $C_1$     | 61 fF    |
| $C_{12}$  | 62.74 fF |
| $C_2$     | 26.18 fF |
| $L_1$     | 1.216 nH |
| $L_2$     | 1.274 nH |
| $L_{12}$  | 80.32 pH |
| $R_1$     | 1.205 Ω  |
| $R_2$     | 1.241 Ω  |
| $R_g$     | 0.019 Ω  |

Table 2-5 QFN Package Parasitic values used in design

## **3. TRANSMITTER ARCHITECTURE**



Figure 3-1 Complete Architecture of Transmitter

The transmitter is made up of multiple parts. The first part is the 32:2 serializer, which collects 32 banks of data (in PAM-4 mode) or 16 banks of data (in NRZ mode) at 800 Mbps per bank and converts them into two streams of 12.8 Gbps. These are then pre-amplified using one Cherry-Hooper pre-driver for each stream and fed into the driver unit. The driver unit contains two delay blocks, one for each data stream. The delayed data is fed into the pre-emphasis drivers, while the two main data streams are fed into a PAM-4 main driver. The PAM-4 driver and the delayed pre-emphasis drivers drive current into the VCSEL, which is common-anode connected to the drivers. The complete architecture of the transmitter is shown in Figure 3-1. In the subsequent chapters, the design of each component in the architecture is discussed in detail.
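As a rough illustration of the rates quoted above, and of how two NRZ streams can be combined into one PAM-4 symbol stream, the sketch below assumes a binary-weighted mapping with one stream as the MSB and the other as the LSB. The mapping and the name `pam4_symbols` are assumptions made for illustration; the encoding actually used by the driver is described in later chapters.

```python
BANK_RATE_MBPS = 800     # per-bank input rate, from the text
BANKS_PER_STREAM = 16    # 16 banks feed each serialized stream

# Each serialized stream: 16 x 800 Mbps = 12800 Mbps (12.8 Gbps).
stream_rate_mbps = BANKS_PER_STREAM * BANK_RATE_MBPS
# PAM-4 carries two bits per symbol, so two streams give 25600 Mbps.
pam4_rate_mbps = 2 * stream_rate_mbps


def pam4_symbols(msb_bits, lsb_bits):
    """Combine two bit streams into 4-level symbols (assumed binary weighting)."""
    return [2 * m + l for m, l in zip(msb_bits, lsb_bits)]


symbols = pam4_symbols([0, 0, 1, 1], [0, 1, 0, 1])
print(stream_rate_mbps, pam4_rate_mbps, symbols)
```

In NRZ mode only one of the two streams carries data, so the line rate stays at the single-stream value.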

## 4. SERIALIZER

#### 4.1. Introduction

Data is collected at the transmitter. Generally, there are continuous data streams from multiple sources, each at a rate of a few hundred Mb/s. When transmitting data over a high-speed link, there is a need for time-multiplexing the data from various sources into a single stream of high speed data. This time-multiplexing is called serializing. To achieve this on silicon, we need proper circuitry for a high-speed serializer. At the VCSEL end of the transmitter, we are trying to achieve a data rate of 12.8 Gbps NRZ (25.6 Gbps PAM4).

A conventional MUX is a circuit that passes one of several parallel inputs to its output, as chosen by a select input. A simple 2X1 MUX is shown in Figure 4-1.



Figure 4-1 Simple 2X1 MUX

When the SEL input to the MUX is high, it outputs data in CH1. When SEL input to the MUX is low, it outputs data in CH0. A serializer is a MUX which works based on a clock. The clock is a continuous stream of alternating highs and lows. When you connect the clock to the SEL input to a MUX, the output of the MUX is alternating data between CH0 and CH1. If the data in CH0 and CH1 run at 1 bit per clock cycle, we can have the output to be a serialized (alternated) collection of the entire data of CH0 and CH1. The speed of the combined data at the output of the serializer would be twice the speed of each individual channel's (CH0 and CH1) data.
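The behavior described above can be sketched with a small behavioral model. This is only an illustration: the function names `mux2` and `serialize` are hypothetical, and the clock is assumed to be low during the first half of each input bit period and high during the second half.

```python
def mux2(sel, ch0, ch1):
    """2X1 MUX: pass CH1 when SEL is high, CH0 when SEL is low."""
    return ch1 if sel else ch0


def serialize(ch0_bits, ch1_bits):
    """Drive SEL with a clock that completes one cycle per input bit."""
    out = []
    for b0, b1 in zip(ch0_bits, ch1_bits):
        for clk in (0, 1):  # assumed order: low half-cycle, then high half-cycle
            out.append(mux2(clk, b0, b1))
    return out


ch0 = [1, 0, 1, 1]
ch1 = [0, 0, 1, 0]
serial = serialize(ch0, ch1)
print(serial)
```

The output list is twice as long as each input list, which corresponds to the doubling of the data rate at the serializer output.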

A typical MUX/serializer is constructed out of CMOS gates as shown in the picture. Unfortunately, with CMOS gates, there is a limitation on the speed that can be achieved. The first few stages of the multiplexer can be built from CMOS gates, but the later stages need to be built from higher-speed alternatives.

The main reason for speed limitation is the inability to charge the capacitor at the end of each gate's output. If the input data changes faster than the time required for the output to show up on the capacitor, a phenomenon called Intersymbol Interference (ISI) occurs. ISI is a form of distortion of a signal in which one symbol interferes with subsequent symbols. This is an unwanted phenomenon as the previous symbols have similar effect as noise, thus making the communication less reliable. The spreading of the pulse beyond its allotted time interval causes it to interfere with neighboring pulses.

The simple charge/discharge time of the capacitor is given by (4-1):

$$\Delta t = \frac{C\Delta V}{I}.\tag{4-1}$$

Here, *I* is the current which flows into the capacitor, *C* is the capacitance at the output of each gate and  $\Delta V$  is the voltage change on the capacitor. In CMOS logic, the voltage runs from rail to rail (0 to V<sub>DD</sub>). The current through the CMOS transistors is directly proportional to the size of the transistors. Thus, to reduce the charging time by increasing the current, we need to increase the size of the transistors, which in turn increases the capacitance presented by the gates of the transistors. This causes C to also increase, thereby limiting the minimum  $\Delta t$ . This causes the speed limitation in CMOS topologies.
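The scaling argument can be made concrete with a short numerical sketch of equation (4-1). All device numbers below (fixed wiring capacitance, gate capacitance per micron and drive current per micron) are assumed round values chosen for illustration, not PDK data.

```python
def charge_time(c_farads, delta_v, i_amps):
    """Equation (4-1): time to change a capacitor voltage by delta_v."""
    return c_farads * delta_v / i_amps


VDD = 1.2           # rail-to-rail swing in volts (CORE supply)
C_FIXED = 10e-15    # assumed wiring capacitance, farads
C_PER_UM = 1e-15    # assumed gate capacitance per um of width, farads
I_PER_UM = 0.5e-3   # assumed drive current per um of width, amperes


def cmos_delay(width_um):
    """Delay when both the drive current and the gate load scale with width."""
    c_total = C_FIXED + C_PER_UM * width_um
    return charge_time(c_total, VDD, I_PER_UM * width_um)


# As the width grows, the delay approaches a floor set by the ratio
# of gate capacitance to drive current, not zero.
delay_floor = C_PER_UM * VDD / I_PER_UM

for w in (1, 4, 16, 64):
    print(f"W = {w:2d} um -> dt = {cmos_delay(w) * 1e12:.2f} ps")
print(f"floor      -> dt = {delay_floor * 1e12:.2f} ps")
```

Under these assumptions, widening the transistors helps at first but the delay saturates, which is the limitation the text describes.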

#### 4.2. Current Mode Logic

To circumvent this problem, we need to use a different kind of topology called Current Mode Logic (CML) or Source Coupled Logic (SCL).

A simple CML or SCL inverter is made up of an NMOS source-coupled pair ( $M_P$  and  $M_N$ ) having transistors working in the saturation or cut off region, that approximate well the behavior of a voltage-controlled current switch. The biasing current ( $I_{TAIL}$ ) is steered to one of the two output branches and converted into a differential output voltage by two identical loads ( $R_P$  and  $R_N$ ) connected between the supply voltage and the drain of each NMOS transistor. The CML structure is a differential input, differential output structure, and inputs both the data (A) and its complement ( $\overline{A}$ ). The output is also differential (O and  $\overline{O}$ ). A simple CML buffer is shown in Figure 4-2.



Figure 4-2 Simple CML buffer architecture

The main advantage of using such a structure is that the current flowing through the output load capacitance (input capacitance of the next stage) can be controlled using an external current source ( $I_{TAIL}$ ) and the expression (4-1) changes to the expression (4-2):

$$\Delta t = \frac{C\Delta V}{I_{TAIL}}. \tag{4-2}$$

It can be clearly seen that  $\Delta t$  can be reduced further compared to a CMOS alternative by using a higher tail current. Additionally, this structure provides better immunity towards noise, since noise created by variations in the power supply or coupled from external sources tends to be a common-mode signal that affects both differential signals similarly, leaving the difference value unchanged.

A half-rate multiplexing scheme is adopted in the serializer design to save power. A half-rate multiplexer operates on both clock edges, so its output data rate (in bits per second) is twice the clock frequency (in hertz). With half-rate multiplexing, though, any duty cycle mismatch has a direct impact on the deterministic jitter at the serializer output. Hence, to suppress the offset and its associated duty cycle distortion, besides careful layout, AC-coupling and a duty cycle corrector (DCC) based on a programmable 5-bit IDAC need to be employed in the clock buffer circuit. The IDAC adjusts the gate voltage (and thus V<sub>GS</sub>-V<sub>TH</sub>) of the PMOS and NMOS transistors of the clock buffer to effectively correct the duty cycle error.

#### 4.3. Serializer Architecture

Figure 4-3 shows the complete top level diagram of the serializer. The entire serializer circuit is made up of two 16:1 parallel stages. Each stage takes inputs from 16 parallel input streams providing data at the relatively low speed of 800 Mbps. The serializing circuit has a 3-stage binary-tree structure, namely 16:4, 4:2 and 2:1 multiplexers in a chain. The 3 multiplexer stages are driven by 0.8 and 1.6 GHz clocks for the first stage, 3.2 GHz clock for the second stage and 6.4 GHz clocks for the final stage, respectively.
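The clock frequencies listed above follow directly from half-rate multiplexing, where each 2:1 level is clocked at a frequency (in MHz) equal to its input data rate (in Mbps). The sketch below walks one 16:1 tree level by level; the function name `half_rate_tree` is hypothetical.

```python
def half_rate_tree(input_rate_mbps, n_inputs):
    """List (level, clock in MHz, output rate in Mbps) for a binary mux tree."""
    levels = []
    rate = input_rate_mbps
    lanes = n_inputs
    while lanes > 1:
        clock_mhz = rate      # half-rate: clock frequency equals input bit rate
        rate *= 2             # both clock edges are used, so the rate doubles
        lanes //= 2
        levels.append((f"{lanes * 2}:{lanes}", clock_mhz, rate))
    return levels


levels = half_rate_tree(800, 16)
for name, clock_mhz, out_rate in levels:
    print(f"{name:>5}  clock = {clock_mhz / 1000:.1f} GHz  output = {out_rate / 1000:.1f} Gbps")
```

The 16:8 and 8:4 levels together form the 16:4 first stage, which is why that stage needs both the 0.8 GHz and the 1.6 GHz clocks.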

Transmission gate switches control the selection of one of the two 16:4 MUXes to the following stages, in turn, providing NRZ/PAM 4 selection at the output stages.

Two logic styles are employed in the serializer design. Static CMOS gates are used in the 16:8 and 8:4 multiplexers and the shift registers because of their high density and low power consumption. Current-mode-logic (CML) gates are used in the 4:2 and 2:1 multiplexers due to their higher speed operation and lower supply sensitivity characteristics.

The output of the last MUX stage drives a large capacitive load of the driver input (the driver passes large current into the VCSEL – of the order of several milliamperes). This large capacitance can cause delays at the driver.

To circumvent this problem, the last MUX stage outputs a complementary data signal to a CML pre-amplifier, which provides enough drivability to drive the PAM4/NRZ CML laser driver. The VCSEL is connected in the common anode configuration, with an external 3V power supply connected to the VCSEL anode.




Figure 4-3 Complete Architecture of Serializer

### 4.4. 16:4 and 4:2 CMOS MUXES

### 4.4.1. Architecture

The half-rate multiplexed serializer in CMOS is shown in Figure 4-4.



Figure 4-4 Architecture of CMOS 2X1 half-rate multiplexed MUX

In the above structure, half-rate multiplexing requires the clock frequency (in Hz) to equal the input data rate (in bits per second). Two dynamic (inverter-based) CMOS D-Flip Flops are used to align the input data with the clock. After this, we use the general MUX structure discussed above and connect the CLK to the SEL input (buffered to one path, inverted to the other). This causes the data to be sampled at both the rising edge and the falling edge of the clock. Finally, we feed the two outputs of the NAND gates to the XOR gate, which serializes the parallel inputs. The output data rate of the serializer is 2 times the data rate of the parallel inputs. Each CMOS D-Flip Flop is made using switches and inverters and is shown in Figure 4-5.



Figure 4-5 D-Flip flop implementation using inverters and switches.

The switches are realized using tri-state inverters (with an Enable input). The NAND gates and XOR gates are in standard CMOS logic to provide low power computation at the input stages of the serializer.

The XOR gate is constructed using transmission gates and realized as in Figure 4-6.



Figure 4-6 The architecture of XOR gate using transistors

The NAND gates, inverter and the buffer are constructed in standard CMOS process.

## 4.4.2. Schematic

The CMOS 2X1 MUX is shown in Figure 4-7. The D-Flip Flop is shown in Figure 4-8.








Each DFF and MUX is made up of NAND gates, Inverters and XOR gates. They are shown in the schematics below:



Figure 4-9 Cadence Schematic of a CMOS Inverter



Figure 4-10 Cadence schematic of a Tri-state inverter



Figure 4-11 Cadence Schematic of a NAND2 gate



Figure 4-12 Cadence Schematic of CMOS XOR gate

## 4.4.3. Simulations

**Transient Response** 

The CMOS 32:4 Serializer can only generate speeds up to 3.2Gbps. The simulations are shown in Figure 4-13.



Figure 4-13 The output waveforms of the various stages in the CMOS serializer

#### 4.5. CML MUX

#### 4.5.1. Architecture

A 2:1 CML MUX is shown in Figure 4-14.



Figure 4-14 2:1 CML MUX Architecture

This structure allows A when CLK is high (1.2V), and B when CLK is low (600mV). The bias voltage controls the output voltage swing. This is a half rate multiplexing structure to serialize A and B. For correct serialization of the data in A and B, two critical points need to be taken into consideration. Firstly, both A and B must have the same data rate as the Clock Frequency. Also, A must be aligned with CLK and B with CLK'. The latter is achieved by using master and slave CML latches as shown in the structure below:

In the above structure, the CML inputs A and B are first aligned with the positive edge of CLK, and B is further shifted by half a clock period. This causes  $A_{out}$  to be aligned with the positive edge of the CLK and  $B_{out}$  to be aligned with the negative edge of CLK. The signals  $A_{out}$  and  $B_{out}$  are fed into the CML MUX along with the CLK.

## 4.6. CML Latch

The master and slave latches that align the data before it passes into the MUX are designed using CML topologies for higher speed of operation. Each CML latch has the following structure:

A bias current controls the voltage swing at the output. The resistance needs to be small enough to provide small rise and fall times to the inputs of the next stage. When the CLK is HIGH (1.2V) the device becomes transparent to the input A, when CLK is LOW (600mV), the device outputs the latest value of A before the CLK went low.
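The transparent/hold behavior of the latch can be sketched at the logic level as follows. This is a behavioral illustration only: it ignores the analog CML levels (1.2V and 600mV are reduced to 1 and 0), and the function name `latch` is hypothetical.

```python
def latch(clk_samples, d_samples, q0=0):
    """Level-sensitive latch: follow D while CLK is high, hold while CLK is low."""
    q = q0
    out = []
    for clk, d in zip(clk_samples, d_samples):
        if clk:
            q = d          # transparent phase
        out.append(q)      # hold phase keeps the last value seen
    return out


clk = [1, 1, 0, 0, 1, 1, 0, 0]
d   = [1, 0, 1, 1, 0, 1, 0, 0]
q = latch(clk, d)
print(q)
```

Cascading two such latches on opposite clock phases gives the master and slave arrangement used to align the data ahead of the MUX.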

#### 4.6.1. Cadence Schematics

The CML Latch is modelled as



Figure 4-15 Cadence Schematic of a CML Latch

The CML MUX can be modelled as



Figure 4-16 Cadence Schematic of a CML MUX



Figure 4-17 Cadence Schematic of the 2 to 1 CML serializer

## **5. CHERRY-HOOPER PRE-DRIVER**

The output of the serializer has a CML signal which is not fast enough to drive the driver directly because of the large input capacitance of the driver stage. To help drive this higher capacitance, a pre-driver (pre-amplifier) needs to be connected between the output of the serializer and the Laser driver.

The simplest way of building a CML pre-driver would be to cascade CML buffers with increasing bandwidth to shift the dominant pole farther away and accommodate high-frequency signals. This is also called the CML super-buffer. Although doing so would achieve the drivability required to drive the laser driver, it would be very power hungry.

To drive the larger capacitance at the output at high speed, each successive CML stage needs a smaller load resistance than its predecessor. To maintain enough output voltage swing at each node of the super-buffer, each successive buffer therefore needs a higher current. This higher current not only consumes a large amount of power, but also requires the devices to be large in dimensions.

## 5.1. Cherry Hooper Architecture

The Cherry-Hooper topology is widely used to provide broadband characteristics with high gain. Figure 5-1 shows a schematic of the circuit topology, which is used as a small-signal amplifier.



Figure 5-1 Cherry-Hooper Architecture

The first differential pair acts as a transconductance stage that converts the input voltage signal into a current. The current-mode signal then is amplified and converted back into voltage by a transimpedance stage. The shunt feedback resistor lowers the input impedance of the transimpedance stage, and provides excellent high-frequency performances. The input impedance of the transimpedance stage becomes (5-1):

$$Z_{in} = \frac{V_{in}}{I_{in}} = \frac{R_1 + R_2}{1 + g_{m1}R_1}. \tag{5-1}$$

The low-frequency gain is calculated as

$$A_{v} = \frac{V_{out}}{V_{in}} = \frac{g_{m3}R_{1}(g_{m1}R_{2}-1)}{1+g_{m1}R_{1}} = \frac{g_{m3}g_{m1}R_{1}R_{2}}{1+g_{m1}R_{1}} - \frac{g_{m3}R_{1}}{1+g_{m1}R_{1}}. \tag{5-2}$$

If  $g_{m1}R_1 \gg 1$  and  $R_2 \gg (g_{m1})^{-1}$ , then the gain in (5-2) becomes

$$A_{v} \approx g_{m3}R_2 - \frac{g_{m3}}{g_{m1}} \approx g_{m3}R_2. \tag{5-3}$$

The gain is equal to that of a simple common source (CS) stage having a load resistance of  $R_2$ . The pole frequencies of this circuit can be considered approximately as  $\omega_{p1} \approx g_{m3}/C_3$  and  $\omega_{p2} \approx g_{m1}/C_2$ , much higher than those of a CS stage circuit,  $(R_2C)^{-1}$ . Thus, this topology provides higher-frequency poles for the same voltage gain as a regular differential amplifier, thereby providing increased bandwidth.
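To get a feel for how good the approximation in (5-3) is, the exact gain of (5-2) can be compared with  $g_{m3}R_2$  numerically. The transconductance and resistor values below are assumed round numbers chosen only so that the conditions  $g_{m1}R_1 \gg 1$  and  $R_2 \gg (g_{m1})^{-1}$  hold; they are not the values used in this design.

```python
def ch_gain_exact(gm1, gm3, r1, r2):
    """Low-frequency gain from equation (5-2)."""
    return gm3 * r1 * (gm1 * r2 - 1.0) / (1.0 + gm1 * r1)


def ch_gain_approx(gm3, r2):
    """Approximate gain from equation (5-3)."""
    return gm3 * r2


# Assumed illustrative values (not taken from the thesis).
GM1 = 20e-3   # siemens
GM3 = 20e-3   # siemens
R1 = 2e3      # ohms
R2 = 1e3      # ohms

exact = ch_gain_exact(GM1, GM3, R1, R2)
approx = ch_gain_approx(GM3, R2)
rel_error = (approx - exact) / exact

print(f"exact = {exact:.2f}, approx = {approx:.2f}, error = {rel_error * 100:.1f} %")
```

With these numbers the approximation overestimates the gain by a few percent, and the error shrinks as  $g_{m1}R_1$  and  $g_{m1}R_2$  grow.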

## 5.2. Modified Cherry Hooper as a CML Pre-driver

As we are using current mode logic in a high speed digital link, we would use a modified Cherry Hooper CML Pre-driver which is shown below:



Figure 5-2 Modified Cherry Hooper Architecture for Pre-driver

 $I_{BIAS2}$  and  $I_{BIAS1}$  are the two current sources through two regular CML buffers. The two buffers are connected in the Cherry Hooper fashion shown above. The differential input to this Cherry Hooper buffer is provided to the gates of  $M_3$  and  $M_4$ . At the drain of  $M_3$  and  $M_4$ , an inverted signal is formed with a higher swing which in turn drives the gates of  $M_1$  and  $M_2$  respectively.

The next stage sees a smaller input resistance; hence, this buffer can be used to drive large capacitive loads like the laser driver. Also, the input capacitance of this buffer is low, allowing the previous serializer stages to easily interface with the driver. This approach drastically reduces the number of successive amplification stages, while also allowing smaller devices due to the available headroom. This yields a remarkable improvement in the power consumption and the area requirements for the pre-driver in comparison to a regular cascaded CML super-buffer.

### 5.3. Simulation and Results

The Cadence schematic for the pre-driver is shown in Figure 5-3. Figure 5-4 shows the outputs of the different CML stages: the input from the CMOS stages (red), the outputs from the latches (violet and orange), the output from the MUX (green) and the output from the Cherry-Hooper buffer (blue).









# 6. VCSEL

### 6.1. Introduction

For optical communications, Vertical cavity surface emitting lasers (or VCSELs) are widely in use. VCSELs are semiconductor laser diodes which have their emission beam perpendicular to the top surface. VCSELs are different from edge emitting semiconductor lasers in that the latter have emissions from surfaces formed by cleaving the individual chip out of a wafer. Figure 6-1 shows the physical cross section of a VCSEL used in optical communications.

As the VCSELs emit beams which are perpendicular to the active region, they hold the advantage of allowing arrayed processing, with many VCSELs placed side by side on a single wafer. Thus, VCSELs find their way easily in the field of optical communications for multi-channel communications.



Figure 6-1 The physical cross section of the semiconductor VCSEL

The VCSELs are typically composed of many different layers:

1. The top layer is the electrical contact, which is used to inject current into the VCSEL.
2. The next layer is a high reflectivity top mirror (≈99% reflectivity). This layer is made of a DBR (Distributed Bragg Reflector).
3. The next layer is an oxide layer which constructs a light emitting window to make sure the light beam is circularly optimized.
4. The center layer is the laser cavity. It is usually a multiple quantum-well layer. The multiple quantum wells act as an active gain region where lasing happens.
5. The next layer is also an oxide layer to shape the light beam into a circular beam.
6. Finally, the bottom layer is also a DBR layer with higher reflectivity than the top DBR layer (≈ 99.9% reflectivity).

The quantum wells generate photons which get reflected between the top and bottom DBR mirrors and the light emitted amplifies due to stimulated emission. Since the top DBR mirror has a lower reflectivity than the bottom mirror (which has almost 100% reflectivity), the amplified light is emitted out of the top of the laser.

The active layer in the VCSEL is a quantum-well structure, typically made of a thin strip of low-bandgap material sandwiched between thick layers of higher-bandgap material. This forms a quantum well which confines the carriers within the thin strip. We use the 850nm InGaAs VCSEL which has been explained and analyzed in (5).

#### 6.2. Static (DC) Response of VCSEL

For communications using a VCSEL, the data is modulated in the power emitted by the VCSEL. This can be controlled by the current injected into the laser. There are two critical DC characteristic curves which are taken into consideration when modulating a VCSEL. They are the V-I and P-I curves.

The V-I characteristics give us the differential resistance of the VCSEL which helps us determine the voltage drop expected for different modulation currents. This is an important parameter as the driver bias point needs to be designed keeping in mind that the transistors don't go out of saturation in case of a large voltage drop across the VCSEL.

The P-I characteristics are shown in Figure 6-2. This is the most important curve for modulating the VCSEL. From the curve, the approximate DC response of the VCSEL output power is given by

$$P_{out} = \eta \times (I_{VCSEL} - I_{TH}) . \tag{6-1}$$

Here,  $\eta$  is the slope efficiency of the VCSEL and  $I_{TH}$  is the threshold value of the injected current. The VCSEL is turned ON when the injected current is more than the threshold current.

The  $I_{TH}$  of the VCSEL is 0.6mA and its  $\eta$  is 0.78 mW/mA. The VCSEL can reach an output power of 9 mW at room temperature. For low power operation, we chose the maximum output power at 9mW. Now, using these three values, we can plan our PAM-4 structure. The bias current of the VCSEL needs to be set higher than 0.6mA  $(I_{TH})$ , so we chose the bias current at about 1.2 mA. This is done to prevent turn-on delay of the VCSEL.
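The three values above can be plugged into equation (6-1) to estimate the drive currents involved. The sketch below is illustrative only: it assumes four PAM-4 current levels spaced equally between the 1.2 mA bias and the current needed for 9 mW, which is an assumption and not necessarily the level plan used in the driver design.

```python
ETA_MW_PER_MA = 0.78   # slope efficiency, from the text
I_TH_MA = 0.6          # threshold current, from the text
I_BIAS_MA = 1.2        # chosen bias current, from the text
P_MAX_MW = 9.0         # chosen maximum output power, from the text


def p_out_mw(i_ma):
    """Equation (6-1), clamped to zero below threshold."""
    return max(0.0, ETA_MW_PER_MA * (i_ma - I_TH_MA))


def current_for_power_ma(p_mw):
    """Invert equation (6-1) for the current that gives a target power."""
    return p_mw / ETA_MW_PER_MA + I_TH_MA


i_max_ma = current_for_power_ma(P_MAX_MW)

# Assumed level plan: four equally spaced currents from bias to maximum.
step_ma = (i_max_ma - I_BIAS_MA) / 3
pam4_currents_ma = [I_BIAS_MA + k * step_ma for k in range(4)]
pam4_powers_mw = [p_out_mw(i) for i in pam4_currents_ma]

for i, p in zip(pam4_currents_ma, pam4_powers_mw):
    print(f"I = {i:6.3f} mA -> P = {p:5.3f} mW")
```

Because (6-1) is linear above threshold, equally spaced currents give equally spaced optical power levels in this static approximation.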



Figure 6-2 Static Characteristics of VCSEL P v/s I

## 6.3. Dynamic (Frequency) Response of VCSEL

Just like any laser, the VCSEL optical response depends on the electron density N and the photon density  $N_p$  in the laser cavity volume V. Light is the result of the interactions between these densities and can be estimated by solving the following coupled differential equations, also called the rate equations

$$\frac{dN}{dt} = \frac{I_{VCSEL}}{qV} - \frac{N}{\tau_{sp}} - GNN_p \tag{6-2}$$

and

$$\frac{dN_p}{dt} = GNN_p + \beta_{sp} \frac{N}{\tau_{sp}} - \frac{N_p}{\tau_p}. \tag{6-3}$$

In these equations, the term  $\frac{N}{\tau_{sp}}$  represents the decay of electrons due to spontaneous emission, with  $\beta_{sp}$  the spontaneous emission coefficient and  $\tau_{sp}$  the lifetime of the electrons. This lifetime also includes losses due to leakage and nonradiative recombination. The stimulated emission is represented by the term  $GNN_p$  where G is the stimulated emission coefficient. We can see intuitively that the stimulated emission is proportional to both the electron and photon density in the cavity.  $I_{VCSEL}$  is the current which is injected into the VCSEL and  $\tau_p$  is the lifetime of photons.

The output optical power  $P_o$  of the laser is given by

$$P_o = N_p h \nu V v_g \,. \tag{6-4}$$

Here, *h* is Planck's constant,  $\nu$  is the frequency of emitted light and  $v_g$  is the light group velocity.

Combining the rate equations above and taking the Laplace transform, including equation (6-4), we get the following second order transfer function for the VCSEL optical power  $(P_o)$  with respect to the injected current  $(I_{VCSEL})$ 

$$\frac{P_o(s)}{I_{VCSEL}(s)} = \frac{h\nu v_g \alpha_m}{q} \times \frac{GN_p}{s^2 + s\left(GN_p + \frac{1}{\tau_{sp}}\right) + \frac{GN_p}{\tau_p}}. \tag{6-5}$$

Here  $\alpha_m$  is the VCSEL mirror loss coefficient.

Equation (6-5) is in the form of a second-order low-pass filter as given by

$$H(f) = const \times \frac{f_r^2}{f_r^2 + j\left(\frac{f}{2\pi}\right)\gamma - f^2}. \tag{6-6}$$

Comparing equations (6-5) and (6-6), the relaxation oscillation (or resonance) frequency, which is related to the bandwidth of the filter, is given by

$$f_r = \frac{1}{2\pi} \sqrt{\frac{GN_p}{\tau_p}} . \tag{6-7}$$

The photon density  $(N_p)$  is directly proportional to the current in the VCSEL above the threshold current  $(I_{TH})$ , hence the resonance frequency is proportional (from equation (6-7)) to the root of the current above threshold. It can be given by

$$f_r = D\sqrt{I_{VCSEL} - I_{TH}} \,. \tag{6-8}$$

Here, D is the proportionality constant. Also, the damping factor  $\gamma$  is given by

$$\gamma = K f_r^2 + \gamma_0 \,. \tag{6-9}$$

We can infer from equations (6-8) and (6-9) that if we keep increasing  $I_{VCSEL}$ , we cannot increase the bandwidth indefinitely as the damping factor also increases. Since the resonant frequency (which is a direct indication of the bandwidth) of the VCSEL is directly related to the bias current  $I_{VCSEL}$ , the change in the VCSEL current for data modulation causes the VCSEL to be a non-linear system. Figure 6-3 below shows the frequency response of the laser with its bias current.



Figure 6-3 Dynamic Characteristics of VCSEL – Frequency Response
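The dependence of the response on bias current can be explored numerically from equations (6-6), (6-8) and (6-9). The sketch below uses the D-factor, K-factor and damping offset listed in Table 6-1, works in GHz and ns$^{-1}$, and sets the constant in (6-6) to 1; the function names are hypothetical and the currents in the loop are arbitrary sample points.

```python
import math

D_GHZ_PER_SQRT_MA = 7.6   # D-factor (Table 6-1)
K_NS = 0.25               # K-factor (Table 6-1)
GAMMA0_PER_NS = 37.0      # damping factor offset (Table 6-1)
I_TH_MA = 0.6             # threshold current (Table 6-1)


def resonance_ghz(i_ma):
    """Equation (6-8); valid only for i_ma above threshold."""
    return D_GHZ_PER_SQRT_MA * math.sqrt(i_ma - I_TH_MA)


def damping_per_ns(i_ma):
    """Equation (6-9)."""
    return K_NS * resonance_ghz(i_ma) ** 2 + GAMMA0_PER_NS


def response_mag(f_ghz, i_ma):
    """Magnitude of equation (6-6) with the constant set to 1."""
    fr = resonance_ghz(i_ma)
    gamma = damping_per_ns(i_ma)
    denom = complex(fr ** 2 - f_ghz ** 2, f_ghz * gamma / (2 * math.pi))
    return fr ** 2 / abs(denom)


for i_ma in (1.2, 2.6, 4.6, 8.6):
    print(f"I = {i_ma:4.1f} mA  f_r = {resonance_ghz(i_ma):5.2f} GHz  "
          f"gamma = {damping_per_ns(i_ma):6.2f} 1/ns  "
          f"|H(10 GHz)| = {response_mag(10.0, i_ma):.3f}")
```

The printout shows the resonance frequency rising with the square root of the current above threshold while the damping grows faster, which is the trade-off described in the text.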

As shown in (6), higher bias currents yield a higher-bandwidth response. Although this is favorable, higher bias currents consume more power. To allow lower bias currents and save power, pre-emphasis techniques using FIR filters are employed to compensate for the reduced bandwidth.

### 6.4. VCSEL Modelling

For accurate simulations, we need a correct VCSEL model. When building the model, it should represent the real VCSEL response as accurately as possible.

The simplest model is to use the small-signal approximation in which the modulation response is the same for both one and zero response (by fixing a bias current). This is not the best way to represent the VCSEL as the real device has a non-linear response where the bandwidth is directly related to the current flowing through the VCSEL. Hence, using such linearization to represent the VCSEL would make the results deviate from the expected output for large extinction ratios.

On the other hand, we can build the model of the VCSEL using actual rate equations, as described by equations (6-2) and (6-3). However, this would be extremely hard to simulate in a circuit simulator and such a complex model would cause solution convergence issues when used for transient simulations.

To strike a balance, we use the model proposed in (6) which touches the middle ground between the above two extremes. This model considers the effect of variation in bias current through the VCSEL and models the characteristics accordingly. For accuracy, the model separates the intrinsic optical dynamics and the extrinsic electrical parasitics of the VCSEL.

#### 6.4.1. Electrical Model



Figure 6-4 Electrical Model of 850nm VCSEL for simulations

The electrical model of the VCSEL includes all the electrical parasitics included with the device. The important parameters that we need to look at are the pad capacitance  $(C_{pad})$  and pad resistance  $(R_{pad})$ . Typically, the values for these parameters are 10 fF and 1  $\Omega$  respectively. Next, the parasitics of the device itself are taken into consideration. These are represented using series reflector (DBR) resistance  $(R_{DBR})$  which is typically about 50  $\Omega$ . We also use the junction capacitance  $(C_j \approx 110 fF)$  and the junction resistance  $(R_j \approx 160 \Omega)$ . The typical circuit diagram of the electrical sub-model of the VCSEL is shown in Figure 6-4.

As we can see, the parasitic capacitors provide a low-impedance path for current flow at higher frequencies, thereby reducing the actual current available for electrical to optical conversion. This current is represented by  $I_{R_j}$ .
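A first-order estimate of this effect can be obtained by looking only at the junction node. The sketch below assumes that the current arriving at the junction splits between  $C_j$  and  $R_j$  in parallel and ignores the pad parasitics and  $R_{DBR}$ ; this simplification is an assumption for illustration, and the full model of Figure 6-4 should be used for actual simulations.

```python
import math

R_J_OHM = 160.0    # junction resistance, from the text
C_J_F = 110e-15    # junction capacitance, from the text


def junction_current_fraction(f_hz):
    """|I_Rj / I_in| for a parallel R_j, C_j current divider (assumed topology)."""
    return 1.0 / math.sqrt(1.0 + (2 * math.pi * f_hz * R_J_OHM * C_J_F) ** 2)


# Frequency at which the current through R_j drops to 1/sqrt(2) of its DC value.
f_corner_hz = 1.0 / (2 * math.pi * R_J_OHM * C_J_F)

print(f"corner frequency = {f_corner_hz / 1e9:.2f} GHz")
for f_ghz in (1, 5, 10, 20):
    frac = junction_current_fraction(f_ghz * 1e9)
    print(f"f = {f_ghz:2d} GHz -> |I_Rj / I_in| = {frac:.3f}")
```

Under this simplification the corner frequency falls near 9 GHz, so the junction parasitics already matter at the data rates targeted here.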

#### 6.4.2. **Optical Model**



Figure 6-5 Optical Model of the VCSEL

We saw in Section 6.2 that the output power is proportional to the current above the threshold driven through the VCSEL. This is represented in the optical model by a dependent source with  $\eta$  (the slope efficiency) as the proportionality constant. Next, we saw in Section 6.3 that the VCSEL exhibits the optical response as a second order low pass filter with damping. We emulate this behavior using a second order LPF rendition employing an RLC circuit. Also, we make it dynamic and non-linear by taking into account the dependencies in equations (6-8) and (6-9).

Figure 6-5 shows the optical model. The output power is a non-linear low-pass version of the input power. The transfer function of this model is given by

$$\frac{P_{out}(f)}{\eta(I - I_{TH})(f)} = \frac{1}{1 - L_{VL}C_{VL}\left(2\pi f\right)^2 + j\left(2\pi f\right)R_{VL}C_{VL}}. \tag{6-10}$$

We can fix the values of  $L_{VL}$ ,  $C_{VL}$ , and  $R_{VL}$  by comparing Equation (6-10) with Equation (6-6). As the former has one variable more than the latter, we fix  $C_{VL}$  to be 100 *fF*. Now, we can estimate the values of  $L_{VL}$  and  $R_{VL}$  as

$$L_{VL} = \frac{1}{4\pi^2 C_{VL} D^2 (I - I_{TH})} \text{ and} \tag{6-11}$$

$$R_{VL} = (Kf_r^2 + \gamma_0) \times L_{VL} \,. \tag{6-12}$$

The values of the constants to estimate the values of  $L_{VL}$  and  $R_{VL}$  are shown in Table 6-1.

| PARAMETER                          | VALUE                                  |
|------------------------------------|----------------------------------------|
| Threshold Current $(I_{TH})$       | 0.6 mA                                 |
| Slope Efficiency $(\eta)$          | 0.78 mW/mA                             |
| D-Factor (D)                       | $7.6\ \mathrm{GHz}/\sqrt{\mathrm{mA}}$ |
| K-Factor (K)                       | 0.25 ns                                |
| Damping Factor Offset $(\gamma_0)$ | 37 ns$^{-1}$                           |

Table 6-1 Parameters used for describing VCSEL in optical model
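Equations (6-11) and (6-12) can be evaluated directly from the constants in Table 6-1. The sketch below works in SI units (Hz, s, H, F) with the current kept in mA to match the units of D; the function name `model_elements` is hypothetical and the 4.6 mA operating point is an arbitrary example.

```python
import math

C_VL_F = 100e-15            # fixed capacitor value chosen in the text
D_HZ_PER_SQRT_MA = 7.6e9    # D-factor (Table 6-1)
K_S = 0.25e-9               # K-factor (Table 6-1)
GAMMA0_PER_S = 37e9         # damping factor offset (Table 6-1)
I_TH_MA = 0.6               # threshold current (Table 6-1)


def model_elements(i_ma):
    """Return (L_VL in henries, R_VL in ohms) for a current above threshold."""
    fr_sq = D_HZ_PER_SQRT_MA ** 2 * (i_ma - I_TH_MA)      # f_r^2 from (6-8)
    l_vl = 1.0 / (4 * math.pi ** 2 * C_VL_F * fr_sq)      # equation (6-11)
    gamma = K_S * fr_sq + GAMMA0_PER_S                    # equation (6-9)
    r_vl = gamma * l_vl                                   # equation (6-12)
    return l_vl, r_vl


l_vl, r_vl = model_elements(4.6)
f_res_hz = 1.0 / (2 * math.pi * math.sqrt(l_vl * C_VL_F))

print(f"L_VL = {l_vl * 1e9:.3f} nH, R_VL = {r_vl:.1f} ohm, "
      f"LC resonance = {f_res_hz / 1e9:.2f} GHz")
```

The LC resonance recovered from the computed elements matches the  $f_r$  given by (6-8) at the same current, which is a quick consistency check on the parameterization.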

The complete dynamic model for the VCSEL is shown in Figure 6-6. This model takes the modulated current as the input and gives the optical power as the output. It considers both the electrical parasitics and the optical limitations of the VCSEL. It also takes into consideration the non-linear dependence of the resonant frequency and damping factor on the VCSEL current, through the parameterized  $L_{VL}$  and  $R_{VL}$  in the model.



Figure 6-6 Complete VCSEL model to be used in simulations

#### 6.5. Other Considerations

#### 6.5.1. Turn-On Delay

When a VCSEL is turned on by driving a current into it, the rising edge of the optical pulse trails the rising edge of the electrical pulse. This is called turn-on delay. Experiments have shown that this delay is higher if the current in the OFF state is less than the  $I_{TH}$  of the VCSEL, and reduces as the OFF-state current goes above the threshold current. As this effect appears only on the rising edge of the data stream, without affecting the falling edge, a high turn-on delay can lead to pulse distortion. Hence, to alleviate this effect, we need to bias the OFF state above the  $I_{TH}$  of the VCSEL.

### 6.5.2. Off-State Bounce in VCSEL

An important phenomenon which manifests itself on the falling edge of the VCSEL output is the off-state bounce. When designing a driver, sufficient care must be taken to mitigate this effect. This phenomenon is clearly explained in (7). VCSELs typically have a multi-transverse modal structure to prevent modal noise. In this structure, the VCSEL modes are spatially separated and surrounded by regions which are forward-biased, but not lasing. When a lasing region is turned off, a charge carrier gradient is produced which draws carriers from the surrounding material, briefly raising the region back above the carrier density required for lasing. This produces a small power "bounce" up to several picoseconds after the pulse falling edge. For relatively low speed applications, this is not a major problem, but it is significant in this project where data speeds are around 12.5 Gbps.

Figure 6-7 shows the optical response of a VCSEL without any kind of pre-emphasis. The response shows relaxation resonance at the rising edge and the off-state bounce at the falling edge. As these are caused by two different phenomena and produce different responses, they need to be dealt with using different (asymmetrical) pre-emphasis for each edge, rising and falling. Only when the two edges are treated differently (in contrast to simply using the linear FIR pre-emphasis technique) do we get a better eye opening and reduced jitter in the eye diagram.



Figure 6-7 Optical response of VCSEL

## 6.6. Cadence Modelling

### 6.6.1. Optical Low Pass Filter Design

The Verilog-A code for the optical low-pass filter response of the VCSEL is given by:

```
// VerilogA for LaserDriver, LPFoptical, veriloga
`include "constants.vams"
`include "disciplines.vams"

// Optical low-pass filter of the VCSEL: series L-R from a to c, C from c to b.
// V(a,b) is the input, eta*(I - Ith); V(c,b) is the filtered output power.
module LPFoptical(a,b,c);
    inout a,b,c;
    electrical a,b,c;
    electrical n; // internal node between the inductor and the resistor
    parameter real eta = 0.78; // eta = Slope Efficiency (0.78 mW/mA)
    parameter real Ith = 0.6m; // Ith = Threshold Current (0.6 mA); not used in this module
    parameter real Dsq = 57.76e21; // Dsq = D-Factor Squared (D = 7.6 GHz/(mA^0.5))
    parameter real K = 0.25n; // K = K Factor (0.25 ns)
    parameter real gamma0 = 37e9; // gamma0 = Damping Factor Offset (37 /ns)
    parameter real CVL = 100f; // CVL = Capacitance of Optical LPF (Arbitrary Value Fixed to 100 fF)
    analog begin
        // Capacitor C_VL
        I(c,b) <+ CVL*ddt(V(c,b));
        // Inductor L_VL, Equation (6-11) with (I - Ith) = V(a,b)/eta
        V(a,n) <+ eta/(4*`M_PI*`M_PI*CVL*Dsq*V(a,b))*ddt(I(a,n));
        // Resistor R_VL, Equation (6-12)
        V(n,c) <+ I(n,c)*((K*Dsq*V(a,b) + gamma0*eta)/(4*`M_PI*`M_PI*Dsq*CVL*V(a,b)));
    end
endmodule
```

The optical LPF symbol is constructed using this Verilog-A code (shown in Figure 6-8).



Figure 6-8 Cadence Symbol for Optical LPF emulation in the VCSEL
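The small-signal behaviour of the optical LPF can be cross-checked outside the simulator. The sketch below freezes $L_{VL}$ and $R_{VL}$ at one input level, using the same expressions as the module, and evaluates the magnitude of a series-RLC low-pass response, $1/(1 - \omega^2 L_{VL}C_{VL} + j\omega R_{VL}C_{VL})$ with $\omega = 2\pi f$. The 6 mA drive current, the 0.1 GHz frequency grid and the function name `lpf_gain_db` are assumptions made for illustration only.

```python
import math

# Module parameters in SI units (same values as the Verilog-A parameters).
ETA = 0.78        # slope efficiency, W/A
I_TH = 0.6e-3     # threshold current, A
D_SQ = 57.76e21   # D-factor squared, Hz^2/A
K = 0.25e-9       # K-factor, s
GAMMA0 = 37e9     # damping factor offset, 1/s
C_VL = 100e-15    # capacitance, F


def lpf_gain_db(freq_hz, p_in):
    """Gain in dB at freq_hz with L and R frozen at input level p_in = eta*(I - I_TH)."""
    l_vl = ETA / (4 * math.pi ** 2 * C_VL * D_SQ * p_in)
    r_vl = (K * D_SQ * p_in + GAMMA0 * ETA) / (4 * math.pi ** 2 * D_SQ * C_VL * p_in)
    w = 2 * math.pi * freq_hz
    h = 1.0 / complex(1.0 - w * w * l_vl * C_VL, w * r_vl * C_VL)
    return 20 * math.log10(abs(h))


p_in = ETA * (6e-3 - I_TH)                    # assumed 6 mA drive current
freqs = [n * 0.1e9 for n in range(1, 401)]    # 0.1 GHz to 40 GHz
gains = [lpf_gain_db(f, p_in) for f in freqs]

peak_db = max(gains)
f_peak = freqs[gains.index(peak_db)]
# First grid point beyond the peak where the gain has dropped to -3 dB or lower.
f_3db = next(f for f, g in zip(freqs, gains) if f > f_peak and g <= -3.0)

print(f"resonance peak: {peak_db:.2f} dB at {f_peak / 1e9:.1f} GHz")
print(f"-3 dB frequency: {f_3db / 1e9:.1f} GHz")
```

This is a linearized view only: in the transient simulation the element values follow the instantaneous input, so the peaking seen on a large-signal edge will differ from this frozen-bias estimate.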

## 6.6.2. VCSEL Design

The electrical and optical parts of the VCSEL described in (6) were built in Cadence and the simulated output power was measured. The VCSEL was designed as shown in Figure 6-9.



Figure 6-9 Cadence Schematic of the VCSEL Model (both static and dynamic)

The left part of the circuit is the electrical half of the laser. The right part is the optical half of the laser and gives us the output power in mW. The optical output is inherently low-pass filtered, and this is emulated here by a Verilog-A module. The symbol for the VCSEL is shown in Figure 6-10.



Figure 6-10 Cadence symbol for the VCSEL model taking in Current (mA) and emulating output power (mW) as a Voltage

## 7. DRIVER (PAM-4)

### 7.1. Top-Level Design

The Cherry-Hooper pre-driver provides an amplified, high-speed current to the driver. The driver architecture proposed in this work is shown in Figure 7-1. The architecture is made up of three blocks, viz., the delay chain, the pre-emphasis driver and the main driver unit. The main driver needs to be able to drive large currents (up to 10 mA) through the VCSEL. The pre-emphasis pulse width is set by a 4-tap digital delay line, whose tap is selected by 2 switches  $S_0$  and  $S_1$ .

The delay chain is connected to the pre-emphasis driver unit, which supplies a scaled, inverted and delayed signal to be added onto the main modulated signal to provide 1-tap FIR pre-emphasis.

The cathode of the VCSEL is connected to the driver, and its anode is supplied with 3 V.



Figure 7-1 The top-level block diagram of the Driver and Cherry Hooper Predriver

## 7.2. Delay Tap

The delay chain is constructed of delay cells, each made up of 2 buffers cascaded in series. There are 3 delay cells in total, providing selectable tap delays. Each delayed branch is selected through a 2-bit selection code  $S_1S_0$ : 00 gives the smallest delay (30 ps) and each increment of the code adds 10 ps. The delays given by the chain are enumerated in Table 7-1.

| Switch Selection ($S_1S_0$) | Delay |
|-----------------------------|-------|
| 00                          | 30 ps |
| 01                          | 40 ps |
| 10                          | 50 ps |
| 11                          | 60 ps |

Table 7-1 The delays introduced by the buffer delay tap selection bits
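The mapping in Table 7-1 is linear in the selection code, which the following sketch makes explicit. The function name `tap_delay_ps` is hypothetical, and the 12.8 GBaud symbol rate used to express each delay as a fraction of the unit interval is taken from the NRZ data rate stated for this design.

```python
BASE_DELAY_PS = 30.0   # delay for S1S0 = 00 (Table 7-1)
STEP_PS = 10.0         # additional delay per code increment (Table 7-1)
UI_PS = 1e3 / 12.8     # unit interval at 12.8 GBaud, in ps


def tap_delay_ps(s1, s0):
    """Return the tap delay in ps for the selection bits S1 and S0."""
    if s1 not in (0, 1) or s0 not in (0, 1):
        raise ValueError("S1 and S0 must each be 0 or 1")
    code = (s1 << 1) | s0
    return BASE_DELAY_PS + STEP_PS * code


for s1 in (0, 1):
    for s0 in (0, 1):
        delay = tap_delay_ps(s1, s0)
        print(f"S1S0 = {s1}{s0}: {delay:.0f} ps ({delay / UI_PS:.2f} UI)")
```

All four settings stay below one unit interval, so every selectable tap is a fractional-bit delay.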



Figure 7-2 Cadence schematic of CML buffer in TSMC65nm Technology

Each delay cell is constructed in the following way:



Figure 7-3 Cadence schematic of CML buffer 'Delay-Cell' in TSMC65nm Technology

The delay tap has the following construction:



Figure 7-4 Cadence schematic of CML 'Delay-Tap Chain' in TSMC65nm Technology



Figure 7-5 Cadence simulation of the transient response of the Delay Tap

The output of CML 'Delay-Tap Chain' in TSMC65nm Technology is shown in Figure 7-5. The input is plotted in red, the  $S_{00}$  delay configuration is plotted in blue. The traces in violet, orange and green represent the additional delays (to be added to  $S_{00}$  delay) for the delay configurations of  $S_{01}$ ,  $S_{10}$  and  $S_{11}$  respectively.

## 7.3. The Main Driver unit

The Main Driver unit is shown in Figure 7-6. The figure shows the output stage, which consists of two current switches with differential input and single-ended output. Cascodes improve the current switching symmetry by isolating the drains of the input differential pairs from the unbalanced output swing.
A secondary source  $(V_H)$  provides the bias current to the VCSEL which is determined by the resistance  $R_{BIAS}$ . Finally, the output is matched with the transmission lines leading to the VCSEL by terminating the driver with  $R_{MATCH} = 50\Omega$ .

As PAM-4 requires the current to be modulated between 4 distinct levels, we need 2 current switches with their currents in the ratio 1:2, so that the four levels are equally spaced and the three eyes have equal height.
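The effect of the 1:2 weighting can be checked with a few lines of arithmetic. In the sketch below, the 3 mA LSB current and the 1 mA bias current are assumed values chosen only so that the top level reaches 10 mA. They are not the currents used in the design.

```python
def pam4_levels(i_lsb_ma, ratio=2.0, i_bias_ma=0.0):
    """Return the four output current levels (mA) for an LSB:MSB current ratio of 1:ratio."""
    levels = []
    for msb in (0, 1):
        for lsb in (0, 1):
            levels.append(i_bias_ma + lsb * i_lsb_ma + msb * ratio * i_lsb_ma)
    return sorted(levels)


def eye_heights(levels):
    """Return the spacing between adjacent levels, i.e. the three eye heights."""
    return [hi - lo for lo, hi in zip(levels, levels[1:])]


# Assumed currents for illustration: 3 mA LSB switch, 1 mA bias.
balanced = pam4_levels(3.0, ratio=2.0, i_bias_ma=1.0)
skewed = pam4_levels(3.0, ratio=3.0, i_bias_ma=1.0)

print("1:2 levels:", balanced, "eyes:", eye_heights(balanced))
print("1:3 levels:", skewed, "eyes:", eye_heights(skewed))
```

Any ratio other than 1:2 makes the middle eye taller or shorter than the outer two, which is why the two current switches are weighted this way.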



Figure 7-6 Main PAM-4 Driver Architecture

## 7.3.1. Cadence Schematics



Figure 7-7 Cadence Schematic of the Main Driver Unit

Figure 7-7 shows the schematic of the Main driver unit. The schematic also includes the package parasitics introduced by a QFN package. The circuit is terminated by a 50  $\Omega$  resistance. This resistance is built from *rppoly* resistor (poly p+ doped with salicide) to achieve the low resistance required.

## 7.4. Pre-emphasis Driver

#### 7.4.1. FIR Filter Equalization

One way of achieving the high pass characteristic at the driver is by using inductor peaking. A peaked current can be achieved by adding a speedup inductor as a shunt to the driver. This approach can be done either by using a passive inductor or an active inductor. While this approach was shown to work, it has many disadvantages. It introduces a significant area overhead (in the case of passive inductor), but, most importantly, the parameters of the inductor-based high-pass filter are not tunable, making it harder to exactly match the VCSEL characteristic.

In a digital system, the high-pass driver can be realized with a latch-based FFE. One advantage of a digital FFE is that the filter tap delay automatically tracks the bit interval. This feature, however, is only important for filters with a large number of taps. The power and area overheads associated with clock distribution and latching of high-speed data make a digital FFE system far less attractive than a continuous-time FFE where the delay between the filter taps is created by a tunable delay line (6).

The pre-emphasis driver is based on a 1-tap FIR filter design. The pre-emphasis data is acquired from the original signal itself, by delaying, inverting and scaling the original signal. This pre-emphasis pulse is added onto the original pulse to (as the name suggests) emphasize the high-frequency components of the transmitted signal. This helps in balancing the non-linearity of the VCSEL, giving a cleaner optical eye at the output. The graphical illustration of the FFE circuit operation (laser driver with pre-emphasis) is shown in Figure 7-8. The delay is selected using the 2-bit buffer delay cell and the tap weight is judged by looking at the output without any pre-emphasis.



Figure 7-8 The current waveform before and after FFE pre-emphasis

The FIR filter output is the original discrete-time signal minus a scaled and delayed copy of itself:

$$y(n) = x(n) - \alpha \cdot x(n - n_0). \tag{7-1}$$
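A short time-domain sketch of Equation (7-1) shows how the subtraction produces the pre-emphasis pulse on each edge. The oversampling factor of 8 samples per bit and the 3-sample delay (3/8 of a bit period, roughly 29 ps at 12.8 GBaud) are assumptions for illustration. The tap weight $\alpha = 0.135$ is the value used in Appendix A.

```python
SAMPLES_PER_UI = 8   # assumed oversampling factor
ALPHA = 0.135        # tap weight (value used in Appendix A)
N0 = 3               # assumed delay in samples: 3/8 of one bit period


def pre_emphasize(x, alpha, n0):
    """Apply y[n] = x[n] - alpha * x[n - n0]; x is held at x[0] before the start."""
    return [x[n] - alpha * x[max(n - n0, 0)] for n in range(len(x))]


bits = [0, 1, 1, 0]
x = [float(b) for b in bits for _ in range(SAMPLES_PER_UI)]
y = pre_emphasize(x, ALPHA, N0)

rise = SAMPLES_PER_UI        # index of the rising edge
fall = 3 * SAMPLES_PER_UI    # index of the falling edge
print("after rising edge :", y[rise:rise + 5])
print("after falling edge:", y[fall:fall + 5])
```

For the first `N0` samples after each transition the delayed copy has not yet changed, so the output overshoots its settled value by $\alpha$ on the rising edge and undershoots by $\alpha$ on the falling edge.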

Here  $n_0$  is the delay. To achieve higher bandwidth, we choose  $n_0$  to be less than one bit period. To illustrate this, the amplitude is plotted against frequency for different values of  $n_0$  in Figure 7-9.



Figure 7-9 Change in high pass response of the FIR filter for different fractional delays  $n_0$ 

For the analysis, we take the discrete-time Fourier transform of (7-1):

$$Y(\omega) = X(\omega) - \alpha X(\omega) \cdot e^{-j\omega n_0}. \tag{7-2}$$

From (7-2), we get the transfer function

$$H(\omega) = \frac{Y(\omega)}{X(\omega)} = 1 - \alpha \cdot e^{-j\omega n_0}, \tag{7-3}$$

whose squared magnitude is

$$|H(\omega)|^2 = (1 - \alpha \cos(\omega n_0))^2 + (\alpha \sin(\omega n_0))^2$$

$$\Rightarrow |H(\omega)|^2 = 1 + \alpha^2 - 2\alpha \cos(\omega n_0). \tag{7-4}$$

As can be clearly seen from (7-4), the bandwidth of the FIR High pass response can be tuned by tuning the value of  $n_0$  (the 3dB frequency can be increased by introducing a fractional delay).
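To make the dependence on the delay concrete, the sketch below evaluates Equation (7-4) in continuous-time form, writing $\omega n_0$ as $2\pi f\, t_{del}$ (an assumption about how the discrete delay maps to the buffer delay). The tap weight $\alpha = 0.135$ is the value used in Appendix A, and the delays are the four settings of Table 7-1.

```python
import math

ALPHA = 0.135   # tap weight (value used in Appendix A)


def fir_gain(freq_hz, alpha, t_del_s):
    """Magnitude |H| from Equation (7-4), with omega*n0 written as 2*pi*f*t_del."""
    return math.sqrt(1 + alpha ** 2 - 2 * alpha * math.cos(2 * math.pi * freq_hz * t_del_s))


def peak_frequency_hz(t_del_s):
    """Frequency of maximum boost, where cos(2*pi*f*t_del) = -1."""
    return 1.0 / (2.0 * t_del_s)


for t_del_ps in (30, 40, 50, 60):
    t_del = t_del_ps * 1e-12
    f_pk = peak_frequency_hz(t_del)
    boost_db = 20 * math.log10(fir_gain(f_pk, ALPHA, t_del) / fir_gain(0.0, ALPHA, t_del))
    print(f"t_del = {t_del_ps} ps: peak at {f_pk / 1e9:.2f} GHz, boost {boost_db:.2f} dB over DC")
```

The amount of boost, $(1+\alpha)/(1-\alpha)$, is set by the tap weight alone, while the delay sets where the boost occurs: halving the delay doubles the frequency of the peak.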

#### 7.5. Driver Simulation

The driver module with delay blocks is shown in Figure 7-10. The PAM-4 laser driver with pre-emphasis is shown in Figure 7-11. The full transmitter test bench with the VCSEL is shown in Figure 7-12. The NRZ eye diagram with and without pre-emphasis is shown in Figure 7-13. The PAM4 eye diagram with and without pre-emphasis is shown in Figure





Figure 7-10 Driver module with delay blocks for pre-emphasis







Figure 7-12 Cadence Schematic of Full transmitter test bench with VCSEL



Figure 7-13 Eye diagram of NRZ Driver (a) without pre-emphasis and with pre-emphasis at (b)  $t_{del} = 28 \ ps$ , (c)  $t_{del} = 36 \ ps$  and (d)  $t_{del} = 44 \ ps$ 





From the eye diagrams above, pre-emphasis clearly widens the vertical eye opening for both NRZ and PAM-4 modulation and also reduces the jitter. The amount of improvement depends on the tap delay selected by the switches ( $S_{ab}$ ;  $a, b = 0,1$ ). The improvements for the different tap delays in NRZ and PAM-4 modes are shown in Table 7-2 and Table 7-3 respectively. This shows that pre-emphasis improves the optical performance of the VCSEL.

Table 7-2 NRZ Pre-emphasis performance improvements

NRZ Mode – Improvements from pre-emphasis FFE equalization

| Pre-emphasis Tap Delay | Eye Vertical Opening Improvement | Jitter Improvement |
|------------------------|----------------------------------|--------------------|
| 28 ps                  | 7.3%                             | 15.05%             |
| 36 ps                  | 12.7%                            | 26.22%             |
| 44 ps                  | 13.57%                           | 29.9%              |

PAM-4 Mode – Improvements from pre-emphasis FFE equalization

| Pre-emphasis Tap Delay | Top-Eye Vertical Opening Improvement | Middle-Eye Vertical Opening Improvement | Bottom-Eye Vertical Opening Improvement | Jitter Improvement |
|------------------------|--------------------------------------|-----------------------------------------|-----------------------------------------|--------------------|
| 20 ps                  | 4.8%                                 | -3.9%                                   | 0.1%                                    | 6.1%               |
| 28 ps                  | 15%                                  | 9.1%                                    | 15.3%                                   | 15.87%             |
| 36 ps                  | 29.1%                                | 24.3%                                   | 15.45%                                  | 13.97%             |
| 44 ps                  | 29.5%                                | 21.4%                                   | 11.7%                                   | 13.8%              |

 Table 7-3 PAM-4 Pre-emphasis performance improvements

The power consumption of each component is given in Table 7-4.

| COMPONENT                   | POWER CONSUMPTION |
|-----------------------------|-------------------|
| 32:2 Serializer             | 48 mW             |
| Cherry-Hooper Pre-Amplifier | 27 mW             |
| 2-bit Delay Tap             | 49 mW             |
| VCSEL Driver + VCSEL        | 16 mW             |
| Biasing Circuits            | 3.3 mW            |
| Total Power                 | 143.3 mW          |

 Table 7-4 Power consumption of the various components in the design

We need to provide 3 bias voltages (latch, mux/buffer and driver/pre-driver). Each bias voltage is provided by a current mirror with 1mA in each branch.

Summing the entries of Table 7-4 gives a total power of  $143.3 \ mW$ . At the PAM-4 data rate of 25.6 Gbps, the energy efficiency achieved by the transmitter (serializer + driver) is about 5.6 pJ/bit.
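As a cross-check of the energy-efficiency figure, the sketch below sums the Table 7-4 entries and divides by the PAM-4 data rate. The 25.6 Gbps rate is the one quoted in the title and abstract of this thesis. The variable names are illustrative.

```python
# Power per component in mW, copied from Table 7-4.
power_mw = {
    "32:2 Serializer": 48.0,
    "Cherry-Hooper Pre-Amplifier": 27.0,
    "2-bit Delay Tap": 49.0,
    "VCSEL Driver + VCSEL": 16.0,
    "Biasing Circuits": 3.3,
}

DATA_RATE_GBPS = 25.6   # PAM-4 data rate from the thesis title

total_mw = sum(power_mw.values())
# mW divided by Gbps gives pJ/bit directly (1e-3 W / 1e9 bit/s = 1e-12 J/bit).
energy_pj_per_bit = total_mw / DATA_RATE_GBPS

print(f"total power   : {total_mw:.1f} mW")
print(f"energy per bit: {energy_pj_per_bit:.2f} pJ/bit")
```

The serializer and the delay tap together account for roughly two thirds of the total, so these two blocks are the natural starting point for any further power reduction.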

### 8. CONCLUSION AND FUTURE WORK

In this work, the design of a high-speed laser driver circuit is presented. The complete design process, from the literature review to the final driver schematic characterization, is documented. The driver and serializer were designed in the 65 nm TSMC technology. The proposed 65 nm driver achieved about 5.6 pJ/b energy efficiency at a 25.6 Gbps bit-rate by using PAM-4.

The driver has a switch to select between NRZ (SEL = 0) and PAM4 (SEL = 1). It also has a 2-bit switch to select the delay of the pre-emphasis data. The wire-bond parasitics were modelled to fit a QFN package and the design was simulated with these modelled parasitics included.

The VCSEL to be driven was modelled using Verilog-A to emulate the non-linear conversion of current to power by the VCSEL. The output power of the VCSEL is represented by a voltage in the model. The driver exhibited improvement over the driver circuits presented in literature. However, the energy efficiency and performance simulation results did not include post-layout parasitics, which are expected to negatively affect the performance.

As the design presented is not ready for fabrication, the first future steps would be to proceed with the layout design and verification. Based on the insights we obtained during this study, the on-chip parasitics should be significant but not detrimental to the performance of the driver. Additional variability tests will also be necessary. However, since the latter half of the chip is composed solely of N-MOS transistors, the design corners will only affect the CMOS serializer stage. After the fabrication of the chip is completed, various experimental setups could be devised to test both the potential of the driver as well as the VCSEL.

Due to the versatility provided by the chip, by utilizing the selection voltages, the driver could be adapted to be used in different link scenarios and maybe even different VCSEL diodes.

In terms of expanding or improving the driver design, the most straightforward steps would be to implement alternative topologies in the driving stages in order to identify the best way to drive the VCSEL diode load. Additionally, from a link perspective, error correction or equalization could be included to minimize the BER if the eye opening presented in this report gets degraded by the transfer medium and noise. These stages, however, would have a negative impact on the energy efficiency achieved, although they would help decrease the bit error rate.

In conclusion, in this report we have demonstrated a PAM-4 laser driver circuit based on CML architecture, implemented in a 65 nm TSMC process. The models of the interconnects and the VCSEL diode provided insight on the scattering parameters that the signal will experience from the driver to the load and allowed us to better assess the complete system performance.

# BIBLIOGRAPHY

1. **www.itrs.net**, *International Technology Roadmap for Semiconductors*. [Online] 2007. http://www.itrs.net/Links/2007ITRS/ExecSum2007.pdf.

2. **Technologies, Keysight.** *PAM-4 Design Challenges and the Implications on Test - Application Note.* [Online] 2015. literature.cdn.keysight.com/litweb/pdf/5992-0527EN.pdf.

3. **Bryan Casper, Frank O'Mahony.** *Clocking Analysis, Implementation and Measurement Techniques for High-Speed Data Links—A Tutorial.* Jan 2009, Vol. 56.

4. **Lai, Yeong-Lin and Cheng-Yu Ho.** *Electrical modeling of quad flat no-lead packages for high-frequency IC applications.* s.l. : IEEE, 2004. TENCON 2004. 2004 IEEE Region 10 Conference. Vol.500

5. **Westbergh, P., Gustavsson, J. S., Haglund, Å., Skold, M., Joel, A., & Larsson, A.** *High-speed, low-current-density 850 nm VCSELs.* 3, 2009, IEEE Journal of Selected Topics in Quantum Electronics, Vol. 15, pp. 694-703.

6. **Raj, Mayank, Manuel Monge, and Azita Emami.** *A Modelling and Nonlinear Equalization Technique for a 20 Gb/s 0.77 pJ/b VCSEL Transmitter in 32 nm SOI CMOS.* 8, 2016, IEEE Journal of Solid-State Circuits, Vol. 51, pp. 1734-1743.

7. **Finisar Advanced Optical Components Division.** *Application Note - Modulating VCSELs.* Finisar Website. [Online] https://www.finisar.com/sites/default/files/downloads/application\_note\_modulating\_vcsels.pdf.

8. **Hyun, Seok Hun.** *DESIGN OF HIGH-SPEED CMOS LASER DRIVER USING.* School of Electrical and Computer Engineering, Georgia Institute of Technology. Atlanta : s.n., Nov 2004. Doctoral Dissertation.

9. **Kern, Alexandra, Anantha Chandrakasan, and Ian Young.** *18Gb/s optical IO: VCSEL driver and TIA in 90nm CMOS.* s.l. : IEEE, 2007. VLSI Circuits, 2007 IEEE Symposium. pp. 276-277.

## **Appendix A - VCSEL Frequency Response Testbench**

The frequency analysis Cadence Testbench to locate the optimum delay can be seen in Figure A-1.



Figure A-1 VCSEL AC Analysis Testbench

We specify the modulation current, bias current and  $\alpha$  and  $t_{del}$  for FFE equalization. Various parametric plots were analyzed based on this Testbench by fixing  $\alpha$  and varying  $t_{del}$  and vice versa. The VCSEL Output Power magnitude (measured in dB) is plotted against the frequency of current modulation for  $\alpha = 0.135$  and for delays from 0 ps to 40 ps. This is shown in Figure A-2.



Figure A-2 Frequency Response of VCSEL Output Power

The optimum delay range appears to be between 20 *ps* and 40 *ps*. So, we use this to design the range of our 2-bit buffer.