Hybrid Clock Recovery for a Gigabit POF Transceiver implemented on FPGA

Julio Ramírez, Antonino Nespola, Stefano Straullu, Paolo Savio,
Silvio Abrate, Member, IEEE, and Roberto Gaudino, Senior Member, IEEE

Abstract—In this paper we present a Clock Recovery System implemented on FPGA and integrated into the Gigabit Ethernet Media Converter for PMMA SI-POF developed within the framework of the POF-PLUS EU Project. We demonstrate timing synchronization using only one sample per symbol from a highly distorted and attenuated 2-PAM signal without requiring any sort of pre-equalization. This is achieved by means of a hybrid analog-digital PLL with a Timing Error Detector based on a modified version of the Müller and Mueller algorithm, a Loop Filter and a VCXO.

Index Terms—Clock Recovery, FPGA, DSP, PLL, TED, Gigabit Ethernet, Optical Communications, Polymer Optical Fiber

I. INTRODUCTION

In recent years the increasing demand for bandwidth has driven the need for higher-performance communication systems. As a consequence, traditional communication solutions have been dramatically improved and significant research efforts have led to the creation of new technologies capable of coping with ever-increasing requirements. As part of this trend, European Telecom Operators, together with the European Union, have been actively working on policies to bring broadband access to the European continent. In this context, the European Seventh Framework Programme (FP7) hosted the POF-PLUS Project [1], an initiative aimed at promoting research and development of short-range optical communication solutions based on Plastic Optical Fiber (POF) to provide wired and wireless services for in-building/in-home networks and at investigating the feasibility of optical interconnect applications. As reported in [2, 3], this initiative led to the implementation on a Field Programmable Gate Array (FPGA) of a Gigabit Ethernet Media Converter in full compliance with the IEEE 802.3 Ethernet Standard and capable of overcoming the impairments introduced by the POF channel. As shown in [4], the most critical issue in POF transmission schemes is to overcome the severe limitations in terms of available channel bandwidth.

In fact, the available electrical-to-electrical 6 dB bandwidth (from the electrical input of the transmitter to the electrical output of the photodiode) is below 100 MHz, while the Media Converter transmits above 1 Gbit/s. As a result, the received eye diagram is completely closed due to inter-symbol interference. On top of this, due to fiber attenuation, the received signal after the POF target length is very weak, so the signal-to-noise ratio is also very low. The key elements of the proposed architecture are thus:

- A highly optimized equalization algorithm to overcome inter-symbol interference (explained in detail in [2])
- Forward Error Correction (FEC) in the form of a (255, 237) Reed-Solomon (RS) code [2]
- A clock recovery system based on a properly optimized Phase-Locked Loop (PLL), thereby able to recover synchronism with a completely closed eye diagram.

As stated in [3], the first versions of the transceiver did not include a clock recovery (CR) system; therefore, in order to test and debug the proposed architecture, it was necessary to bypass the clock between the transmitting and receiving nodes. In summary, the system was able:

- to perform 2-PAM (Pulse Amplitude Modulation), Resonant Cavity Light Emitting Diode (RC-LED) based transmission over 50+ meters of standard A4a.2 1 mm Poly-Methyl-MethAcrylate Step-Index Plastic Optical Fiber (PMMA SI-POF) with a high optical power margin of 4 dB;
- to run real traffic, implementing a complete media converter between standard Gigabit Ethernet 1000Base-T and the PMMA SI-POF.

After having successfully validated the operation of this first prototype, we proceeded to complete it by implementing the required timing recovery system.

In this paper we describe the chosen CR architecture and its hardware implementation on FPGA. In particular, we demonstrate the timing recovery capabilities of the system for continuous-mode data transmission based on 2-PAM signals, without requiring pre-equalization schemes and while achieving the full functionality of the previously validated Media Converter. The next section presents the design process of the system.

II. TIMING RECOVERY SYSTEM

A. POF Channel

The first step of the design process consists of obtaining an expression that models the impairments inflicted by the POF channel on the received signal.

\[ P(\omega) = H_{TX}(\omega) \cdot H_{POF}(\omega) \cdot H_{RX}(\omega) \]  \hspace{1cm} (1)

where \( H_{POF}(\omega) \) is modeled as a linear time invariant (LTI) low pass filter, while \( H_{TX}(\omega) \) and \( H_{RX}(\omega) \) correspond to the theoretical transfer functions of the 2-PAM transmitter plus RC-LED and the optoelectronic receiver PD, respectively. Once the channel is modeled, the signal at the output of the optoelectronic receiver \( Y_{R}(t) \) can be expressed as

\[ Y_{R}(t) = \sum_{n=-\infty}^{\infty} x_{n} \, p(t - nT - \epsilon T) + v(t) \]  \hspace{1cm} (2)

where \( x_{n} \) denotes the transmitted 2-PAM symbols, \( p(t) \) is the overall pulse response corresponding to \( P(\omega) \), \( T \) is the symbol period, \( v(t) \) is the inherent additive colored Gaussian noise introduced during the optoelectronic conversion and \( \epsilon T \) is the unknown fractional time delay between the transmitter and the receiver \((-1/2 < \epsilon < 1/2)\).
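The received-signal model can be illustrated with a minimal sketch that takes one sample per symbol from an NRZ 2-PAM stream after a first-order RC low-pass. Several things here are assumptions made only for illustration: the single-pole channel, the 100 MHz cutoff, the white (rather than colored) noise, and the convention that `eps` is measured from the start of each symbol instead of from its center. The function name and its parameters are hypothetical.

```python
import math
import random


def sample_rc_filtered_pam(symbols, eps, bandwidth_hz, symbol_rate_hz,
                           noise_std=0.0, seed=0):
    """Return one sample per symbol of NRZ 2-PAM after an RC low-pass.

    symbols: iterable of +1/-1 levels.
    eps: sampling offset as a fraction of the symbol period (0 <= eps < 1),
         measured from the start of each symbol (assumed convention).
    """
    rng = random.Random(seed)
    tau = 1.0 / (2.0 * math.pi * bandwidth_hz)   # RC time constant
    period = 1.0 / symbol_rate_hz                # symbol period T
    decay_full = math.exp(-period / tau)         # decay over a whole symbol
    decay_part = math.exp(-eps * period / tau)   # decay up to the sample
    state = 0.0                                  # filter output at symbol start
    samples = []
    for level in symbols:
        # Exact step response of a first-order filter towards `level`.
        value = level + (state - level) * decay_part
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        samples.append(value + noise)
        state = level + (state - level) * decay_full
    return samples
```

With a cutoff far below the symbol rate, consecutive samples depend strongly on previous symbols, which is the inter-symbol interference that closes the eye.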

In order to maximize noise immunity, \( Y_{R}(t) \) must be sampled at the instants of maximum eye opening, referred to as optimum sampling instants; identifying them implies adjusting the phase of the sampling clock according to \( \epsilon T \). For this purpose, the receiver must contain a clock synchronizer, which is a device that produces an estimate \( \hat{\epsilon} \) of the mentioned delay [5].

There are two main types of clock synchronizers, which are categorized, depending on their architecture, as feedforward and feedback synchronizers, the latter also referred to as error-tracking synchronizers [5]. Further classifications based on other criteria can be made. For instance, if the synchronizer relies on decided symbols to produce a timing estimate, then it is defined as decision directed; otherwise, it is non-data aided [5]. Moreover, it can be further categorized depending on its operation domain, i.e. analog or digital, as being a continuous or discrete time system, and depending on the data transmission mode as being a burst or continuous mode clock synchronizer.

For the present case, it was decided to implement a continuous-mode and non-data aided Error Tracking Synchronizer by means of a hybrid analog-digital architecture.

B. Hybrid Synchronizer Architecture

The general diagram of the resulting system is depicted in Fig. 3. The incoming symbols, transmitted at a line rate of 1.0991 Gbps (nominally 1.1 Gbps), are sampled by the on-board analog-to-digital converter (ADC), which operates in Double Data Rate (DDR) mode. The samples are forwarded to the Timing Error Detector (TED), which is based on the Müller & Mueller (M&M) algorithm [6]. The error signal is then averaged by the loop filter and finally converted to the analog domain by a \( \Delta \Sigma \) modulator followed by an RC filter, which together operate as a DAC [7] and drive the VCXO. It is hence evident that error-tracking synchronizers apply the PLL concept to derive a sampling clock from the received signal [5].

![Fig. 3  Hybrid Clock Recovery Architecture](image)

Furthermore, in order to maximize flexibility and ease eventual upgrades, it was decided to implement most of the system inside the FPGA, so that the scalability of our architecture towards higher bit rates is mostly limited by the analog devices present in the loop, and in particular by the capabilities of the ADC. In the architecture shown in Fig. 3, the ADC is the “fastest” and most critical circuit required since, inside the FPGA, all the subsequent signal processing is done using highly parallelized algorithms. In fact, the FPGA clock runs at 275 MHz, significantly lower than the bit rate.

In the following, the design and implementation of each block composing this hybrid architecture is presented.

**Müller & Mueller TED**

As mentioned above, the M&M TED is implemented according to the timing recovery methods proposed in [6]. Typically defined as a decision-directed synchronizer [5], its conventional implementation diagram, as part of a clock recovery system, is shown in Fig. 4.

As seen, this device derives the delay $\Delta T$ by estimating the error $e_k$ between the equalized PAM signal $Y_s(k)$ and the decided symbols $a_k$, so that, assuming sample times $t = kT$, we have that the error for the $k^{th}$ symbol is expressed as [6]

$$e_k = a_{k-1}Y_s(kT + \Delta T) - a_kY_s((k-1)T + \Delta T)$$  \hspace{1cm} (3)

As can be noticed, the diagram shown in Fig. 4 differs from the architecture proposed in Fig. 3 in the way in which $e_k$ is derived: instead of estimating the phase error from the pre-equalized signal $Y_s(k)$ and its corresponding decided symbol, the M&M TED is modified so that it requires only the un-equalized received samples $Y_R(k)$. The redefined M&M algorithm can be expressed as

$$e_k = \text{sign}_{k-1}Y_R(kT + \Delta T) - \text{sign}_kY_R((k-1)T + \Delta T)$$  \hspace{1cm} (4)

where $\text{sign}_k$ denotes the sign of the $k^{th}$ sample of the received signal $Y_R(k)$.
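As a minimal sketch of (4), the error can be computed directly from consecutive un-equalized samples taken at one sample per symbol. The function names are hypothetical, and treating a zero-valued sample as positive is an assumption, since the source does not say how that case is handled.

```python
def sign(value):
    """Sign of a sample; zero is treated as positive (assumed convention)."""
    return 1 if value >= 0 else -1


def mm_ted_sign(samples):
    """Modified M&M timing error for each pair of consecutive samples.

    Implements e_k = sign(y[k-1]) * y[k] - sign(y[k]) * y[k-1],
    where y[k] is the un-equalized sample of the k-th symbol.
    """
    errors = []
    for k in range(1, len(samples)):
        prev, curr = samples[k - 1], samples[k]
        errors.append(sign(prev) * curr - sign(curr) * prev)
    return errors
```

For example, `mm_ted_sign([1.0, 0.5, -1.0])` returns `[-0.5, -0.5]`, while two equal samples give a zero error.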

The M&M TED, as defined by (4), was first implemented in the FPGA and then validated and parameterized. As part of the parameterization process, it is of the utmost importance to determine the sensitivity $K_d$ of the TED, because it is required to design the Loop Filter. In order to retrieve its value it is necessary to derive the S-Curve, which results from plotting the estimated phase error $\hat{\theta}$ (measured at the output of the TED operating in open loop) versus the actual phase difference.

The setup used to validate the system included a pseudo-random binary sequence (PRBS) generator that allowed delaying the transmitted signal with respect to the transmitting clock. For the experiment, PRBS sequences of different lengths and with different phase delays were transmitted, then filtered by an RC filter similar to the POF channel and finally forwarded to the TED. The resulting S-Curve is shown in Fig. 5. In order to estimate $K_d$, it is necessary to determine the slope of the curve in the vicinity of the zero crossing point. For this particular case, this analysis yielded a value for $K_d$ of 0.35 V/rad.
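The slope extraction can be sketched as a least-squares line fit over the S-Curve points that lie close to the zero crossing. This is only an illustration: the function name, the `window` parameter and the assumption that phases are expressed relative to the zero crossing (so that it sits at phase 0) are not taken from the source.

```python
def estimate_kd(phases_rad, ted_outputs, window_rad):
    """Estimate the TED sensitivity K_d as the S-Curve slope near zero.

    phases_rad: applied phase offsets, relative to the zero crossing.
    ted_outputs: averaged open-loop TED output for each phase offset.
    window_rad: only points with |phase| <= window_rad enter the fit.
    """
    points = [(p, e) for p, e in zip(phases_rad, ted_outputs)
              if abs(p) <= window_rad]
    if len(points) < 2:
        raise ValueError("need at least two points inside the window")
    mean_p = sum(p for p, _ in points) / len(points)
    mean_e = sum(e for _, e in points) / len(points)
    num = sum((p - mean_p) * (e - mean_e) for p, e in points)
    den = sum((p - mean_p) ** 2 for p, _ in points)
    if den == 0:
        raise ValueError("phase values inside the window are all equal")
    return num / den
```

Keeping the window narrow matters because an S-Curve is only approximately linear around its zero crossing.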

**Loop Filter and closed loop analysis**

This section describes the mathematical model used for designing the Loop Filter. The analysis is performed using the analog model of a PLL shown in Fig. 6. The transfer function of the analog PLL is expressed as [5]

$$H(s) = \frac{\theta_o(s)}{\theta_i(s)} = \frac{K_dK_oF(s)}{s + K_oK_dF(s)}$$  \hspace{1cm} (8)

where $\theta_o$ and $\theta_i$ represent the phase of the VCO and of the incoming signal respectively, $F(s)$ is the transfer function of the loop filter, and $K_o$ is the gain of the VCO.

![Fig. 6  PLL Block Diagram](image)

For the purposes of this project, it was decided to implement a second order PLL capable of tracking the phase and frequency deviations of the incoming signal with respect to the clock generated by the VCO. Such a device is obtained by designing the loop filter in the form of a proportional-plus-integral filter. Accordingly, the resulting transfer function of the loop filter can be expressed as [5]

$$F(s) = \frac{\tau_2s + 1}{\tau_1s} = K_1 + \frac{K_2}{s}$$  \hspace{1cm} (9)

where $K_1 = \tau_2/\tau_1$, $K_2 = 1/\tau_1$, and $\tau_1$, $\tau_2$ are the $RC$ time constants of the filter. Now, by substituting (9) in (8) the transfer function of the PLL becomes

$$H(s) = \frac{K_dK_o(K_2 + K_1s)}{s^2 + sK_dK_oK_1 + K_dK_oK_2}$$  \hspace{1cm} (10)

from which the loop gain can be derived as $K = K_dK_oK_1$.

Equivalently, (10) can be expressed in terms of the natural frequency $\omega_n$ and damping factor $\zeta$ as

$$H(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$$  \hspace{1cm} (11)

where

$$\omega_n = \sqrt{\frac{K_dK_o}{\tau_1}} = \sqrt{K_dK_oK_2} \hspace{1cm} \zeta = \frac{\tau_2\,\omega_n}{2} = \frac{K_dK_oK_1}{2\omega_n}$$  \hspace{1cm} (12)

Equations (11) and (12) are used to design the PLL. The damping factor $\zeta$ shapes the transient response of the system (overshoot and settling behavior) and therefore its value is critical for guaranteeing the system stability [5, 7]. Usually, $\zeta$ is set to 0.707 while the rest of the variables are defined accordingly. A commercial Voltage Controlled Crystal Oscillator (VCXO), the Si550 from Silicon Labs, was used to generate the sampling clock. According to laboratory tests this device presents a gain $K_o$ of 99 kHz/V while, as mentioned above, the M&M TED presented a gain $K_d$ of 0.35 V/rad. Finally, the required $\omega_n$ for the system was chosen as 4 kHz. Once these parameters were defined, we proceeded to design and implement the Loop Filter.
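Matching the denominators of (10) and (11) term by term gives $2\zeta\omega_n = K_dK_oK_1$ and $\omega_n^2 = K_dK_oK_2$, which can be solved for the analog loop-filter gains. The sketch below does just that; the function names are hypothetical, and it assumes $\omega_n$ and $K_o$ are supplied in consistent angular units (rad/s and rad/s per volt). Since the source does not state how the 4 kHz and 99 kHz/V figures are converted or how the gains are scaled in the digital domain, the sketch is not expected to reproduce the reported coefficients directly.

```python
import math


def loop_filter_gains(zeta, omega_n, kd, ko):
    """Solve 2*zeta*omega_n = kd*ko*K1 and omega_n**2 = kd*ko*K2."""
    k1 = 2.0 * zeta * omega_n / (kd * ko)
    k2 = omega_n ** 2 / (kd * ko)
    return k1, k2


def closed_loop_params(k1, k2, kd, ko):
    """Inverse mapping: recover (zeta, omega_n) from the filter gains."""
    omega_n = math.sqrt(kd * ko * k2)
    zeta = kd * ko * k1 / (2.0 * omega_n)
    return zeta, omega_n
```

The inverse function is included as a consistency check: feeding the computed gains back should return the damping factor and natural frequency that were requested.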

**Digital Transformation and FPGA implementation of the Loop Filter**

In order to implement the loop filter in the FPGA it is necessary to transform it from the analog domain $F(s)$ into the digital domain $F(z)$. This is achieved by means of a bilinear transformation, which maps the left half of the $s$-plane into the interior of the unit circle of the $z$-plane, thus guaranteeing that any stable system in the analog domain is transformed into a stable digital system. The bilinear transformation is defined as [7]

$$H(z) = H(s)\Big|_{s = \frac{2}{T_S} \frac{1 - z^{-1}}{1 + z^{-1}}}$$

where $T_S$ is the sampling period.
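As a worked illustration, the transformation can be applied to a proportional-plus-integral loop filter $F(s) = K_1 + K_2/s$, which is what $(\tau_2 s + 1)/(\tau_1 s)$ expands to with $K_1 = \tau_2/\tau_1$ and $K_2 = 1/\tau_1$. The symbols $e[n]$, $i[n]$ and $y[n]$ (filter input, integrator state and filter output) are introduced here only for the derivation and do not appear in the source.

$$F(z) = K_1 + K_2 \, \frac{T_S}{2} \, \frac{1 + z^{-1}}{1 - z^{-1}}$$

$$i[n] = i[n-1] + \frac{K_2 T_S}{2} \left( e[n] + e[n-1] \right)$$

$$y[n] = K_1 \, e[n] + i[n]$$

The integrator pole at $s = 0$ maps to $z = 1$, so the digital filter keeps the pure integration that lets the loop track a constant frequency offset.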

The resulting loop filter architecture was implemented as shown in Fig. 7. The gain constants $K_1$ and $K_2$ were defined taking into account the previously defined values of $\omega_n$, $K_d$ and $K_o$ and, as depicted in Fig. 7, were implemented using shift registers instead of the bulkier and slower multipliers. The resulting coefficients were $K_1 = 0.00001994$ and $K_2 = 0.5569$ and, as seen, they were approximated by binary divisions of $2^{-1}$ and $2^{-13}$ respectively.

![Fig. 7 Loop Filter Implementation Diagram](image)
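A multiplier-free loop filter of this kind can be sketched with integer arithmetic, where each gain is a right shift. The sketch assumes that the proportional branch uses the $2^{-1}$ shift and the integral branch the $2^{-13}$ shift, and it uses a plain running sum for the integrator rather than the trapezoidal form produced by the bilinear transformation. The class name and both assumptions are illustrative and are not taken from Fig. 7.

```python
class ShiftLoopFilter:
    """Proportional-plus-integral filter whose gains are powers of two."""

    def __init__(self, prop_shift=1, integ_shift=13):
        self.prop_shift = prop_shift    # proportional gain = 2**-prop_shift
        self.integ_shift = integ_shift  # integral gain = 2**-integ_shift
        self.accumulator = 0            # full-precision sum of the errors

    def update(self, error):
        """Consume one integer TED error and return the filter output."""
        self.accumulator += error
        # Python's >> on negative ints is an arithmetic (sign-preserving)
        # shift, which mirrors a signed shift register in hardware.
        proportional = error >> self.prop_shift
        integral = self.accumulator >> self.integ_shift
        return proportional + integral
```

Feeding `8192`, `0` and `-8192` into a fresh filter returns `4097`, `1` and `-4096`: the accumulator is kept at full precision and only shifted on output, so small errors are not lost before they are integrated.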

**Digital to Analog Conversion**

The DAC was implemented following the Xilinx application note XAPP154 [8]. This document describes its implementation using a $\Delta\Sigma$ modulator, for which it provides a Verilog template, and also gives the schematic for implementing the corresponding passive RC filter. The top-level diagram of the system is shown in Fig. 8.
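The principle of a first-order $\Delta\Sigma$ DAC can be sketched with a generic accumulator-based modulator: the input word is added to an accumulator on every clock cycle and the carry-out forms a one-bit stream whose average equals the input divided by full scale. This is a behavioral illustration only, not the XAPP154 template; the function name and the unsigned input convention are assumptions.

```python
def delta_sigma_bitstream(value, width, n_cycles):
    """Generate the one-bit output of a first-order accumulator modulator.

    value: unsigned input word, 0 <= value < 2**width.
    width: accumulator width in bits.
    n_cycles: number of modulator clock cycles to simulate.
    """
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    mask = (1 << width) - 1
    accumulator = 0
    bits = []
    for _ in range(n_cycles):
        total = accumulator + value
        bits.append(total >> width)   # carry-out is the output bit
        accumulator = total & mask    # keep the low `width` bits
    return bits
```

Over `2**width` cycles the number of ones equals `value`, so an external RC low-pass that averages the stream recovers a voltage proportional to the input word.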

As depicted in Fig. 8, the code provided implements a modulator that operates with a 100 MHz clock; therefore it was necessary to modify the code to operate at 275 MHz (the FPGA clock). Different values of resistance and capacitance were also chosen to implement the external passive filter. For more details regarding this device, the mentioned application note should be consulted. Having described the Clock Recovery System, we now present the results obtained from its validation process and from its operation as part of the fully engineered 1 Gbit/s Media Converter.

III. EXPERIMENTAL RESULTS AND DISCUSSION

**Testing the Clock Recovery System**

A series of tests was performed to parameterize the Clock Recovery System. The first experiments were aimed at measuring the holding window of the system, which is the range of frequencies for which the system is able to lock the clock, and also the jitter throughout this holding window. The experimental setup is shown in Fig. 9.

![Fig. 9 Clock Recovery Experimental Set up](image)

As detailed, it consisted of a PRBS generator, the optoelectronic transmitter (RC-LED), 50 meters of PMMA SI-POF, an optoelectronic converter (A3PICs), the Media Converter (CR+EQ) implemented inside the FPGA, the external VCXO, a bit error rate (BER) tester and a real-time oscilloscope. The experiment was performed as follows: first, a clock frequency near the target one was fixed; then PRBS sequences of different lengths were transmitted and the Media Converter was restarted. If the system was able to lock the clock and operate error free, the frequency was considered to be within the holding window, and the jitter was measured directly from the eye diagram on the oscilloscope.

Fig. 11  Experimental Set Up for validating the Fully Engineered 1Gbit/s Media Converter

Table I lists the results obtained; in particular, it should be noticed that the line rate of the system is 1.0991 Gbps, which corresponds to a symbol period of 0.91 ns.

<table>
<thead>
<tr>
<th>PRBS Length</th>
<th>Holding Window Range [kHz]</th>
<th>Jitter RMS [ps]</th>
<th>Jitter Peak to Peak [ps]</th>
</tr>
</thead>
<tbody>
<tr>
<td>$2^7 - 1$</td>
<td>320</td>
<td>23</td>
<td>139</td>
</tr>
<tr>
<td>$2^{13} - 1$</td>
<td>320</td>
<td>27</td>
<td>166</td>
</tr>
<tr>
<td>$2^{15} - 1$</td>
<td>320</td>
<td>27</td>
<td>161</td>
</tr>
<tr>
<td>$2^{10} - 1$</td>
<td>320</td>
<td>28</td>
<td>163</td>
</tr>
<tr>
<td>$2^{23} - 1$</td>
<td>320</td>
<td>29</td>
<td>166</td>
</tr>
</tbody>
</table>
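To put the jitter figures of Table I in perspective, they can be expressed as a fraction of the symbol period (unit interval, UI). The short sketch below performs that conversion for the 1.0991 Gbps line rate; the function names are hypothetical.

```python
LINE_RATE_BPS = 1.0991e9  # line rate reported for the Media Converter


def symbol_period_ps(line_rate_bps):
    """Symbol period in picoseconds for a binary (2-PAM) line rate."""
    return 1e12 / line_rate_bps


def jitter_in_ui(jitter_ps, line_rate_bps):
    """Express a jitter value in ps as a fraction of the symbol period."""
    return jitter_ps / symbol_period_ps(line_rate_bps)
```

With a symbol period of about 909.8 ps, the largest RMS jitter in the table (29 ps) is roughly 3.2% of a UI and the largest peak-to-peak value (166 ps) roughly 18% of a UI.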

Once the holding window was experimentally delimited, the convergence time was measured. To this end, a flag signal was generated inside the FPGA; when the error estimated by the M&M TED was bounded within certain values that indicated a state of convergence, the flag was enabled. The resulting curve obtained with a PRBS of length $2^{23} - 1$ is shown in Fig. 10, where it is evident that the convergence time is directly proportional to the frequency deviation. The fact that the frequencies are negative is just a matter of nomenclature, because the reference and starting scanning point of the tracking algorithm is set to the far right limit of the holding window. Moreover, these results show that the convergence time for the chosen operating frequency is 55 ms. It should be noticed that this time can be reduced either by moving the operating frequency towards the starting scanning point or by starting the scan in the vicinity of the operating frequency.
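A convergence flag of the kind described above can be sketched as a simple bound check on the TED error. The source does not give the bounds used, so `threshold` is a placeholder, and requiring the error to stay bounded for `hold_count` consecutive samples is an added assumption meant to avoid flagging isolated small errors.

```python
def lock_flag(errors, threshold, hold_count):
    """Return, per sample, whether the loop is considered converged.

    The flag is raised once |error| <= threshold has held for at least
    hold_count consecutive samples, and drops as soon as it is violated.
    """
    flags = []
    run_length = 0
    for error in errors:
        run_length = run_length + 1 if abs(error) <= threshold else 0
        flags.append(run_length >= hold_count)
    return flags
```

The convergence time is then the time from the restart to the first sample at which the flag is raised.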

**Fully Engineered 1Gbit/s Media Converter**

A final test was performed to validate the operation of the fully engineered 1 Gbit/s Media Converter (including the clock recovery system) with 50 meters of PMMA SI-POF. The experimental setup is shown in Fig. 11. The test consisted of the full-duplex transmission of real arbitrary traffic generated by means of an Agilent N2X Router Tester. The Router Tester allowed us to measure the overall delay of the system as less than 30 μs. Moreover, the media converter presented error-free operation for transmission without extra attenuation at the receiver, i.e. with a received optical power of -9.5 dBm. Finally, the curve of the BER as a function of the received optical power is shown in Fig. 12. As seen, the system guarantees a total power margin of 4 dB before FEC, which means that the inclusion of the clock recovery system in the Media Converter does not imply a penalty in terms of optical power margin when compared with the results reported in [3]. This is the most important result of the project.

IV. CONCLUSION

This paper presented the design and implementation of a continuous-mode and non-data aided clock recovery system for a 1 Gbit/s Media Converter for PMMA SI-POF applications. In particular, it has been demonstrated how a hybrid digital-analog PLL based on a modified Müller & Mueller TED is capable of:

- recovering synchronism from a highly distorted and attenuated 2-PAM signal without requiring any sort of pre-equalization and without incurring a penalty in terms of optical power margin.
- achieving synchronism in 55 ms and tracking clock frequency variations while maintaining low jitter operation.

More generally, the results obtained confirm the achievement of a fully engineered 1 Gbit/s Media Converter in full compliance with the IEEE 802.3 Gigabit Ethernet standard and thus, as far as the POLITO-ISMB partnership is concerned, the complete fulfillment of the main objectives of the POF-PLUS EU Project.

ACKNOWLEDGMENT

This work would not have been possible without the help of the other partners of the POF-PLUS project.

REFERENCES


Julio Ramírez graduated in electronics engineering in 2008 at the Instituto Tecnológico de Costa Rica and received a Ph.D. in electronics and telecommunications engineering from the Politecnico di Torino in 2012.

His main research activities are focused on the design and implementation of digital signal processing techniques for optical communication applications on FPGAs.

Antonino Nespola received the M.S. and Ph.D. degrees in electrical engineering from the Politecnico di Torino, in 1995 and 2000, respectively. From 1997 to 1998, he was a Visiting Researcher in the Photonics Laboratory of the University of California Los Angeles. From 1999 to 2003 he was a Member of Technical Staff and R&D Lab Director at Corning, Milan, where he conducted research in high-speed opto-electronics. In 2003, he joined Pirelli Labs, Milan, as a senior researcher. He is currently a Senior Researcher at ISMB. He has published over 40 journal and conference papers, and holds 3 U.S./European patents.

Stefano Straullu graduated in telecommunications engineering in 2005 from the Politecnico di Torino, Turin, Italy, with a thesis on the design, realization and testing of opto-electronic subsystems for packet-switched optical networks, carried out at the PhotonLab of Istituto Superiore Mario Boella of Turin.

In 2006, he joined the Integration Testing team of Motorola Electronics S.p.A. of Turin. Since May 2009, he has been a researcher at the Istituto Superiore Mario Boella, Turin, Italy.

Since early 2012, he has been pursuing a Ph.D. in electronics and communications engineering at the Politecnico di Torino, Italy.

Paolo Savio received the M.S. degree in electrical engineering from the Politecnico di Torino, Torino, Italy, and the University of Illinois at Chicago (following the TOP-UIC exchange program) in 1999.

In 2000 he joined Accent srl, Vimercate (MI), Italy, working on integrated circuit design and verification. From 2004 to 2008, at Fondazione Torino Wireless, he was involved in technology transfer and acceleration activities for SMEs, following the development of innovative prototypes. He is currently with Istituto Superiore Mario Boella.

Silvio Abrate graduated in telecommunications engineering in 1999 at Politecnico di Torino, with a thesis about the distribution of satellite television over an in-building fiber infrastructure.

From 2001 he was with the Optical Networks Division of Alcatel S.p.A., in Vimercate (MI). Since February 2003 he has been a Senior Researcher at Istituto Superiore Mario Boella, with the role of coordinator of the PHOtonic Technologies and Optical Networks Laboratory (PhotonLab) held by the institute in cooperation with Politecnico di Torino. Silvio Abrate is author or co-author of over 40 journal and conference papers, and holds 4 U.S./European patents.

Roberto Gaudino is currently an Assistant Professor at Politecnico di Torino, Italy.

His main research interests are in long-haul DWDM systems, fiber non-linearity, modelling of optical communication systems and the experimental implementation of optical networks. Starting from his previous research on fiber modelling, on new optical modulation formats, such as duo-binary, polarization or phase modulation, and on coherent optical detection, he is currently investigating short-reach optical links using plastic optical fibers. He spent one year in 1997 at the Georgia Institute of Technology, Atlanta, as a visiting researcher. Since 1998, he has been with the team that coordinates the development of the commercial optical system simulation software OptSim (Artis Software Corp., now acquired by RSoft Design). He has consulted for several companies and he is author or co-author of more than 80 papers in the field of Optical Fiber Transmission and Optical Networks. He has been the coordinator of the EU FP6-IST STREP project “POF-ALL” and is currently the scientific coordinator of the EU FP7-ICT STREP project “POF-PLUS”.