Timing Synchronization in Coherent Optical Transmission Systems

This is a continuation from the previous tutorial - an introduction to optical beams and resonators.


1. Introduction

A fundamental building block of a modern coherent optical transport system is the timing recovery or timing synchronization circuit.

Recovering the transmitted clock from the received signal is a first step in recovering the data. Only when the receive-side VCO (voltage-controlled oscillator) is phase-locked to the transmit-side VCO can the other DSP functions, such as equalization and carrier recovery, commence.

A typical receiver acquisition sequence will start with locking the receive VCO, followed by blind equalization (such as the constant modulus algorithm, CMA), and then finally carrier phase recovery using equalized data.

After system acquisition, the timing synchronization circuit must operate continuously and robustly as long as a valid signal is present at the input to the receiver.

Any slip of the recovered phase or momentary loss of lock will result in catastrophic data failure and reacquisition of the equalizer and other control loops that depend on synchronous data.

In a connected network, such an event triggers reframing of multiple downstream nodes, and an enormous amount of data is lost. For this reason, timing synchronization is often designed to be the most robust control loop in the receiver.

Timing synchronization is a topic that has been studied extensively in almost all areas of digital transmission. There is an abundance of literature on topics related to phase detectors and phase lock loops stemming from early developments in voiceband digital modems.

Yet this fundamental building block is studied again and again in every corner of digital transmission, and in every generation of new technology and design. The reason is that there can be many variables that impact the design of timing synchronization, such as

  • The type of signal and the type of distortion that the signal experiences as it goes through the communication channel
  • The amount of noise that is present
  • The sources of jitter and jitter tracking requirements
  • The environment in which it is implemented
  • The spectral occupancy of the signal and the modulation format.

These factors are all important in that they affect not only the design of timing recovery, but also the resulting robustness and jitter performance.

Fiber-optic transmission systems have evolved tremendously since the late 1990s. Modulation formats since then have evolved from intensity modulation and direct detection (IMDD) to QPSK format with differential detection, and then to polarization-multiplexed QPSK (Pol-Mux QPSK) and polarization-diverse coherent detection.

The first commercialization of Pol-Mux QPSK and coherent detection was at 40 Gbit/s data rate and was reported in 2008. Compared with IMDD systems, the modern coherent receiver receives signals of a much different quality.

For instance, the signal received on each polarization is a linear combination of two independently modulated signals. Within each received polarization, the interference from the other polarization is just as strong as the signal itself.

In addition, the signal is also dispersed by a large amount of chromatic dispersion (CD). Because of the paradigm change in the method of modulation and detection, the timing synchronization block is impacted in a fundamental way. Timing synchronization methods developed for IMDD systems cannot be directly applied to modern coherent systems.

The goal of this tutorial is to describe some of the methods and techniques used to recover timing in modern coherent systems. To do so, each section of this tutorial is devoted to a particular aspect of timing recovery.

  • The first section describes the overall system environment.
  • The next section is devoted to jitter penalty and jitter sources in coherent optical systems, with particular emphasis on jitter generated from FM noise of the laser and digital dispersion compensation.
  • The next section describes different types of phase detectors and their jitter performances.
  • As described elsewhere in this tutorial series, propagation in single-mode fiber produces two important distortion effects that require a significant amount of equalization effort, namely chromatic dispersion (CD) and polarization-mode dispersion (PMD). These distortions are also harmful to phase detectors. Their effects and remedies are discussed in the last sections of this tutorial.

Timing recovery is usually implemented as feedback second-order phase-locked loops (PLLs). The principle is the same for all feedback control loops. The receive VCO is assumed to have a tuning port for feedback tuning of the phase. The second-order loop filter is a standard component that does not depend on input signal properties.


2. Overall System Environment

The overall system diagram is shown in Figure 10.1.


Figure 10.1. Overall system diagram showing Tx and Rx VCOs.


On the left side, the Tx digital signal processor (DSP) filters the input data and up-samples the data to a sampling rate determined by the D/A (digital-to-analog) converter.

Four D/A converters drive a pair of I/Q modulators that modulate the in-phase and quadrature components of both the TE and TM polarizations of the light. Note that the D/A converters at the Tx are clocked using a free-running oscillator (Tx VCO).

The VCO is typically very high frequency (e.g., 16 GHz) in order to meet tight jitter requirements dictated by the high sampling rate of the D/A and A/D converters (e.g., 64 GSamples/s, and 90 GSamples/s).

Since the digital circuits implemented in the Tx DSP can only operate at hundreds of MHz (e.g., 500 MHz), the VCO output is divided down by an integer ratio \(N\) (e.g., 16 GHz/32 = 500 MHz) to clock all the digital circuits.

Essentially, the RF signal output of the D/A converter and hence the optical signal is modulated at a symbol rate (a.k.a. baud rate) that is synchronous with the frequency of the Tx VCO.

The symbol rate is exactly defined by the frequency of the Tx VCO scaled by a constant integer factor. At 32G symbol rate and 16 GHz VCO, that integer factor is 2. Therefore, any jitter or frequency variation of the Tx VCO is directly modulated onto the light going out of the transmitter as baseband delay.

Instantaneously, if the Tx VCO phase is advanced or retarded by 1 ps, then the modulated envelope carried by the 192 THz optical carrier is also advanced or retarded by 1 ps.

The characteristics of this time-varying phase wander of the VCO will be discussed in more detail in the next section.

Once down converted to baseband by a coherent receiver, the advancement or retardation of the VCO manifests as a positive or negative delay on the baseband electrical signal. If the delay is significantly large compared with the symbol period, then significant distortion results at the decision device and system performance will degrade.

The function of the clock recovery in the Rx side is to track out this phase variation of the Tx VCO by high-speed tuning of the Rx VCO. However, the remaining jitter still has system impact, and this is also discussed in detail in the next section.

On the Rx side, a local oscillator (LO) beats with the input signal to produce in-phase and quadrature components of the TE and TM polarizations of the received light.

These four components are sampled by four A/D (analog-to-digital) converters. They feed the sampled data to an Rx DSP circuit that will filter the input signal to perform functions such as equalization and carrier recovery.

Somewhere in the signal processing chain, a phase detector uses the sampled data and produces a digital output that is proportional to the delay of the input signal.

The delay or timing error feeds a loop filter that typically implements a second-order PLL. The loop filter output drives the tuning port of the Rx VCO.

Positive and negative changes to the input of the Rx VCO directly translate to advancement or retardation of the sampling phase of the A/D converters and that affects the delay of the digital signal input to the Rx DSP. This forms a feedback loop.
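The feedback loop just described can be sketched as a small discrete-time model. This is only an illustration under stated assumptions, not the receiver's actual implementation: the detector is taken to be sinusoidal with a period of one UI, the loop filter is a proportional-plus-integral filter (which makes the loop second order), the Rx VCO is modeled as a phase accumulator, and the gains `kp` and `ki` are hypothetical values chosen only so that the loop is stable.

```python
import math

def run_timing_loop(freq_offset_ui, kp=0.05, ki=0.001, steps=4000):
    """Track a Tx/Rx clock offset with a second-order loop (illustrative).

    freq_offset_ui: drift of the Tx clock phase per loop update, in UI.
    kp, ki: assumed proportional and integral gains of the loop filter.
    Returns the timing error in UI seen at the last loop update.
    """
    tx_phase = 0.25     # arbitrary initial Tx phase offset, in UI
    rx_phase = 0.0      # Rx VCO phase, moved through its tuning port
    integrator = 0.0    # integral path of the loop filter
    err = tx_phase - rx_phase
    for _ in range(steps):
        tx_phase += freq_offset_ui
        err = tx_phase - rx_phase
        # Assumed sinusoidal detector characteristic, period of one UI,
        # scaled so that its slope at lock is 1.
        detector = math.sin(2.0 * math.pi * err) / (2.0 * math.pi)
        integrator += ki * detector
        # Loop filter output advances or retards the Rx sampling phase.
        rx_phase += kp * detector + integrator
    return err

residual = run_timing_loop(freq_offset_ui=1e-4)
print(f"residual timing error: {residual:.2e} UI")
```

Because of the integral path, the loop settles with essentially zero timing error even when the Tx and Rx clocks differ in frequency; a loop with only the proportional path would settle with a constant offset instead.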

In previous generations of IMDD transmission systems where the signal arrives in the receiver with minimal distortion, the phase detector and clock recovery are implemented as analog components.

In contrast, a modern coherent receiver uses A/D converters and digital signal processing, so the phase detector and loop filter components are all digital implementations.


3. Jitter Penalty and Jitter Sources in a Coherent System

3.1. VCO Jitter

As in most digital transmission systems, the first and foremost source of jitter is the VCOs themselves.


Figure 10.2. Rx VCO clock generation.


The clocking architecture shown in Figures 10.1 and 10.2 is representative of a state-of-the-art transmission system at the 100G rate. The Rx VCO component in Figure 10.1 is actually the PLL shown in Figure 10.2.

The internal VCO is designed to naturally oscillate at 16 GHz; however, temperature and CMOS process variations cause the VCO to power up with very large frequency error. This frequency error can be as large as ±10%, making clock recovery very difficult. The solution is to provide an external crystal oscillator as a stable reference.

In Figure 10.2, the crystal reference is at 2 GHz shown on the bottom left as an input signal. The 16G VCO is always phase-locked to the 2 GHz reference using a standard phase detector and loop filters. A similar PLL is also used in the transmitter to generate a stable 16 GHz clock.

Since each crystal reference in the Tx and Rx is very stable within tens of PPM over life, locking the Rx VCO to the Tx VCO requires a pull-in range of only tens of PPM. This greatly simplifies the task of clock phase synchronization.

Typically, a tuning port is provided, so that the VCO can be tuned away from its lock point by the control of the phase detector and loop filter components inside the Rx DSP. The VCO, PLL, and A/D converters are analog components that are built inside a coherent receiver ASIC.

The jitter of the Rx VCO is derived from the phase noise spectrum of the Rx VCO. The spectral purity of the Rx VCO can be defined as the phase noise spectrum of the 16 GHz output clock measured in a spectrum analyzer, while the VCO is phase-locked to the 2 GHz reference, and with no tuning from the DSP.

The 16G output clock directly drives the sampling aperture of the A/D converter and, therefore, directly impacts system performance.

Let the VCO center frequency be \(f_0\), and let the phase noise of the VCO be modeled as a single sinusoidal phase modulation \(\phi(t)\) of amplitude \(\alpha\) (in radians). The voltage of the VCO can then be expressed as \(V(t)\).


If the phase modulation \(\phi(t)\) is small, then \(V(t)\) can be simplified using \(\cos(\phi(t))\approx1\) and \(\sin(\phi(t))\approx\phi(t)\). Using trigonometric identities, \(V(t)\) can then be written as three terms that correspond to three spectral lines.
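Written out under this single-tone model, and taking \(f_m\) to denote the modulation frequency, the expansion would plausibly read as follows. This is a reconstruction from the surrounding definitions (it is consistent with Equation 10.4), not a verbatim copy of the original equations.

\[\begin{aligned}
V(t)&=\cos\bigl(2\pi f_0 t+\phi(t)\bigr),\qquad \phi(t)=\alpha\sin(2\pi f_m t)\\
&\approx\cos(2\pi f_0 t)-\phi(t)\sin(2\pi f_0 t)\\
&=\cos(2\pi f_0 t)+\frac{\alpha}{2}\cos\bigl(2\pi(f_0+f_m)t\bigr)-\frac{\alpha}{2}\cos\bigl(2\pi(f_0-f_m)t\bigr)
\end{aligned}\]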


Spectrally, the phase modulation generates two tones at \(\pm{f_m}\) from the center frequency \(f_0\). When integrated in a spectrum analyzer, the two side tones sum into a single tone at \(f_m\) away from center frequency \(f_0\).

The ratio of the modulation at \(f_m\) compared with the carrier at \(f_0\) is defined as the carrier to modulation ratio expressed in dBC, which is commonly known as the single-sided phase noise spectrum.

This quantity is shown in Equation 10.3 by summing the powers of the two modulation tones divided by the power of the carrier.
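Assuming the small-angle expansion yields two side tones of amplitude \(\alpha/2\) around a unit-amplitude carrier, the ratio described here would take the following form, which agrees with Equation 10.4:

\[\text{dBC}=10\log_{10}\!\left(\frac{(\alpha/2)^2+(\alpha/2)^2}{1^2}\right)=10\log_{10}\!\left(\frac{\alpha^2}{2}\right)\]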


The phase modulation in units of radians RMS can be expressed as a function of dBC.

\[\tag{10.4}\text{Jitter}[\text{radians RMS}]=\alpha/\sqrt{2}=\sqrt{10^{\text{dBC}/10}}\]

The absolute jitter in picosecond RMS can be defined in terms of \(\alpha\), and by scaling of the center frequency \(f_0\).

\[\tag{10.5}\text{Jitter}[\text{ps RMS}]=\frac{\sqrt{10^{\text{dBC}/10}}}{2\cdot\pi\cdot{f_0}}\times10^{12}\]

An illustrative phase noise spectrum of a VCO is shown in Figure 10.3 in units of dBC/Hz versus frequency offset from center frequency \(f_0\). To calculate the total jitter between 10 and 100 MHz, the above-mentioned equation is used with the phase noise density at \(-120\) dBC/Hz integrated over 90 MHz of frequency using center frequency \(f_0\) of 16 GHz. This gives 0.0944 ps RMS of jitter in that part of the spectrum.
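The arithmetic of this example can be checked with a short script. It is a minimal sketch that assumes the phase noise density is flat at \(-120\) dBC/Hz across the whole 10 to 100 MHz band; the function name is illustrative only.

```python
import math

def integrated_jitter_ps(density_dbc_hz, bandwidth_hz, f0_hz):
    """RMS jitter in ps for a flat phase-noise density over a bandwidth."""
    # Integrating a flat density adds 10*log10(bandwidth) to the dBC/Hz value.
    total_dbc = density_dbc_hz + 10.0 * math.log10(bandwidth_hz)
    jitter_rad = math.sqrt(10.0 ** (total_dbc / 10.0))    # Equation 10.4
    return jitter_rad / (2.0 * math.pi * f0_hz) * 1e12    # Equation 10.5

jitter = integrated_jitter_ps(-120.0, 90e6, 16e9)
print(f"{jitter:.4f} ps RMS")   # prints 0.0944 ps RMS
```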


Figure 10.3. Rx VCO phase noise spectrum.


Overlaid on Figure 10.3 is a dotted line that represents the high-pass filtering function of a clock recovery loop operating at about 1 MHz loop bandwidth (BW). The clock recovery loop essentially operates as a high-pass filter, rejecting phase noise at frequencies below the loop BW.

The VCO phase noise spectrum integrated over the first-order high-pass function represents the untracked jitter, and this directly impacts system performance.

In order to minimize this jitter source, the clock recovery loop bandwidth needs to be as high as possible.

Aside from the implementation difficulties associated with a very high loop BW, the power of the phase detector jitter (or self-noise) increases linearly with loop BW, which limits the loop BW to a range of a few megahertz.

Phase detectors and the jitter that they produce are discussed in a later section. The overall minimum jitter and the optimum loop BW are achieved when the untracked VCO jitter balances the detector jitter.
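The balance between the two jitter sources can be illustrated numerically. The sketch below rests on assumptions that do not come from Figure 10.3: the VCO phase noise is taken to follow a 1/f² slope passing through \(-120\) dBC/Hz at a 10 MHz offset, the loop acts as a first-order high-pass filter, and the detector contributes a hypothetical 0.1 ps RMS at 1 MHz loop BW. Only the trend matters, not the numbers.

```python
import math

F0_HZ = 16e9  # VCO center frequency used in the text

def untracked_vco_jitter_ps(bw_hz, density_dbc_hz=-120.0, f_ref_hz=10e6):
    """Untracked jitter for an ASSUMED 1/f^2 phase-noise profile.

    The profile equals density_dbc_hz at offset f_ref_hz and is shaped by a
    first-order high-pass f^2 / (f^2 + bw^2). Integrating over all offsets
    gives the closed form s_ref * f_ref^2 * pi / (2 * bw).
    """
    s_ref = 10.0 ** (density_dbc_hz / 10.0)
    var_rad2 = s_ref * f_ref_hz ** 2 * math.pi / (2.0 * bw_hz)
    return math.sqrt(var_rad2) / (2.0 * math.pi * F0_HZ) * 1e12

def detector_jitter_ps(bw_hz, jitter_at_1mhz_ps=0.1):
    """Detector jitter whose power grows linearly with loop BW (assumed value)."""
    return jitter_at_1mhz_ps * math.sqrt(bw_hz / 1e6)

def total_jitter_ps(bw_hz):
    # Independent sources add in power.
    return math.hypot(untracked_vco_jitter_ps(bw_hz), detector_jitter_ps(bw_hz))

# Scan loop bandwidths from 100 kHz to 10 MHz on a logarithmic grid.
candidates = [10.0 ** (5.0 + i / 200.0) for i in range(401)]
best_bw = min(candidates, key=total_jitter_ps)
print(f"best loop BW ~ {best_bw / 1e6:.2f} MHz")
print(f"VCO {untracked_vco_jitter_ps(best_bw):.3f} ps, "
      f"detector {detector_jitter_ps(best_bw):.3f} ps")
```

Under these assumptions the two contributions come out nearly equal at the bandwidth that minimizes the total, which is the balance condition described above.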

One should note that in practice, the operating point may not be at the optimum due to other reasons such as round-trip loop latency that may limit the loop BW on the high side, and specific phase transient tolerance may limit the loop BW on the low side.


3.2 Detector Jitter Definitions and Method of Numerical Evaluation

The performance of a phase detector is directly defined by the jitter that it produces. The source of this noise is inherent in the data itself.

The modulated data itself is random, with phase error imparted on the modulated signal. The detector must estimate the phase error in the presence of modulated data, which produces estimation noise.

Theoretical analysis and derivation of detector noise in some detectors can be done resulting in analytical solutions, but in more complicated detectors, or with signals that have distortion, analytical methods become very difficult.

Often one resorts to simulation to numerically evaluate the amount of noise the detector produces. This section is devoted to the methodology used in comparing two different types of phase detectors.

The detector output at a given delay error “\(\tau_\text{err}\)” can be written as shown in Equation 10.6, where the symbol time interval is denoted as “\(\tau_\text{UI}\)”, detector strength is denoted as “\(K_d\)”, and detector noise “\(n[i]\)” can be modeled as additive white Gaussian with zero mean and variance \(\sigma^2\), and “\(i\)” denotes time index.

The time between successive evaluations of the phase detector output is the inverse of phase detector rate “\(R_\text{PD}\).” Here, we assume a detector having sinusoidal output with period equal to one symbol time interval. For 32 Gbaud symbol rate, \(\tau_\text{UI}\) is 31.3 ps. In some other detectors, the detector period is half of \(\tau_\text{UI}\), in which case it needs to be factored into calculation.
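From these definitions (a sinusoidal characteristic with period \(\tau_\text{UI}\), strength \(K_d\), and additive noise \(n[i]\)), the detector output would presumably take the following form. It is reconstructed from the description rather than copied from the original equation.

\[P[i]=K_d\sin\!\left(2\pi\frac{\tau_\text{err}}{\tau_\text{UI}}\right)+n[i]\]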


The expected value of \(P\) is plotted on the vertical axis in Figure 10.4.


Figure 10.4. Typical detector sensitivity curve.


What is relevant to the detector performance is the ratio of the power of the detector noise \(\langle{n^2}\rangle=\sigma^2\) compared with the detector strength \(K_d\). Furthermore, to calculate how much detector jitter will be present with closed-loop BW (BW\(_\text{Loop}\)) of 1 MHz, one must also factor in the rate at which the random variable \(P[i]\) is generated (\(R_\text{PD}\)).

The amount of detector jitter present when the PLL is in closed loop can be approximated as follows:

\[\tag{10.7}\text{Jitter}[\text{ps RMS}]=\frac{\sigma}{K_d''}\sqrt{\frac{\text{BW}_\text{Loop}}{R_\text{PD}}}\]

\(K_d''\) is defined as the slope of the sine function in Figure 10.4 at \(\tau_\text{err}=0\), which is the lock point of the PLL. Taking the derivative of the sine function, \(K_d''\) can be defined as a function of \(K_d\) and \(\tau_\text{UI}\) [ps] in Equation 10.8.


Equation 10.7 shows that when the loop BW is reduced by a factor of 2, the detector jitter standard deviation is reduced by a factor of \(\sqrt{2}\).

When comparing different detectors, the jitter value must be normalized to a common closed-loop BW by taking into account the rate of the phase detector.

For example, frequency-domain phase detector output is generated every FFT clock cycle, which is in hundreds of megahertz, whereas a time-domain phase detector output can be generated every symbol, and so \(R_\text{PD}\) are very different between the two.

Therefore, it is important to normalize the jitter to a common closed-loop BW, for example 1 MHz.
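The normalization can be made concrete with a short calculation based on Equation 10.7. The sketch assumes the detector characteristic is \(K_d\sin(2\pi\tau_\text{err}/\tau_\text{UI})\), so that the slope at lock is \(2\pi K_d/\tau_\text{UI}\). The values of \(\sigma\) and \(K_d\) and the two detector rates are hypothetical and chosen only to show the effect of \(R_\text{PD}\).

```python
import math

def detector_jitter_ps(sigma, k_d, tau_ui_ps, bw_loop_hz, r_pd_hz):
    """Closed-loop detector jitter in ps RMS, following Equation 10.7."""
    # Slope at the lock point of the assumed sinusoidal characteristic.
    slope_per_ps = 2.0 * math.pi * k_d / tau_ui_ps
    return sigma / slope_per_ps * math.sqrt(bw_loop_hz / r_pd_hz)

TAU_UI_PS = 1e12 / 32e9   # symbol interval at 32 Gbaud, in ps
BW_LOOP_HZ = 1e6          # common closed-loop BW used for comparison

# Two hypothetical detectors with the same sigma / K_d but different rates:
# one output per symbol versus one output per 500 MHz FFT clock cycle.
time_domain = detector_jitter_ps(0.5, 1.0, TAU_UI_PS, BW_LOOP_HZ, 32e9)
freq_domain = detector_jitter_ps(0.5, 1.0, TAU_UI_PS, BW_LOOP_HZ, 500e6)
print(f"time-domain: {time_domain:.4f} ps RMS")
print(f"freq-domain: {freq_domain:.4f} ps RMS")
```

With identical per-output noise, the detector that produces outputs 64 times less often shows 8 times more closed-loop jitter, which is why raw detector noise figures cannot be compared without accounting for \(R_\text{PD}\).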


3.3. Laser FM Noise- and Dispersion-Induced Jitter

In a coherent receiver where a large amount of chromatic dispersion (CD) is compensated digitally, there is an additional timing jitter source that is unique to the coherent receiver and did not exist in traditional direct-detection systems. It is related to the convolution of FM noise of the local laser (LO) and the amount of CD compensated in the digital receiver.

Consider the system diagram in Figure 10.5, where the Rx (LO) laser is offset from the Tx laser by an amount of frequency \(\omega_0\).

Dispersion in the fiber is perfectly compensated by the digital dispersion filter in the receiver. In a process that is discussed later, this produces a measurable delay on the signal at the output of the dispersion compensation filter.

Note that if \(\omega_0\) is absent from the model, then the digital compensation is perfect. However, a laser always has instantaneous frequency variations characterized by a linewidth parameter as well as low-frequency FM noise. “\(\omega_0\)” is in general a stochastic quantity unknown at the receiver.

The presence of Rx laser phase noise convolved with digital dispersion compensation produces a well-documented noise source. The effect results in bit error rate degradation in a coherent transmission system. Here we consider another aspect of the same problem, reflected as a new stochastic jitter source for coherent systems.


Figure 10.5. System diagram illustrating Rx laser and dispersion compensation.


Consider the frequency-domain representation of the received digital signal after A/D converter as written in the following equation. The transmitted signal is represented as \(H(\omega)\). The received signal contains a static LO frequency offset (\(\omega_0\)). The fiber dispersion is a group delay distortion and is modeled as multiplication in frequency domain by a parabolic phase with coefficient “\(\kappa\).”
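Using the "received signal" factor that appears in Equation 10.11, the received spectrum described here would read as follows, where the name \(R(\omega)\) is introduced only for illustration:

\[R(\omega)=H(\omega-\omega_0)\,e^{j\kappa(\omega-\omega_0)^2}\]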


The coefficient “\(\kappa\)” is related to the fiber dispersion coefficient \(D\) (ps/nm/km), fiber distance \(L\) (km), wavelength of the light \(\lambda\) (nm), and speed of light \(c\) (m/s) according to the following:


The carrier recovery function in the receiver can extract instantaneous frequency that is imparted on the signal, but it cannot separate the Rx laser contribution from the Tx laser contribution.

According to Figure 10.5, Rx laser frequency variations are the source of the problem. Therefore, practically, the only choice for the receiver is to compensate for dispersion assuming a frequency offset of zero.

The following equation describes the signal after dispersion compensation.

\[\tag{10.11}\underbrace{H(\omega-\omega_0)e^{j\kappa(\omega-\omega_0)^2}}_{\text{Received signal}}\cdot\underbrace{e^{-j\kappa(\omega)^2}}_{\text{Dispersion compensation}}=H(\omega-\omega_0)\cdot\underbrace{e^{-j(2\kappa\omega_0)\omega}}_{\text{Induced delay or jitter}}\cdot{e}^{j\kappa\omega_0^2}\]

The group delay distortion is removed, but the resultant signal retains three terms: a frequency shift of \(\omega_0\) (to be compensated by carrier recovery), a phase offset \(\kappa\omega_0^2\) (also compensated by carrier recovery), and a residual delay that is a linear function of both the frequency offset and the dispersion (\(\tau=2\kappa\omega_0\)).

As the frequency of the LO (centered at 193.4 THz) varies slowly, this delay also varies with time. At 50,000 ps/nm of compensation, the conversion from FM noise to timing jitter has an efficiency of ∼0.4 ps/MHz. If the frequency of the LO varies by up to ±50 MHz, that translates to about ±20 ps of jitter (0.4 ps/MHz × 50 MHz). If untracked, this jitter is detrimental to a 32 Gbaud system.
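The quoted conversion efficiency can be reproduced from \(\tau=2\kappa\omega_0\). The sketch below assumes the usual SI relation \(\kappa=DL\lambda^2/(4\pi c)\) between the parabolic phase coefficient and the accumulated dispersion \(DL\), with the sign convention ignored; the function names are illustrative.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light

def kappa_s2(dispersion_ps_per_nm, carrier_hz=193.4e12):
    """Parabolic phase coefficient in s^2 for an accumulated dispersion.

    Assumes kappa = D*L*lambda^2 / (4*pi*c) in SI units (sign ignored).
    """
    wavelength_m = C_M_PER_S / carrier_hz
    dl_s_per_m = dispersion_ps_per_nm * 1e-12 / 1e-9   # ps/nm -> s/m
    return dl_s_per_m * wavelength_m ** 2 / (4.0 * math.pi * C_M_PER_S)

def induced_delay_ps(dispersion_ps_per_nm, lo_offset_hz):
    """Delay tau = 2 * kappa * omega_0 caused by an LO frequency offset."""
    omega0 = 2.0 * math.pi * lo_offset_hz
    return 2.0 * kappa_s2(dispersion_ps_per_nm) * omega0 * 1e12

jce0 = induced_delay_ps(50_000.0, 1e6)   # delay per 1 MHz of LO offset
print(f"{jce0:.2f} ps/MHz")              # prints 0.40 ps/MHz
```

The same number follows from multiplying the accumulated dispersion in ps/nm by the wavelength shift that corresponds to a 1 MHz frequency shift near 1550 nm.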

One should note that this is a low-frequency effect. If the LO FM modulation rate is fast compared to the time-span of the CD compensation filter, then the phase fluctuation of the input data is convolved into noise seen at the output of the CD filter, and this causes a well-documented system penalty.

If the FM modulation is slow (quasi-static) compared with the time-span of the CD filter, then timing jitter is produced. Once tracked out by the clock recovery circuit, this causes no system penalty.

For a 32 Gbaud transmission rate, the time-span of a CD compensation filter may be on the order of 10 ns, and thus for FM modulation rates significantly slower than 100 MHz, timing jitter can be produced.

While jitter below 1 MHz can be efficiently tracked out by the clock recovery loop, FM rates between 1 and 100 MHz form a sensitive region, where the LO laser FM noise must be carefully controlled.

The conversion of FM noise to timing jitter has been studied in the literature. Recasting those results, a jitter conversion efficiency (“JCE”) can be defined as the amount of jitter induced in units of picoseconds per MHz of Rx laser frequency excursion (ps/MHz).

The JCE is a function of FM modulation rate “\(f\),” and it can be written as shown in the following equation. A parameter \(\Delta\tau\) is introduced that is inversely related to the cutoff frequency \(f_\text{CUTOFF}\).

The frequency-dependent JCE can be written as a function of the DC conversion efficiency JCE\(_0\), and parameter \(\Delta\tau\). The parameter \(\Delta\tau\) is a function of the baud rate of the data under modulation and dispersion in units of ps/nm. The DC conversion efficiency (JCE\(_0\)) is only a function of the dispersion.


The JCE function is plotted in Figure 10.6 against the Rx laser FM modulation rate for a symbol rate of 32 Gbaud and a dispersion of 50,000 ps/nm.

At frequency components below the cutoff frequency \(f_\text{CUTOFF}\) (∼80 MHz), the jitter conversion is efficient at ∼0.4 ps/MHz.

At higher frequencies, the phase jitter interacts with the dispersion compensation filter in the digital receiver and is converted into an additive noise process.

Overlaid on top in Figure 10.6 are simulated values (squares) using 32 Gbaud binary toggle sequence as the modulated data, going through 50,000 ps/nm of dispersion and with sinusoidal FM modulation on the Rx laser.


Figure 10.6. Frequency-dependent jitter conversion efficiency (JCE) at 50,000 ps/nm and 32G symbol rate.


3.4 Coherent System Tolerance to Untracked Jitter

For coherent systems using a fractionally spaced equalizer and clock recovery, the VCO jitter is tracked out mainly by the clock recovery loop, since the clock loop generally has a much wider loop bandwidth than the equalizer.

As a result, the residual untracked jitter is quite fast. For instance, if a 1 MHz clock loop BW is implemented, then the untracked jitter lies at 1 MHz and above. The equalizer tracking speed is quite slow compared to this rate, for a variety of reasons: loop latency, number of taps, ASIC resources, etc.

The effect of the equalizer is negligible in determining system jitter tolerance, and it is ignored in this analysis. Instead we focus on performance impact due to residual untracked jitter from the clock recovery loop.

The tolerance of a system to untracked jitter strongly depends on the symbol rate. In general, 1 ps of jitter affects BER at a 32 Gbaud symbol rate as much as 2 ps of jitter does at 16 Gbaud.

In usual practice, the system BER performance is evaluated against jitter normalized to the symbol period. The symbol period is denoted as "UI" or unit-interval.
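The normalization to UI is a simple scaling by the symbol period. A minimal sketch (the function name is illustrative):

```python
def jitter_in_ui(jitter_ps, symbol_rate_baud):
    """Express jitter as a fraction of the symbol period (unit interval)."""
    ui_ps = 1e12 / symbol_rate_baud
    return jitter_ps / ui_ps

# 1 ps at 32 Gbaud and 2 ps at 16 Gbaud are the same fraction of a symbol.
print(jitter_in_ui(1.0, 32e9))
print(jitter_in_ui(2.0, 16e9))
```

Both cases evaluate to 0.032 UI, which is why results expressed in UI can be compared across symbol rates.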

A system’s tolerance to untracked jitter is then a strong function of the following three parameters:

  1. System pulse shape
  2. FEC coding gain
  3. Modulation format.


Figure 10.7. Three examples of pulse shapes used in optical transmission, each having different jitter tolerance and spectral occupancy.


Figure 10.7 shows three eye diagrams. The one on the left is generated using a raised-cosine filter with roll-off factor \(\alpha=0.01\). Spectrally, this pulse shape generates a signal with near brick-wall power spectral density, with excess bandwidth only 1% more than the minimum Nyquist bandwidth.

The power spectral density is plotted beneath the eye diagram. It is ideal for dense frequency multiplexing of multiple channels in order to achieve high spectral efficiency.

The eye diagram in the middle figure is generated using roll-off factor \(\alpha=1.0\). Spectrally this pulse shape occupies twice the minimum Nyquist bandwidth, and therefore it is spectrally less-efficient.

The eye diagram on the right illustrates a non-return-to-zero (NRZ) pulse shape with practical rise/fall time. Spectrally, the signal has a strong component at 1.5× the symbol rate (around the middle of the first side lobe).

Plotted above these figures are three Gaussian probability density functions of the same variance, representing jitter in each system. It is easy to see that NRZ pulse shape is most tolerant to jitter, while the most spectrally efficient system is the least tolerant.

This is because as the sampling phase is moved away from the optimum point, the sampled signal does not reduce in amplitude in the NRZ pulse shape due to a very wide flat top.

In contrast, for the Nyquist pulse shape, the sampled signal quickly reduces in amplitude, and the signal-to-noise ratio (SNR) falls with it. For Nyquist systems that deliver the highest fiber capacity, jitter and its system impact remain a problem that requires careful design and component specification.
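One way to put numbers on this sensitivity is to sample a raised-cosine pulse train slightly away from the ideal instant and compare the power leaking in from neighboring symbols with the power of the wanted symbol. The metric below is a choice made for illustration and is not the calculation behind Figure 10.7. The 0.1 UI offset and the truncation to ±2000 neighbors are also assumptions.

```python
import math

def raised_cosine(t, alpha):
    """Raised-cosine pulse at time t (in symbol periods), with h(0) = 1."""
    if t == 0.0:
        return 1.0
    sinc = math.sin(math.pi * t) / (math.pi * t)
    denom = 1.0 - (2.0 * alpha * t) ** 2
    if abs(denom) < 1e-9:
        # Limit value at t = +/- 1 / (2 * alpha).
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * alpha * t) / denom

def isi_to_signal_ratio(offset_ui, alpha, span=2000):
    """ISI power over main-cursor power for a sampling offset in UI."""
    main = raised_cosine(offset_ui, alpha)
    isi = sum(raised_cosine(offset_ui + k, alpha) ** 2
              for k in range(-span, span + 1) if k != 0)
    return isi / main ** 2

for alpha in (0.01, 1.0):
    ratio = isi_to_signal_ratio(0.1, alpha)
    print(f"roll-off {alpha}: ISI/signal = {ratio:.4f}")
```

For the same sampling offset, the near brick-wall pulse (\(\alpha=0.01\)) collects far more interference from its slowly decaying tails than the \(\alpha=1.0\) pulse, consistent with the ordering of jitter tolerance described above.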

Historically, in un-amplified transmission systems that pre-date 1995, forward-error-correction (FEC) technology had not made its way into optical transmission. This was largely due to difficulties in implementing complex decoding algorithms at high data rates using CMOS technology at the time.

In un-coded transmission systems, jitter is one of the dominant sources of BER degradation. Since the advent of EDFAs and the widespread adoption of first-, second-, and third-generation FEC, system degradation due to jitter has become an increasingly small portion of the overall degradation budget.

The dominant source of BER degradation is the noise accumulated in the repeatedly amplified fiber transmission line itself, going over tens and hundreds of spans of fiber all over the world. Jitter tolerance of a system depends on the FEC employed in the system.

In modern coherent systems, the use of higher-order modulation formats allows higher spectral efficiency and fiber capacity. However, it comes at the cost of reduced Euclidean distance between neighboring symbols, which reduces the system’s tolerance to noise of all types, including jitter. 64QAM modulation is considerably more sensitive to jitter than QPSK modulation.

Figures 10.8(a) and (b) illustrate the system impact of all three of these effects. QPSK, 16QAM and 64QAM modulations are analyzed using simulation and brute-force error counting.

Two pulse shapes are considered: roll-off factor \(\alpha=0.01\) and \(\alpha=1.0\). Two different FEC schemes are considered: a 7% overhead hard decision FEC with BER threshold of 0.001, and a 20% overhead soft decision FEC with BER threshold of 0.015.

In Figure 10.8(a), QPSK and 16QAM formats are shown where signal-to-noise ratio (SNR, measured in symbol-rate bandwidth) penalty at the FEC threshold is plotted against jitter in units of UI RMS.

The pulse shape with a roll-off factor of 1.0 combined with 20% soft FEC tolerates the most jitter (see black diamond curve). A roll-off factor of 0.01 with 7% hard FEC tolerates the least jitter (see dotted circles).

The same trend is true for 16QAM and 64QAM formats (Figure 10.8(b)). QPSK is the most tolerant format and 64QAM is the most sensitive of the three formats.


Figure 10.8. (a) Jitter tolerance of QPSK and 16QAM formats with different combinations of pulse shapes and FEC schemes. (b) Jitter tolerance of 64QAM formats with different combinations of pulse shapes and FEC schemes. 


4. Digital Phase Detectors

With the use of A/D converters and extensive signal processing for compensation of linear distortions, the phase detector is most naturally implemented in the digital domain inside the Rx DSP as shown in Figure 10.1.

This section focuses only on digital phase detectors using digitally sampled signals. Furthermore, the A/D converters are practically always over-sampling. The sampling rate is always higher than the symbol rate.

Although theoretically it doesn’t have to be, this feature is necessary in order to implement perfect compensation of dispersion and PMD using FIR filter lengths that are within practical bounds.

Therefore, we have at our disposal signals that are over-sampled in the DSP and equalized symbols at the output of the DSP available at the symbol rate. This section reviews two types of phase detectors based on these two types of signals that are readily available inside the DSP.

The first class of phase detector is based on a direct measure of the clock phase using signal components above and below the Nyquist frequency (\(f_b/2\), where \(f_b\) is the symbol rate).

Examples of this type of phase detector include the conventional analog squaring phase detector, Gardner’s phase detector, Godard’s pass-band timing detector, and a frequency-domain phase detector.

In contrast, decision-directed algorithms such as Mueller–Müller and minimum mean-squared error, and probabilistic algorithms such as maximum likelihood, do not use frequency components above Nyquist; they are referred to as the second class of phase detectors.

In the description of the first class of phase detectors, we first introduce a frequency-domain phase detector and show how it is able to detect the clock phase.

Then, we demonstrate that the analog squaring phase detector, Gardner, Godard, and the frequency-domain phase detectors are in essence equivalent to each other, in that they all use frequency components above and below the Nyquist frequency to measure directly the clock phase from the input signal.


4.1 Frequency-Domain Phase Detector

Let the transmitted data symbols (\(a[i]\)) be zero mean, complex, and random. In the case of QPSK, \(a[i]\) is uniformly distributed on an alphabet of four phases (0\(^\circ\), 90\(^\circ\), 180\(^\circ\), 270\(^\circ\)).

After modulation and transmission, the signal is sampled by a set of A/D converters that sample at two samples per symbol. We assume transmission of a periodic sequence \(a[i]\) in order to define signals in the frequency domain.

The principles derived can be directly applied to aperiodic signals for a system that carries live traffic. A/D converters can sample slower than two samples per symbol, but they must sample at a rate faster than one sample per symbol.

The sampled data \(x[n]\) has the length \(N\), and its frequency-domain representation \(X[k]\) can be written as the following equation.

Sampling rate of two samples per symbol is assumed throughout our derivation, and the length of \(a[i]\) must be \(N/2\). \(A[k]\) is the \(N\)-point FFT of time sequence \(a'[m]\), where \(a'[m]\) is 2× up-sampled from \(a[i]\) by zero insertion. Zero insertion can be defined as the following: \(a'[0, 2, 4, …] = a[0, 1, 2, …]\), and \(a'[1, 3, 5, …] = 0\).

The channel response \(H[k]\) contains the clock phase error \(\tau\) (normalized to the symbol period \(1/f_b\)), where \(f_b\) is the system symbol rate.

\(H[k]\) can be modeled by a simple amplitude weighting function \(H'[k]\) with clock phase error.


Because of the up-sampling, \(A[k]\) repeats in spectrum according to sampling theorem. The frequency components of \(A[k]\) between \(f_b/2\) and \(f_b\) (\(N/4\) to \(N/2\)) are the same as those components between \(-f_b/2\) and 0 (\(-N/4\) to 0). The data spectrum is periodic with period \(f_b\) (or \(N/2\)). This important consequence of sampling theorem is pictorially illustrated in Figure 10.9.


Figure 10.9. Illustration of received signal spectrum showing correlated frequency components.
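The spectral periodicity created by zero insertion can be checked numerically. The following is a minimal sketch in standard-library Python (a naive DFT stands in for the FFT; the helper names `dft` and `zero_insert`, the seed, and the choice \(N=32\) are illustrative assumptions, not part of the original text).

```python
import cmath
import random

def dft(x):
    """Naive O(N^2) DFT; adequate for a small illustrative N."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def zero_insert(a):
    """2x up-sampling: a'[0,2,4,...] = a[0,1,2,...] and a'[1,3,5,...] = 0."""
    out = []
    for s in a:
        out.extend([s, 0])
    return out

random.seed(0)
N = 32
a = [random.choice([1, 1j, -1, -1j]) for _ in range(N // 2)]  # random QPSK symbols
A = dft(zero_insert(a))

# Largest deviation from the claimed periodicity A[k] = A[k + N/2]
max_dev = max(abs(A[k] - A[k + N // 2]) for k in range(N // 2))
print(max_dev)  # expected to be at floating-point rounding level
```

The check holds for any symbol sequence, because the odd (zero) samples contribute nothing to the DFT sum.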


Because the frequency components of the sampled data \(A[k]\) are correlated across \(f_b\), there is opportunity to estimate the phase difference in the received signal \(X[k]\) across \(f_b\) (or across \(N/2\) samples in frequency).

Consider the frequency component at \(k_o\). A clock phase estimate from that single frequency component can be defined as:


where ⟨⋅⟩ denotes time averaging, Im{⋅} is the imaginary component, and \(\ast\) denotes complex conjugation. Substituting \(X[k]\) as \(A[k]\) multiplied by \(H[k]\):


Recognizing that \(A[k_o]\) is equal to \(A[k_o-N/2]\) by the sampling theorem, and substituting for \(H[k]\) the amplitude weighting function \(H'[k]\) together with the clock phase error \(\tau\) (normalized to the symbol period):


The derivation shows that Equation 10.14 produces a phase estimate that is proportional to the sine of the clock phase error. The proportionality constant is related to the magnitude of the channel response \(H'[k]\) at frequencies \(k_o\) and \(k_o-N/2\).

It is also important that the transmitted random data \(A[k]\) at frequency component \(k_o\) should have non-zero power. This is usually guaranteed since the modulated data \(a[i]\) is assumed random and has white power spectral density.

In order to reduce the detector noise, all frequency components can be used. Here the expectation is dropped, and the averaging is done through summation over all \(N/2\) frequency-domain samples.
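The summation over bin pairs can be sketched in a few lines. This is an illustrative model only: it assumes a noise-free periodic QPSK sequence, a full raised-cosine amplitude weighting (\(\alpha=1\)), a delay phase of \(e^{-j4\pi k\tau/N}\) applied with \(k\) interpreted as a signed bin index, and hypothetical helper names. The overall sign of the output depends on these conventions, so only the \(\sin(2\pi\tau)\) shape should be read from it.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(N^2) DFT; adequate for a small illustrative N."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def received_spectrum(a, tau):
    """X[k] = A[k] H'[k] exp(-j 4 pi k tau / N), with k taken as a signed bin index."""
    n = 2 * len(a)
    up = [0] * n
    up[0::2] = a                                   # zero insertion, two samples per symbol
    A = dft(up)
    X = []
    for k in range(n):
        ks = k if k < n // 2 else k - n            # signed frequency bin
        amp = 0.5 * (1 + math.cos(2 * math.pi * ks / n))  # raised cosine, alpha = 1 (assumed)
        X.append(A[k] * amp * cmath.exp(-4j * math.pi * ks * tau / n))
    return X

def freq_domain_pd(X):
    """Sum over N/2 bin pairs of Im{X[k] X*[k - N/2]}, indices modulo N."""
    n = len(X)
    return sum((X[k] * X[(k - n // 2) % n].conjugate()).imag for k in range(n // 2))

random.seed(1)
a = [random.choice([1, 1j, -1, -1j]) for _ in range(16)]
s_curve = {tau: freq_domain_pd(received_spectrum(a, tau))
           for tau in (-0.25, -0.1, 0.0, 0.1, 0.25)}
for tau, value in s_curve.items():
    print(tau, value)
```

In this noise-free model every bin pair carries the same phase factor, so the output is zero at \(\tau=0\), odd in \(\tau\), and scales as \(\sin(2\pi\tau)\) regardless of the data realization.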


This detector relies on using correlated spectral components around \(\pm{f_b}/2\) to yield a phase estimate. The following figure shows the placement of this phase detector in the Rx DSP.


Figure 10.10. Placement of the frequency-domain phase detector inside Rx DSP.


The phase detector can use frequency-domain samples after an \(N\)-point FFT block. FFT and I-FFT engines are typically used to efficiently compensate for dispersion using overlap-and-add (OLA) or overlap-and-save (OLS) algorithm, and therefore we have readily available frequency-domain samples.

The size of the FFT (“\(N\)”) is related to the dispersion compensation range. Figure 10.10 shows the possibility to use frequency data from both polarizations \(X[k]\) and \(Y[k]\) and sum the results together to reduce detector jitter.

In a later section, we shall show that it is not as simple as directly adding phase detector outputs on both polarizations in this way. With fiber impairment known as Polarization Mode Dispersion (PMD), data from two polarizations must be combined differently.


4.2. Equivalence to the Squaring Phase Detector

The squaring phase detector is used extensively in recovering symbol rate timing wave for signals of NRZ type spectral shaping. In this section, we show that this commonly used phase detector has working principles that are equivalent to the frequency-domain phase detector described in the previous section.


Figure 10.11. The squaring phase detector extracting timing wave for decoding data.


Figure 10.11 illustrates the conventional squaring method that generates the timing wave using an input signal \(x(t)\). The timing wave or the extracted clock is used by the decision device to recover data from the input signal.

As shown in the diagram, the signal is first squared and is then filtered by a narrow-band band-pass-filter (BPF) centered at the symbol rate frequency \(f_b\). The output is a symbol rate timing wave that recovers the timing of the input signal. In the figure, the squaring device is replaced with absolute squared device in anticipation of complex signal \(x(t)\).

To simplify analysis, let’s consider a periodic signal \(x(t)\). If \(X(f)\) is the frequency-domain representation of \(x(t)\), then the frequency-domain representation of \(x(t)\times{x}^*(t)\) can be written in terms of \(X(f)\) using the Fourier multiplication and conjugation properties (\(\otimes\) denotes convolution).


The band-pass filter is a filter with real-valued input signals. The extraction of the \(f_b\) component by the BPF is equivalent to the evaluation of \(\text{Re}\{Y(f=f_b)\}\).

Assuming a very narrow-band filtering, and by using the convolution integral, the output of the BPF can be written as:


Comparing Equation 10.19 to Equation 10.17, we see that they are essentially the same. Equation 10.19 is the continuous-time version of the discrete-time formulation of Equation 10.17.

One difference is that Equation 10.19 extracts cosine of the clock phase error (Re{⋅}), while Equation 10.17 extracts the sine of the clock phase error (Im{⋅}). This difference leads to a lock point difference of 90\(^\circ\) or 1/4 of the symbol period. This can be adjusted by delaying the output timing wave.

Therefore, similar to the frequency-domain phase detector, the commonly used squaring phase detector must also use correlated frequency components above and below the Nyquist frequency.


4.3. Equivalence to Godard’s Maximum Sampled Power Criterion

Let us hypothetically consider a receiver that uses A/D converter sampling at one sample per symbol. Such a receiver will be sensitive to clock phase error. For optimum performance with such a converter, the sampling phase should be chosen to maximize sampled signal power, which then maximizes sampled signal to noise ratio.

A clock phase detector derived using this criterion will also work for a receiver system with sampling rate of two samples per symbol, in which one of the two samples will contain the highest sampled signal power. This criterion was adopted by Godard in his search for an optimum phase detector.

To describe the maximum sampled power criterion, let the transmitted data symbols “\(a[i]\)” be zero mean, complex, and random. After modulation and transmission, the signal is sampled by a set of A/D converters at a sampling rate of two samples per symbol.

Assume the digitally sampled data “\(x[n]\)” is periodic with period \(N\), and its frequency-domain description is \(X[k]\) (\(k = 0, 1, … ,N-1\)). The even samples of \(x[n]\) are \(T\)-spaced sampled data (\(x_e[n]\), \(n = 0, 1,…, N/2-1\)), and its frequency-domain description (\(X_e[k]\), \(k = 0, 1, …, N/2-1\)) can be written in terms of \(X[k]\), where sampling theorem is invoked.


The power of the \(T\)-spaced sampled signal \(x_e[n]\) can be calculated in the frequency domain by Parseval’s theorem (\(\ast\) denotes complex conjugation):


As was discussed earlier, the received signal spectrum \(X[k]\) contains the original zero-inserted, random data sequence spectrum \(A[k]\) multiplied with the channel response \(H[k]\).

\(H[k]\) consists of a real-valued amplitude weighting function \(H'[k]\) and clock phase error \(\tau\). Similar to that mentioned previously, \(\tau\) is the channel delay (or the clock phase error) normalized to the symbol interval (\(1/f_b\)).

The goal is to maximize the sampled power of the signal \(x_e[n]\), with respect to \(\tau\). Therefore, in the product expansion of the above-mentioned equation, the two terms that are not a function of \(\tau\) will be neglected (\(X[k]\times{X}^*[k]\) and \(X[k+N/2]X^*[k+N/2]\)), and the two cross terms will be retained in the analysis that follows:


Substitute \(X[k]=A[k]\times{H'}[k]e^{-j4\pi{k}\tau/N}\) (\(H'[k]\) is assumed real value):


Recalling that \(A[k]\) is periodic with period \(f_b\) (or \(N/2\)), so that \(A[k]=A[k+N/2]\), the equation simplifies to


The equation shows that maximizing the sampled signal power (the even samples) is the same as zeroing the clock phase error \(\tau\). At \(\tau=0\), \(\cos(2\pi\tau)\) is maximized. The four terms in the summation are properties of the input random data sequence and channel magnitude response. Furthermore, zeroing \(\tau\) is the same as driving \(\sin(2\pi\tau)\) to zero. Combining the results of Equations 10.22 and 10.24:


It is then easy to see that the following equation is also true, where the real component of the beating of \(X[k]\) and \(X^*[k + N/2]\) extracts the cosine of the phase error and the imaginary component extracts the sine of the phase error:


Equations 10.17 and 10.26 are essentially the same. This proves that the lock point of the frequency-domain phase detector of Equation 10.17 is optimum in the sense that the sampled data contain the highest sampled signal power and maximize the sampled signal-to-noise ratio. Godard’s phase detector also has the same working principle as the frequency-domain phase detector.
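The split of the even-sample power into self terms and cross terms can be verified numerically. The sketch below assumes an unnormalized forward DFT, so the \(1/(2N)\) scaling is specific to that convention and may differ from the scaling in the original equations. The test signal is arbitrary complex noise, since the identity holds for any sequence.

```python
import cmath
import random

def dft(x):
    """Naive O(N^2) DFT; adequate for a small illustrative N."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

random.seed(2)
N = 32
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
X = dft(x)

# Power of the even (T-spaced) samples, computed directly in the time domain
p_even_time = sum(abs(v) ** 2 for v in x[0::2])

# Same power from the spectrum: the two self terms plus the two cross terms
self_terms = sum(abs(X[k]) ** 2 + abs(X[k + N // 2]) ** 2 for k in range(N // 2))
cross_terms = sum(2 * (X[k] * X[k + N // 2].conjugate()).real for k in range(N // 2))
p_even_freq = (self_terms + cross_terms) / (2 * N)

print(p_even_time, p_even_freq)
```

A pure delay changes only the phase of each \(X[k]\), so the self terms are unaffected and any dependence on \(\tau\) must come from the cross terms, which is why only those are retained in the derivation.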


4.4 Equivalence to Gardner’s Phase Detector

Gardner’s phase detector is also very commonly used. In what follows, the digital Gardner phase detector in the time domain is shown. “\(x[n]\)” (\(n = 0, 1, 2,…, N - 1\)) is the complex discrete-time data sampled at two samples per symbol. “\(x_R[n]\)” and “\(x_I[n]\)” are the real and imaginary components, respectively.


The equation can be written in terms of complex quantity \(x[n]\) and its conjugate:


Similar to earlier derivations, we first define periodic sequences for the transmitted and received data; therefore, we can represent signals easily in the frequency domain.

The frequency-domain representation of the received samples \(x[n]\) is defined as \(X[k]\). \(x[n]\) and \(X[k]\) are periodic with period equal to \(N\). “\(x[2n]\)” is a 2× decimated version of \(x[n]\). It represents the even samples of the received data.

It can be represented in the frequency domain using \(X[k]\) as shown in Equation 10.29. Sampling theorem is invoked, and the double arrow indicates transformation from time domain to frequency domain.


The differencing function in Equation 10.28 \((x[2n - 1] − x[2n + 1])\) can be thought of as two steps. First, the signal \(x[n]\) is filtered by a differencing filter, and then down-sampled by a factor of two. The filtering function is a differencing filter with impulse response \([1\;\;0\;-1]\) . It has magnitude and phase response as shown in Figure 10.12.


Figure 10.12. The differencing filter in Gardner’s phase detector.


We note that in order to derive and show the principle of the phase detection process, the magnitude response of this filter can be ignored. The phase response of \(j\) in the upper-side band and \(-j\) in the lower-side band is important. The magnitude response will affect detector sensitivity. For the sake of simplicity, we approximate this differencing filter with \(H[k]\), where \(H[k]\) has a flat magnitude response and only the phase response is retained.

\[\tag{10.30}H[k]=\begin{cases}+j\quad\text{where }k=0\rightarrow{N/2-1}\\-j\quad\text{where }k=N/2\rightarrow{N-1}\end{cases}\]

Using Equation 10.30 and sampling theorem illustrated in Equation 10.29, the differencing function in Equation 10.28 can be written in the frequency domain through the following derivation:


We note another discrete time Fourier property (assume \(A[k]\) is the FFT of \(a[n]\), and \(B[k]\) the FFT of \(b[n]\)):


Using Equations 10.29, 10.31, and 10.32, we derive the frequency-domain equivalent function of Equation 10.28:


Using \(H[k]\) definition in Equation 10.30, the frequency-domain phase detector can be simplified:


Multiplying out the product terms, and removing the quadrature components, we show that this equation can be equivalently written as


This shows that Gardner’s phase detector, under the simplification of the differencing filter \(H[k]\) (represented by Eq. 10.30), is equivalent to the frequency-domain phase detector introduced earlier in Equation 10.17, and that it also uses correlated frequency components that are outside of the Nyquist bandwidth.
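The equivalence can also be checked without flattening the filter magnitude. In the sketch below, the time-domain Gardner sum over a periodic block is compared with a bin-pair sum in which the differencing-filter magnitude \(\sin(2\pi k/N)\) is kept as a weight. The \(2/N\) factor follows from an unnormalized DFT, and the sign convention of the differencing term may differ from that of Equation 10.30; both are assumptions of this illustration.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(N^2) DFT; adequate for a small illustrative N."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def gardner_time(x):
    """Sum over symbols of Re{(x[2n-1] - x[2n+1]) x*[2n]}, indices modulo N."""
    n = len(x)
    return sum(((x[2 * m - 1] - x[(2 * m + 1) % n]) * x[2 * m].conjugate()).real
               for m in range(n // 2))

def gardner_freq(X):
    """Bin-pair form, keeping the differencing-filter magnitude sin(2 pi k / N) as a weight."""
    n = len(X)
    return (2 / n) * sum(math.sin(2 * math.pi * k / n)
                         * (X[k] * X[k + n // 2].conjugate()).imag
                         for k in range(n // 2))

random.seed(3)
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(32)]
X = dft(x)
print(gardner_time(x), gardner_freq(X))
```

The two values agree to rounding error for any input block. The weight is largest near bin \(N/4\) (half the baud rate), which is consistent with the remark that the filter magnitude affects detector sensitivity rather than the detection principle.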

One caveat is that the digital implementation of Gardner’s phase detector is most easily implemented with A/D sampling rate of two samples per symbol. For systems that sample less than two samples per symbol, the phase detector implementation is not obvious in the time domain, whereas the frequency-domain phase detector can be easily adapted to fractional sampling rates of less than two samples per symbol.


4.5 Second Class of Phase Detectors

The second class of phase detectors uses equalized symbols at the output of the equalizer and carrier recovery. These signals are also readily available inside the Rx DSP. The equalized symbols are represented at one sample per symbol, and therefore no information is available above the Nyquist frequency.

The principle in which they operate is quite different from the first class of phase detectors. Some examples include Mueller–Muller detector, minimum mean-squared error, and maximum-likelihood detector.

Because they derive phase error from equalized symbols, the jitter performance of these detectors is not a function of the distortion in the channel. The distortions such as dispersion and PMD are assumed compensated.

Figure 10.13 shows the placement of this type of phase detector in the Rx DSP. As the diagram shows, this placement adds significant latency compared with the frequency-domain phase detector, and that can impose a lower operating loop bandwidth.


Figure 10.13. Detecting clock phase after equalization using a second class of phase detectors.


Figure 10.14(a) shows an implementation of the Mueller–Muller phase detector. Input sequence \(x_k\) is real valued and is available at one sample per symbol. For complex-valued signals such as QPSK, one can use the circuit for the real and imaginary parts separately and combine the results.


Figure 10.14. Mueller–Muller phase detector.


The decision device makes a decision on \(x_k\) to form \(a_k\). “\(a_k\)” represents the transmitted bits (or symbols), except that it may contain decision errors. Decision errors will degrade the average detector sensitivity, but at meaningful bit error rates of \(10^{-3}\) to \(10^{-2}\), the sensitivity degradation is small.

The two multipliers correlate \(x_k\) with \(a_{k−1}\), and \(x_{k−1}\) with \(a_k\). The time-averaged correlation of \(x_k\) with \(a_{k−1}\) is the same as time-averaged correlation of \(x_{k+1}\) with \(a_k\).

In the absence of decision errors, the correlators calculate the net system impulse response \(h[n]\) for \(n = 1\) and \(n = -1\). The following equation shows this relationship:


Since the phase detector is placed after equalization, the net impulse response without any clock phase error can be assumed white (i.e., fully equalized by the adaptive equalizer), and can be represented with three taps: \(h[n] = [0, 1, 0]\).

Furthermore, if the equalizer is slowly responding, then fast clock phase jitter will not cause any adjustment of the equalizer coefficients, and the clock phase error can be measured after equalizer.

Assuming a static equalizer, the clock phase error is seen as a delay (“\(\tau\)”) on the net impulse response, and will create odd symmetry in \(h[n]\). Figure 10.14(b) illustrates the odd symmetry behavior of \(h[n]\) under positive delay “\(\tau\)” and negative delay “\(\tau\)”.

By measuring the difference between \(h[1]\) and \(h[−1]\) (\(z_k\approx{h[1] − h[−1]}\)), a phase detector can be formed. This is the working principle of the Mueller–Muller phase detector.
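A small simulation illustrates this principle. It is a sketch under stated assumptions: real binary symbols, a net impulse response modeled as a truncated sinc pulse delayed by \(\tau\) (a stand-in for a fully equalized Nyquist channel), no additive noise, and circular indexing. The function name and parameters are hypothetical.

```python
import math
import random

def sinc(t):
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def mm_detector(tau, num_symbols=10000, span=8, seed=3):
    """Return (time-averaged z_k, h[1] - h[-1]) for a clock phase error tau."""
    rng = random.Random(seed)
    a = [rng.choice((-1.0, 1.0)) for _ in range(num_symbols)]
    # Net impulse response at one sample per symbol, delayed by tau symbol periods
    h = {m: sinc(m - tau) for m in range(-span, span + 1)}
    # x_k = sum_m h[m] a_{k-m}, with circular symbol indexing
    x = [sum(h[m] * a[(k - m) % num_symbols] for m in h) for k in range(num_symbols)]
    d = [1.0 if v >= 0 else -1.0 for v in x]          # decision device
    # z_k = x_k a_{k-1} - x_{k-1} a_k, using decisions in place of the true symbols
    z = sum(x[k] * d[k - 1] - x[k - 1] * d[k] for k in range(num_symbols)) / num_symbols
    return z, h[1] - h[-1]

z, ref = mm_detector(0.1)
print(z, ref)
```

The contribution of the main tap \(h[0]\) cancels between the two correlations, so the averaged output settles close to \(h[1]-h[-1]\) and changes sign with the sign of \(\tau\).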


4.6 Jitter Performance of the Phase Detectors

In this section, the jitter performances of the phase detectors described in the previous sections are compared. The modulation format assumed is 32 Gbaud QPSK with raised cosine spectral shaping using roll-off factor \(\alpha\) = 1.0 and \(\alpha\) = 0.01.

The different phase detectors may operate in different rates. The Mueller–Muller and Gardner phase detectors produce one estimate of phase for every input symbol, whereas the frequency-domain phase detector produces one estimate for every FFT cycle.

We compare them by normalizing the detector jitter assuming a common closed-loop BW of 1 MHz. A previous section has details on the methodology of evaluating the detector jitter. Signals from both polarizations are averaged to reduce jitter by 3 dB. The received signal is assumed to have no residual distortions such as CD and PMD.


Figure 10.15.  Jitter performance of phase detectors ((a) \(\alpha\) = 1.0, (b) \(\alpha\) = 0.01).


The jitter performance of the different phase detectors versus OSNR (optical signal-to-noise ratio measured in 0.1 nm) is shown in Figure 10.15. In Figure 10.15(a), the signal pulse shape is raised cosine with roll-off factor \(\alpha\) = 1.0. The signal occupies twice the Nyquist frequency, and there is plenty of signal content above the Nyquist frequency.

The frequency-domain phase detector performs the best. Gardner and squaring phase detectors perform similarly well. The Mueller–Muller detector performs the worst amongst the four due to decision errors.

However, over the entire OSNR range, it is able to achieve jitter <0.01 UI RMS, and this is sufficiently good for QPSK modulation. For higher-order modulation, the Mueller–Muller detector is also sufficiently good, since the operating OSNR is also higher.

In Figure 10.15(b), the pulse shape is changed to one that is most spectrally efficient, using \(\alpha\) = 0.01. The Mueller–Muller detector continues to perform well with near-Nyquist-shaped signals. Its jitter improves compared with \(\alpha\) factor of 1.0 because of the stronger \(h[1]\) and \(h[-1]\) component in the impulse response.

However, the frequency-domain, squaring, and Gardner phase detectors have all suffered greatly. The frequency-domain and squaring detectors are marginal, and the Gardner phase detector is not functioning.

The reason is that all three detectors rely on frequency components of the signal above and below the Nyquist frequency. As the pulse shape reduces from \(\alpha\) = 1.0 to \(\alpha\) = 0.01, the power of the signal components above the Nyquist frequency is reduced dramatically.

At \(\alpha\) = 0.0, which is Nyquist transmission, the first class of phase detector will fail, and an alternative phase detector is needed. This is the subject of the next section.


4.7 Phase Detectors for Nyquist Signals

Nyquist signals are spectrally shaped to fit as closely as practically possible within the minimum Nyquist bandwidth. For 32G symbol rate modulation, the spectral occupancy would ideally be between −16 and 16 GHz, centered on the optical carrier.

This allows dense packing of WDM channels, thereby achieving higher spectral efficiency. In practice, the transmitters in these systems are constructed using a DSP and D/A converters driving a set of I/Q modulators operating in the linear regime.

The DSP is capable of using a large number of tap coefficients to filter the signal to be spectrally as close to Nyquist as possible. One type of digital filter that achieves this goal is the class of raised-cosine filters defined by roll-off factor \(\alpha\).

Although a filter that achieves \(\alpha\) = 0 is not possible to implement, DSP computation power available at the time of this writing allows long enough filters to implement \(\alpha\) factor near 0.01.

As the previous section shows, detecting clock phase using signals of this type is particularly difficult due to lack of signal content above the Nyquist frequency. While the frequency-domain phase detector has a lot of advantages at \(\alpha\) = 1, it struggles at \(\alpha\) = 0.01, and does not function at \(\alpha\) = 0.001.

Another type of phase detector concept has emerged in order to resolve this shortcoming. The key is first realizing that the instantaneous power envelope of the input Nyquist signal can be used as an input to a conventional phase detector such as the Gardner phase detector.

It is shown that by using the familiar Gardner phase detector, and replacing the input complex signal \(x_I[n]+jx_Q[n]\) by the instantaneous power \(P[n] = x_I^2[n] + x_Q^2[n]\), the phase detector sensitivity can be recovered.

Because the input to the Gardner phase detector is a power signal, this phase detector is referred to as the fourth-power time-domain phase detector. The phase detector equation assumes input signal (\(x_I[n]+jx_Q[n]\)) sampled at two samples per symbol, and is shown in the following equation:
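The effect of feeding the power envelope to a Gardner-type detector can be illustrated with a short simulation. It assumes the detector takes the form of the Gardner sum applied to \(P[n]\), uses a truncated sinc pulse as a stand-in for an \(\alpha\approx0\) Nyquist signal, and omits noise; the helper names, seed, and block length are illustrative choices.

```python
import math
import random

def sinc(t):
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def nyquist_samples(symbols, tau, span=16):
    """Two samples per symbol of sum_i a[i] sinc(t - i - tau); symbols indexed circularly."""
    K = len(symbols)
    taps = [[sinc(m + r / 2 - tau) for m in range(-span, span + 1)] for r in (0, 1)]
    return [sum(symbols[(q - m) % K] * taps[r][m + span] for m in range(-span, span + 1))
            for q in range(K) for r in (0, 1)]

def gardner(v):
    """Mean over symbols of Re{(v[2n-1] - v[2n+1]) v*[2n]}; v[-1] wraps around."""
    K = len(v) // 2
    return sum(((v[2 * n - 1] - v[2 * n + 1]) * complex(v[2 * n]).conjugate()).real
               for n in range(K)) / K

random.seed(4)
a = [random.choice([1, 1j, -1, -1j]) for _ in range(4000)]   # QPSK symbols
x = nyquist_samples(a, tau=0.1)

conventional = gardner(x)                          # Gardner on the complex field
fourth_power = gardner([abs(v) ** 2 for v in x])   # Gardner on P[n] = |x[n]|^2
print(conventional, fourth_power)
```

With the field as input, the averaged output should stay close to zero because almost no signal content lies above the Nyquist frequency. With the power envelope as input, the output should respond clearly to the timing offset, since squaring doubles the occupied bandwidth.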


A frequency-domain implementation of this method is highly desirable. The first signal available inside the DSP after dispersion compensation is in the frequency domain as shown in Figures 10.10 and 10.16.


Figure 10.16. Fourth-power phase detector in the frequency domain.


The placement of frequency-domain phase detector directly after dispersion compensation coefficients achieves the lowest feedback latency. Furthermore, A/D converters may or may not be sampled at the convenient two samples per symbol, and with fractional sampling, Equation 10.37 may be difficult to implement without compromising performance.

A frequency-domain implementation of the fourth-power method is possible since the A/D converters capture the signal in its entirety, and therefore mathematical manipulation of the frequency-domain samples can be equivalent to the time-domain description of Equation 10.37.

A frequency-domain implementation can be derived by first calculating the frequency-domain representation of \(P[n]\) by convolving frequency-domain data samples at the output of the dispersion compensation coefficients as shown in Figure 10.16.

The spectral components of \(P[n]\) need to be calculated at frequency bins \(f_b/2\) (upper-side band) and \(-f_b/2\) (lower-side band), and other pairs of frequency bins separated by \(f_b\). This is consistent with the principle of the frequency-domain phase detector described in Equation 10.17.

The received \(X\)-pol signal is first digitized by two ADCs sampled at two samples per symbol, followed by (as an example) a 256-point FFT. We note that the FFT size can be larger or smaller than 256.

It is assumed that distortions imposed on the signal, such as chromatic dispersion, are largely compensated by the frequency-domain coefficient multiplication. The frequency-domain equalized signal \(X[i]\) (\(i = 0, 1,…, 255\)) is supplied to a phase detector that operates on the frequency-domain samples.

As shown in Figure 10.16, its output is further filtered by a loop filter which then drives the VCO tuning port, affecting the timing of the sampled signal. The phase detector algorithm is shown below. “\(\ast\)” denotes complex conjugation, “Im{⋅}” denotes imaginary part, and “\(T\)” means one symbol period.

\[\tag{10.38}\text{PD}=\sum_{k=-32\rightarrow+32}\text{Im}\left\{\left(\sum_{i=0\rightarrow63}\underbrace{X[i]X^*[i+192+k]}_{\text{freq separation}=\frac{1}{2T}-\frac{k}{128T}}\right)\left(\sum_{i=0\rightarrow63}\underbrace{X[i+192]X^*[i+k]}_{\text{freq separation}=\frac{-1}{2T}-\frac{k}{128T}}  \right)^*\right\}\]

As indicated in this equation, there is an outer summation and two inner summations inside the Im{⋅} function. The two inner summations essentially calculate the frequency bins of the power envelope of the input signal.

The time-domain method of Equation 10.37 also uses the power envelope of the input signal, but in the time domain. The frequency separation between the two terms within the first inner summation is \(1/(2T) - k/(128T)\), whereas it is \(-1/(2T)-k/(128T)\) in the second inner summation, and the difference between them is always \(1/T\).

If there is delay in the signal, the multiplication of the two summation terms (with the second term conjugated) averages to a term proportional to \(\exp(j2\pi\tau)\), where \(\tau\) is the delay normalized to \(T\).

The Im{⋅} function then extracts the \(\sin(2\pi\tau)\) portion. The outer summation from \(k=-32\) to \(32\) is to combine all possible terms that average to \(\exp(j2\pi\tau)\), and that reduces jitter. The detector output can be approximated as \(K_d\sin(2\pi\tau)\), where “\(\tau\)” is the timing error (normalized to \(T\)) and \(K_d\) is the detector sensitivity.
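Equation 10.38 translates almost directly into code. The sketch below evaluates it literally for a 256-point spectrum, taking bin indices modulo 256, and applies a delay as a linear phase on signed bin indices. The input is a random stand-in spectrum confined to the Nyquist band rather than a modulated signal, so it only illustrates the structural dependence on \(\tau\), not the jitter performance.

```python
import cmath
import random

N = 256  # FFT size of Equation 10.38; two samples per symbol gives bin spacing 1/(128T)

def fourth_power_pd(X):
    """Direct evaluation of Equation 10.38; bin indices are taken modulo N."""
    total = 0.0
    for k in range(-32, 33):
        upper = sum(X[i] * X[(i + 192 + k) % N].conjugate() for i in range(64))
        lower = sum(X[(i + 192) % N] * X[(i + k) % N].conjugate() for i in range(64))
        total += (upper * lower.conjugate()).imag
    return total

def delay(X, tau):
    """Delay by tau symbol periods: phase -2*pi*k*tau/128 on signed bin index k."""
    out = []
    for k, v in enumerate(X):
        ks = k if k < N // 2 else k - N
        out.append(v * cmath.exp(-2j * cmath.pi * ks * tau / 128))
    return out

random.seed(5)
# Stand-in spectrum: random values on the Nyquist band |f| < 1/(2T), zero elsewhere
X0 = [complex(random.gauss(0, 1), random.gauss(0, 1)) if (k < 64 or k >= 192) else 0j
      for k in range(N)]

for tau in (0.0, 0.1, 0.25, 0.6):
    print(tau, fourth_power_pd(delay(X0, tau)))
```

Because the two inner sums are always separated by \(1/T\), a delay rotates every term of the outer sum by the same angle \(2\pi\tau\). The output therefore varies sinusoidally with \(\tau\) with a period of one symbol, changing sign when \(\tau\) shifts by half a symbol.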

The performance of the phase detector is evaluated numerically for 32G symbol rate QPSK, with the X-pol and Y-pol phase detector outputs averaged and noise loading at OSNR = 10 dB. The results are summarized in Figure 10.17.


Figure 10.17. Comparison of jitter performances of fourth-power phase detector and the conventional Gardner phase detector.


Note on the legend:

  • The RMS jitter is calculated based on a 1-MHz 3-dB loop bandwidth and is normalized to UI RMS (unit interval = symbol interval).
  • Fourth power (time domain) refers to the method of Equation 10.37.
  • Conventional Gardner (time domain) refers to the method of Equation 10.28.
  • Fourth power (frequency domain) refers to the method of Equation 10.38.

The following can be observed:

  • The jitter increases as \(\alpha\) decreases for the conventional Gardner (time domain). Jitter-induced system penalties are too high for \(\alpha\) factors significantly below 0.1.
  • The jitter increases as \(\alpha\) increases above 0.1 for the fourth-power time-domain method of Equation 10.37.
  • The fourth-power frequency-domain approach has better performance than the conventional Gardner for \(\alpha\) < 0.6. Its jitter performance is largely insensitive to \(\alpha\).

The fourth-power frequency-domain method of Equation 10.38 has the best jitter performance, but it is also more complex to implement than the fourth-power time-domain method.

Both approaches are good in solving the problem of phase detection for Nyquist-type signals. The increase of jitter at \(\alpha\) > 0.1 for the fourth-power time-domain method can be improved by prefiltering the signal before phase detection. Channel distortions such as dispersion and PMD can degrade both of these phase detectors, and they are active research topics.


5. The Chromatic Dispersion Problem

With the use of DSP to compensate for chromatic dispersion in the fiber, the signal input to the receiver can be dispersed by large amounts of dispersion. The entire fiber plant can be built without dispersion compensation (or dispersion management), and the receive DSP can be made to equalize 50,000 ps/nm, more than enough for compensating terrestrial long-haul fiber plants of the largest dispersion coefficient.

New fibers with larger CD coefficient (large core area and lower loss) are being planned in the near future with total CD as large as 250,000 ps/nm for transpacific links.

The discussion that follows applies generally to the effect of dispersion on phase detectors. Dispersion affects each signal component differently. Modulated signal components at \(+f_b/2\) and \(-f_b/2\) propagate at slightly different speeds due to dispersion, and when they arrive at the receiver, they can be time-skewed by one symbol period or more, which can severely affect clock phase sensitivity.

Chromatic dispersion can be modeled as an all-pass filter \(H[k]\) on the \(E\)-field with parabolic phase response (or linear group delay). Consistent with previous assumptions, “\(f_b\)” is baud rate, \(N\) is the number of samples in the FFT, “\(k\)” is the frequency-domain index for frequency bins, and sampling rate of two samples per symbol is assumed.

Note that the exponential coefficient “\(\kappa\)” in \(H[k]\) is related to the amount of dispersion, and is not to be confused with the frequency bin index “\(k\).”


In order to understand how dispersion reduces clock phase sensitivity, we first define a function that measures detector strength.

Based on the frequency-domain phase detector derived earlier (see Equation 10.17), the following equation produces output proportional to cosine of the phase error (\(\tau\) normalized to symbol period).

At the lock point, \(\tau\) is zeroed, and this function is maximized. The maximized value is an indication of the phase detector strength. In the PLL theory, detector strength is \(K_d\) and is an important parameter in determining the loop BW.


Similar to the derivation earlier for Equation 10.17, we substitute input signal \(X[k] = A[k]H[k]H'[k]\) into the equation for detector strength. \(A[k]\) is the random data, \(H'[k]\) is an amplitude weighting function, and \(H[k]\) is the parabolic phase response in Equation 10.39. Similar to previous derivation with sampling theorem, \(A[k]\) is periodic with period \(N/2\) and is the same as \(A[k - N/2]\).


Redefining the frequency bin index \(k\) so that the summation is centered at the half-baud-rate frequency (\(N/4\) or \(f_b/2\)), the \(N^2/4\) term can be dropped:


This equation shows that the term inside the cosine function governs the sensitivity of the phase detector. The dispersion coefficient “\(\kappa\)” induces a phase ramp between frequency components in the summation.

This is expected, because dispersion is a linear group delay distortion, and therefore each frequency component experiences a different delay. If the dispersion is large enough that the argument of the cosine function sweeps through roughly \(2\pi\) across the frequency bins \(-N/4\lt{k}\lt{N/4}\), then the terms cancel and the summation approaches zero. Thus, for large dispersion, the clock phase sensitivity can vanish entirely.

Conversely, if the summation range is halved, then it would take a larger \(\kappa\) (or dispersion) to induce a full \(2\pi\) inside the cosine function. Reducing the range of summation can therefore increase the dispersion tolerance of the phase detector.

Figure 10.18 shows normalized detector sensitivity \(K_d\) versus dispersion generated using Equation 10.42. A 1024-point FFT is assumed (\(N=1024\)). The data spectrum \(A[k]\) is assumed to be white.

The amplitude weighting function \(H'[k]\) is assumed to be a full raised cosine function with \(\alpha\) factor of 1. Figure 10.18(a) shows that the phase detector loses sensitivity at 1000 ps/nm of dispersion at 16G symbol rate. At 32G symbol rate, the dispersion tolerance is reduced to one-quarter at 250 ps/nm.

This result is generated using all 1024 frequency bins. Figure 10.18(b) shows the benefit of using only 32 frequency bins instead of 1024 for a 32 Gbaud signal. Sixteen bins in the upper-side band centered at \(+f_b/2\) and 16 bins in the lower-side band centered at \(-f_b/2\) can be used to extend the dispersion tolerance dramatically.

However, reducing the number of frequency components increases the detector noise in that a smaller fraction of the input data is used for averaging. One must account for the performance impact of the increased jitter in this approach. All-in-all, this is an effective way of trading off dispersion tolerance and jitter performance.
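The trade-off can be explored with a small numerical model of the detector strength. The sketch assumes the standard all-pass dispersion response \(\exp(-j\pi\lambda^2 DLf^2/c)\) at a carrier wavelength of 1550 nm, a white data spectrum, and a full raised-cosine weighting. Under those assumptions the weight on each bin pair reduces to \(\cos^2(\pi u)\), where \(u\) is the offset from \(f_b/2\) in units of \(f_b\). The function names are hypothetical.

```python
import math

C = 299792458.0        # speed of light, m/s
WAVELENGTH = 1550e-9   # assumed carrier wavelength, m

def kd(dl_ps_nm, fb, n_fft=1024, half_width=None):
    """Normalized detector strength versus accumulated dispersion D*L (ps/nm)."""
    dl = dl_ps_nm * 1e-3                                  # ps/nm -> s/m
    ramp = 2 * math.pi * WAVELENGTH ** 2 * dl * fb ** 2 / C
    hw = n_fft // 4 if half_width is None else half_width
    num = den = 0.0
    for kp in range(-hw, hw + 1):                         # bin offset from f_b/2
        u = 2 * kp / n_fft                                # offset in units of f_b
        w = math.cos(math.pi * u) ** 2                    # H'[k] H'[k - N/2], alpha = 1
        num += w * math.cos(ramp * u)
        den += w
    return num / den

def first_null_ps_nm(fb):
    """Dispersion at which the full-band sum first vanishes in this model."""
    return 1e3 * 2 * C / (WAVELENGTH ** 2 * fb ** 2)

for fb in (16e9, 32e9):
    null = first_null_ps_nm(fb)
    print(fb, null, kd(null, fb), kd(null, fb, half_width=8))
```

In this model the full-band sensitivity first vanishes near 1000 ps/nm at 16 Gbaud and near 250 ps/nm at 32 Gbaud, in line with the values quoted for Figure 10.18(a), while a narrow window around \(\pm f_b/2\) keeps nearly full strength at the same dispersion.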


Figure 10.18. (a) \(K_d\) versus dispersion at 16G and 32G symbol rates using all 1024 FFT bins. (b) \(K_d\) versus dispersion at 32G symbol rate using 32 or 1024 FFT bins.


An alternative way to compensate for the effect of dispersion in the phase detector of Equation 10.17 is first recognizing that the phase of each frequency component “\(k\)” in the summation is related to the delay of that frequency component.

The phase (or delay) is directly related to the dispersion in the signal. One can compensate for the effect of dispersion by introducing a frequency-dependent phase rotation applying to each of the summation terms as shown in Equation 10.43.

Each frequency bin has frequency separation of \(\Delta{f}\) (Hz). “\(\Delta\theta\)” defines a phase difference between adjacent frequency bins. “\(\Delta\theta\)” is related to the amount of dispersion \((D⋅L)\) in the signal as shown in the following:


Figure 10.19 shows a simulation of system jitter (normalized to a 1 MHz closed-loop bandwidth), sourced from the phase detector alone, for a 32 Gbaud PM-QPSK system with DSP designed to tolerate 50,000 ps/nm.

The FFT size \(N\) is chosen to be 1024. The black circles show the jitter obtained by applying Equation 10.17 with 128 frequency bins centered at \(\pm{f_b/2}\). This detector can tolerate only 400 ps/nm of dispersion.

By using only one frequency bin (black diamonds), the dispersion range is increased dramatically, but the jitter degrades. The method implemented in Equation 10.43 (black squares) extends the detector tolerance out to 50,000 ps/nm.


Figure 10.19. Dispersion tolerance of phase detectors using 128 frequency bins (line with circle), 1 frequency bin (line with diamond), and 128 bins and phase rotation (line with square) as described in Equation 10.43.


6. The Polarization-Mode Dispersion Problem

Modern coherent systems use a polarization-diverse receiver and extensive signal processing to compensate for a fiber distortion known as polarization-mode dispersion (PMD). The same architecture also makes it possible to increase fiber capacity by polarization-multiplexing two independently modulated data streams, one on each polarization of the fiber.

In the receiver, each pair of A/D converters sampling the real and imaginary components of the X-polarization contains data from both X- and Y-polarizations of the transmitted field.

Since no effort is made to align, or to maintain the alignment of, the polarization axes of the fiber and the receiver, the received signal on each polarization is always a linear combination of the transmitted fields. Equation 10.44 illustrates this.

The transmitted fields are represented in frequency domain as \(T_X[\omega]\) and \(T_Y [\omega]\), the received fields as “\(X[\omega]\)” and “\(Y[\omega]\),” and the fiber channel can be modeled as a 2 × 2 frequency-dependent complex matrix \(H\).

\[\tag{10.44}\begin{bmatrix}X[\omega]\\Y[\omega]\end{bmatrix}=H\begin{bmatrix}T_X[\omega]\\T_Y[\omega]\end{bmatrix}\]


The effect of PMD is quite complex. In general, PMD induces different polarization rotations on each frequency component of the input signal. The distortion it creates is dominantly a first-order effect known as differential group delay (DGD). As shown later, DGD can severely impact phase detector sensitivity.

The second-order PMD induces differential dispersion and depolarization. It can also impact the phase detector, although the first-order PMD effect is dominant.

DGD can be modeled as a concatenation of three matrices shown in Equation 10.45. The first and last matrices are polarization rotations in Jones matrix representation with parameters \(\theta_1\), \(\phi_1\), \(\theta_2\), and \(\phi_2\). The diagonal matrix in the middle induces a time delay (\(+\tau/2\)) on X-polarization and a matching time advancement (\(-\tau/2\)) on Y-polarization. The DGD in this model is equal to \(\tau\).


The fiber’s polarization and PMD states evolve over time due to thermal and mechanical disturbances, and therefore the amount of DGD (\(\tau\)) and the polarization rotations (\(\theta_1\), \(\phi_1\), \(\theta_2\), \(\phi_2\)) all vary as a function of time.

Consider the scenario where \(\theta_1\) and \(\phi_1\) are zero, \(\theta_2\) and \(\phi_2\) are also zero, and the DGD is equal to half a symbol period (\(T/2\)). The delays on the \(X\) and \(Y\) polarizations are then \(+T/4\) and \(-T/4\), respectively.
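
With these assumptions the channel matrix reduces to the diagonal delay term alone. As a sketch, using the \(e^{-j\omega\tau}\) delay convention that appears later in this section:

\[H=\begin{bmatrix}e^{-j\omega T/4}&0\\0&e^{+j\omega T/4}\end{bmatrix},\qquad X[\omega]=T_X[\omega]\,e^{-j\omega T/4},\qquad Y[\omega]=T_Y[\omega]\,e^{+j\omega T/4}\]

At the clock frequency \(\omega=2\pi/T\), a delay of \(\pm T/4\) corresponds to a phase shift of magnitude \(\pi/2\) in the recovered timing wave, so the two polarizations produce timing waves that are \(\pi\) apart. Which polarization is labeled \(+\pi/2\) depends on the sign convention of the detector.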


In the receiver, if the phase detector outputs on the X and Y polarizations are summed as illustrated in Figure 10.10, then a pair of timing waves of opposite sign is added together (\(+\pi/2\) from X and \(-\pi/2\) from Y). The phase detector sensitivity is NULLed in this case. Figure 10.20 illustrates the two timing waves of opposite phase on each polarization.

Since the polarization states vary slowly, this scenario may persist for milliseconds or tens of milliseconds. This is a very long time from the point of view of a clock recovery loop whose time constant is on the order of microseconds.


Figure 10.20. Phase detector outputs from the X-pol and Y-pol signals sum to a NULL condition.


When the polarization state of the fiber evolves to a state near this condition, the clock recovery loop will lose lock. The PMD tolerance of the receiver in this case is limited by the stability of the clock recovery loop.

To avoid a DGD of half a symbol period, the mean PMD needs to be three times lower than the instantaneous DGD that produces the null. At a 32 Gbaud symbol rate, the mean PMD can therefore be at most 5.2 ps.
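
The 5.2 ps figure follows directly from the symbol period and the factor-of-three margin stated above:

\[T=\frac{1}{32\ \text{GHz}}=31.25\ \text{ps},\qquad\frac{T}{2}\approx15.6\ \text{ps},\qquad\text{mean PMD}\le\frac{1}{3}\cdot\frac{T}{2}\approx5.2\ \text{ps}\]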

One may wonder whether this particular NULL condition can be avoided by using only the received X-polarization signal and ignoring the Y-polarization signal. But consider a different polarization state where \(\theta_1\), \(\phi_1\), and \(\phi_2\) are still zero, but \(\theta_2\) is 45\(^\circ\).

The received signal on the X polarization contains a linear combination of both transmitted signals \(T_X\) and \(T_Y\), as illustrated in Equation 10.47. Applying the phase detector to the composite signal is equivalent to detecting the clock phase on \(T_X⋅e^{-j\omega{T/4}}\) and \(T_Y⋅e^{+j\omega{T/4}}\) individually. The result is again the NULL condition shown in Figure 10.20.


This NULL condition always exists and is a consequence of multiplexing two independent signals on two polarizations of the fiber and the effect of the first-order PMD delaying one signal with respect to the other.

Note that this NULL condition does not depend on \(\theta_1\) and \(\phi_1\), because the original transmitted fields \(T_X\) and \(T_Y\) are assumed to be time aligned with respect to each other.
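
Both NULL cases can be reproduced with a small frequency-domain sketch. It assumes a spectral timing tone of the form \(\sum_k S(k)\,S^*(k+N/2)\), two samples per symbol, a full raised-cosine pulse, random QPSK data, and a plain sum for the 45\(^\circ\) mix; all function names are hypothetical, and the sign of each tone depends on the delay convention.

```python
import cmath
import math
import random

def symbol_spectrum(symbols):
    """Naive DFT of one polarization's symbol sequence (length M)."""
    m = len(symbols)
    return [sum(s * cmath.exp(-2j * math.pi * k * n / m) for n, s in enumerate(symbols))
            for k in range(m)]

def received_spectrum(sym_spec, delay):
    """Spectrum at 2 samples/symbol (N = 2M bins) after a delay given in symbol periods.

    Bins 0..M-1 sit at f = k/M (in units of f_b); bins M..2M-1 sit one f_b lower.
    """
    m = len(sym_spec)
    out = []
    for k in range(2 * m):
        f = k / m if k < m else k / m - 2.0
        pulse = 0.5 * (1 + math.cos(math.pi * f))      # raised cosine, alpha = 1
        out.append(sym_spec[k % m] * pulse * cmath.exp(-2j * math.pi * f * delay))
    return out

def timing_tone(spec):
    """Sum over k of S(k) * conj(S(k + N/2)); its angle tracks the clock phase."""
    m = len(spec) // 2
    return sum(spec[k] * spec[k + m].conjugate() for k in range(m))

rng = random.Random(1)
M = 256
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
tx_x = symbol_spectrum([rng.choice(qpsk) for _ in range(M)])
tx_y = symbol_spectrum([rng.choice(qpsk) for _ in range(M)])

# DGD of T/2 with all rotation angles zero: X delayed by T/4, Y advanced by T/4.
rx_x = received_spectrum(tx_x, +0.25)
rx_y = received_spectrum(tx_y, -0.25)
tone_x, tone_y = timing_tone(rx_x), timing_tone(rx_y)
tone_sum = tone_x + tone_y

# theta_2 = 45 degrees: the X-pol receiver sees an equal mix of both delayed fields.
mixed_x = [(a + b) / math.sqrt(2) for a, b in zip(rx_x, rx_y)]
tone_mixed = timing_tone(mixed_x)

print("X-pol tone angle / pi:", cmath.phase(tone_x) / math.pi)
print("Y-pol tone angle / pi:", cmath.phase(tone_y) / math.pi)
print("|X + Y| relative to |X|:", abs(tone_sum) / abs(tone_x))
print("|X-only at 45 deg| relative to |X|:", abs(tone_mixed) / abs(tone_x))
```

With finite random data the cancellation is not exact, but both residuals come out as a small fraction of the single-polarization tone.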

To solve this problem, the phase detector of Equation 10.17 is modified to include a special linear combination of the received signals from both polarizations. When adding them, the positive-frequency components of the Y-pol signal, \(Y(k)\), are rotated by a scalar phase \(\phi_U\), and the negative-frequency components of the Y-pol signal, \(Y(k + N/2)\), are rotated by a scalar phase \(\phi_L\).

The final phase detector equation is shown in the following. Similar to earlier assumptions and derivations, the received digital signals \(X(k)\) and \(Y(k)\) are samples in the frequency domain after an \(N\)-point FFT. “\(k\)” indexes the discrete frequency bins. The notations used are adopted from Equation 10.17.


It can be shown that for all values of rotation \(\theta_2\) and \(\phi_2\), there exists a combination of \(\phi_U\) and \(\phi_L\) that restores the detector sensitivity. Figure 10.21 shows the detector sensitivity \(K_d\) as a function of \(\phi_U\) and \(\phi_L\) at \(\theta_2=30^\circ\) and \(\phi_2=40^\circ\), and DGD of half symbol period. The troughs are the regions of degradation in \(K_d\) and should be avoided.


Figure 10.21. Phase detector sensitivity contour as a function of \(\phi_U\) and \(\phi_L\) at DGD = \(T/2\).

\(\phi_U\) and \(\phi_L\) need to be adapted in real time to track the polarization state of the fiber and continuously restore the detector sensitivity. To do so, the detector strength can simply be defined as the real component of the complex function in Equation 10.48. This definition of detector strength is the one used for Equation 10.40 and stems from the derivations in Equation 10.17.

\[\tag{10.49}K_d=\sum_{k=0}^{N/2-1}\text{Re}\left\{\left[X(k)+Y(k)\cdot{e}^{j\phi_U}\right]\cdot\left[X(k+N/2)+Y(k+N/2)\cdot {e^{j\phi_L}}\right]^*\right\}\]

Taking the derivative of the \(K_d\) function with respect to \(\phi_U\) and \(\phi_L\) independently yields the following update equations for \(\phi_U\) and \(\phi_L\). The shorthand \(X_U\) represents the X-pol signal in the positive frequencies, \(X(k)\), and \(X_L\) represents the X-pol signal in the negative frequencies, \(X(k + N∕2)\); \(Y_U\) and \(Y_L\) are defined similarly. A digital feedback integrator can be used to continuously update \(\phi_U\) and \(\phi_L\), using the gradient functions shown in the following equation.
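
A minimal sketch of such an update loop, derived here directly from Equation 10.49 rather than taken from the author's gradient equation: differentiating gives \(\partial K_d/\partial\phi_U=-\sum\text{Im}\{Y_Ue^{j\phi_U}[X_L+Y_Le^{j\phi_L}]^*\}\) and \(\partial K_d/\partial\phi_L=\sum\text{Im}\{[X_U+Y_Ue^{j\phi_U}][Y_Le^{j\phi_L}]^*\}\). The function names, the power-normalized step size, and the random stand-in spectra are all assumptions made for illustration.

```python
import cmath
import random

def combine(x, y, phi):
    """X + Y * exp(j*phi), bin by bin."""
    rot = cmath.exp(1j * phi)
    return [a + b * rot for a, b in zip(x, y)]

def detector_strength(xu, xl, yu, yl, phi_u, phi_l):
    """K_d of Equation 10.49 for one FFT block."""
    upper = combine(xu, yu, phi_u)
    lower = combine(xl, yl, phi_l)
    return sum((u * l.conjugate()).real for u, l in zip(upper, lower))

def strength_gradient(xu, xl, yu, yl, phi_u, phi_l):
    """Partial derivatives of K_d with respect to phi_u and phi_l."""
    upper = combine(xu, yu, phi_u)
    lower = combine(xl, yl, phi_l)
    rot_u = cmath.exp(1j * phi_u)
    rot_l = cmath.exp(1j * phi_l)
    grad_u = -sum((b * rot_u * l.conjugate()).imag for b, l in zip(yu, lower))
    grad_l = sum((u * (b * rot_l).conjugate()).imag for u, b in zip(upper, yl))
    return grad_u, grad_l

def adapt(xu, xl, yu, yl, phi_u=0.0, phi_l=0.0, steps=100, gain=0.5):
    """Integrate the gradient (gradient ascent on K_d); returns phases and K_d history."""
    power = sum(abs(v) ** 2 for block in (xu, xl, yu, yl) for v in block)
    mu = gain / power          # assumed normalization so the step stays stable
    history = [detector_strength(xu, xl, yu, yl, phi_u, phi_l)]
    for _ in range(steps):
        grad_u, grad_l = strength_gradient(xu, xl, yu, yl, phi_u, phi_l)
        phi_u += mu * grad_u
        phi_l += mu * grad_l
        history.append(detector_strength(xu, xl, yu, yl, phi_u, phi_l))
    return phi_u, phi_l, history

# Random stand-in spectra (hypothetical data, not a fiber model).
rng = random.Random(7)
xu, xl, yu, yl = ([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(32)]
                  for _ in range(4))
phi_u, phi_l, history = adapt(xu, xl, yu, yl)
print("K_d before and after adaptation:", history[0], history[-1])
```

A real receiver would compute the gradient on fresh FFT blocks at every update and run this loop alongside the clock recovery loop; the sketch reuses one block only to show that the integrator climbs the \(K_d\) surface.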


Figure 10.22. (a) Convergence curves of clock phase, \(\phi_U\) and \(\phi_L\). (b) Convergence of \(\phi_U\) and \(\phi_L\) plotted on top of the detector sensitivity surface.

Figure 10.22 shows a simulation of all three controls operating in closed loop: clock phase recovery and the \(\phi_U\) and \(\phi_L\) updates. The channel model is a \(T/2\) DGD followed by a rotation with \(\theta_2\) = 30\(^\circ\) and \(\phi_2\) = 40\(^\circ\). Figure 10.22(a) shows the convergence of all three parameters. Figure 10.22(b) shows the \({K_d}\) sensitivity contour as a 2D function of \(\phi_U\) and \(\phi_L\).

The contour surface is the same as that shown in Figure 10.21. Drawn on top of the contour are multiple convergence traces. Each convergence trace is a separate simulation in which the parameters \(\phi_U\) and \(\phi_L\) are initialized at the surface minimum. Each trace shows the convergence from a different initial clock phase (−0.5 to 0.5 \(\text{UI}\) in steps of 0.1 \(\text{UI}\)).

Each trace converges all three loop parameters simultaneously. In all cases, the sensitivity peak is found even though the search starts from the minimum sensitivity. The loops should therefore maintain lock as the polarization angle varies over time.


7. Timing Synchronization for Coherent Optical OFDM



Coherent optical OFDM has recently emerged as an exciting and promising approach in fiber-optic transmission. Its high spectral efficiency and robustness to channel distortion effects have sparked a significant amount of research in this area and produced a number of demonstrations of real-time decoding of coherently demodulated OFDM systems.

In OFDM transmission, the subcarrier symbol rate is made small by the use of a large number of subcarriers. The data in the subcarriers are multiplexed together at the subcarrier symbol rate by the use of an IDFT (inverse discrete Fourier transform) function at the transmitter, and they are demultiplexed using a DFT at the receiver. Owing to the low symbol rate, an OFDM signal is inherently very tolerant to distortion effects such as uncompensated residual dispersion and PMD.

The intersymbol interference (ISI) caused by dispersion and PMD is compensated by the use of a cyclic prefix, where an end portion of the OFDM word is copied identically to the beginning of the OFDM word, forming a slightly longer OFDM word.

Uncompensated channel ISI does not degrade system performance as long as the time spread of the ISI remains within the guard interval defined by the cyclic prefix. The effects of timing error on an OFDM signal are treated the same way as other channel distortions. Time delay (or jitter) as well as dispersion and PMD have their respective allocations in the guard interval.
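
The way the guard interval is shared between channel ISI and timing error can be sketched with a toy OFDM link. The word length, cyclic-prefix length, channel taps, and function names below are illustrative assumptions, and the naive DFT is used only to keep the example self-contained.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive DFT (or inverse DFT with 1/N scaling) of a short sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * math.pi * k * m / n) for m, v in enumerate(x))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def ofdm_word(subcarriers, cp_len):
    """IDFT of the subcarrier data, with the tail copied to the front as cyclic prefix."""
    body = dft(subcarriers, inverse=True)
    return body[-cp_len:] + body

def channel(samples, taps):
    """Linear (not circular) convolution, truncated to the input length."""
    return [sum(t * samples[n - d] for d, t in enumerate(taps) if n - d >= 0)
            for n in range(len(samples))]

def demodulate(rx_word, n_fft, cp_len, early):
    """DFT window placed `early` samples before its nominal start, inside the prefix."""
    start = cp_len - early
    return dft(rx_word[start:start + n_fft])

rng = random.Random(3)
N_FFT, CP = 32, 8
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
word0 = ofdm_word([rng.choice(qpsk) for _ in range(N_FFT)], CP)
data1 = [rng.choice(qpsk) for _ in range(N_FFT)]
word1 = ofdm_word(data1, CP)
taps = [1.0, 0.4, 0.2j, 0.1]                 # channel ISI spanning 3 samples
rx_word1 = channel(word0 + word1, taps)[len(word0):]

def max_error(early):
    """Worst subcarrier error after one-tap equalization for a given window offset."""
    h = dft(taps + [0.0] * (N_FFT - len(taps)))
    y = demodulate(rx_word1, N_FFT, CP, early)
    eq = [y[k] / (h[k] * cmath.exp(-2j * math.pi * k * early / N_FFT))
          for k in range(N_FFT)]
    return max(abs(a - b) for a, b in zip(eq, data1))

for early in (0, 5, 7):
    print("window offset", early, "samples -> max subcarrier error", max_error(early))
```

In this toy link the 3-sample channel spread plus the window offset must fit inside the 8-sample prefix. Offsets of 0 and 5 samples leave only a known per-subcarrier phase ramp, while an offset of 7 samples lets the previous word leak into the DFT window.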

An OFDM system with a guard interval designed to accommodate dispersion can also tolerate a significant amount of channel delay. As an example, a 128 Gbit/s PM-QPSK system using 256 subcarriers and a guard interval of 250 ps will tolerate 1000 ps/nm of dispersion or, alternatively, ±125 ps of timing error with negligible system penalty.
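
These numbers can be checked with a rough calculation. The sketch assumes a 1550 nm carrier, an occupied bandwidth of about 32 GHz (128 Gbit/s at 4 bits per PM-QPSK symbol), and an OFDM word duration equal to the number of subcarriers divided by that bandwidth; the helper names are hypothetical.

```python
C = 299_792_458.0   # speed of light, m/s

def dispersion_spread_ps(disp_ps_per_nm, bandwidth_hz, wavelength_m=1550e-9):
    """Delay spread (ps) across the signal: D*L times the optical bandwidth in nm."""
    bandwidth_nm = wavelength_m ** 2 * bandwidth_hz / C * 1e9
    return disp_ps_per_nm * bandwidth_nm

def cyclic_prefix_overhead(guard_s, n_subcarriers, bandwidth_hz):
    """Guard interval as a fraction of the useful OFDM word duration N / B."""
    return guard_s / (n_subcarriers / bandwidth_hz)

BANDWIDTH = 128e9 / 4   # assumed: 4 bits per PM-QPSK symbol -> about 32 GHz
spread = dispersion_spread_ps(1000, BANDWIDTH)
overhead = cyclic_prefix_overhead(250e-12, 256, BANDWIDTH)
print(f"spread at 1000 ps/nm: {spread:.1f} ps")
print(f"guard-interval overhead: {overhead:.2%}")
```

Under these assumptions, 1000 ps/nm spreads the signal by roughly 256 ps, close to the 250 ps guard interval, and the guard interval costs about 3% of the 8 ns word. Every picosecond of the guard interval reserved for timing error is a picosecond that is no longer available for dispersion.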

In this sense, an OFDM system can be much more tolerant to timing error than a single-carrier system. However, because the use of a guard interval reduces spectral efficiency and capacity, the guard interval needs to be minimized as much as possible by minimizing the timing error.

Note that for large CD, a coherent modem using OFDM still requires a CD compensation block using an FFT and IFFT, which is separate from the modulation and demodulation functions of OFDM.

Owing to the use of cyclic prefix and DFT windowing, the problem of timing synchronization for OFDM signal is essentially determining the location of the DFT window such that a DFT can be performed in the receiver matching the opposite function in the transmitter.

The error in determining the location of the DFT window is a timing error that the guard interval has to absorb. OFDM signals are also sensitive to residual carrier frequency offset in the received signal before the DFT function. DFT window synchronization and carrier frequency estimation are active research topics in practical implementations of OFDM transmission.

In one particular example, a frame is defined that consists of multiple OFDM symbols and a header containing two copies of an identical pattern. In the receiver, the repeated pattern can be used to determine the alignment of the DFT window as well as the carrier frequency offset.
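
The text does not specify how the repeated pattern is processed. One common approach, sketched here as an assumption, is a delay-and-correlate metric in the style of Schmidl and Cox: correlate the signal with a copy of itself delayed by one pattern length, locate the peak to find the header, and read the carrier frequency offset from the angle of the correlation. The frame layout, noise level, and function names are hypothetical.

```python
import cmath
import math
import random

def make_frame(rng, pattern_len, lead_in, payload_len):
    """Idle lead-in, two identical unit-modulus patterns, then random payload samples."""
    pattern = [cmath.exp(2j * math.pi * rng.random()) for _ in range(pattern_len)]
    payload = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
               for _ in range(payload_len)]
    return [0j] * lead_in + pattern + pattern + payload

def impair(frame, cfo, noise_std, rng):
    """Apply a carrier frequency offset (cycles per sample) and additive noise."""
    return [s * cmath.exp(2j * math.pi * cfo * n)
            + noise_std * complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
            for n, s in enumerate(frame)]

def detect_header(rx, pattern_len):
    """Return (start index, CFO estimate) from the delay-and-correlate metric."""
    best_metric, best_start, best_cfo = -1.0, 0, 0.0
    for d in range(len(rx) - 2 * pattern_len + 1):
        first = rx[d:d + pattern_len]
        second = rx[d + pattern_len:d + 2 * pattern_len]
        corr = sum(a.conjugate() * b for a, b in zip(first, second))
        energy = 0.5 * sum(abs(v) ** 2 for v in first + second)
        if energy == 0.0:
            continue
        metric = abs(corr) ** 2 / energy ** 2     # at most 1 by Cauchy-Schwarz
        if metric > best_metric:
            best_metric, best_start = metric, d
            best_cfo = cmath.phase(corr) / (2 * math.pi * pattern_len)
    return best_start, best_cfo

rng = random.Random(11)
PATTERN_LEN, LEAD_IN, TRUE_CFO = 64, 100, 0.002
rx = impair(make_frame(rng, PATTERN_LEN, LEAD_IN, 300), TRUE_CFO, 0.05, rng)
start_hat, cfo_hat = detect_header(rx, PATTERN_LEN)
print("estimated header start:", start_hat, "(true:", LEAD_IN, ")")
print("estimated CFO:", cfo_hat, "(true:", TRUE_CFO, ")")
```

The metric peak is fairly broad, so the start estimate can be off by a few samples; that residual error is exactly what the guard interval has to absorb. The frequency estimate is unambiguous only for offsets smaller than half a cycle over one pattern length.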



8. Future Research

The problem of timing recovery is intimately related to the adopted method of modulation and detection in a transmission system. With the ever-increasing pressure to deliver higher capacity in the fiber and higher single-channel data rates, new techniques continue to surface.

Nyquist transmission poses its own challenges in detecting the timing phase, and the solutions in the literature are still slightly incomplete in dealing with PMD effects. The recent introduction of methods in which multiple signals are frequency-multiplexed at a fraction of the Nyquist rate poses further challenges for timing recovery.

Modulation techniques such as optical coherent OFDM, with or without a guard interval, each present their own challenges.

Driven by forces of economy and market demands on data rate, fiber-optic transmission systems are expected to evolve continuously.

Research activity in timing recovery needs to happen in concert with developments in modulation techniques and overall system evolution. The problem of timing recovery has been a fertile ground for research and is expected to be so in the future.
