# Polarization and Nonlinear Impairments in Fiber Communication Systems

This is a continuation from the previous tutorial - **nonlinear optical pulse propagation**.

## 1. Introduction

Polarization is a property of waves that can oscillate with more than one orientation. Electromagnetic waves such as light exhibit polarization. Although polarization can be used as another dimension to carry information, in practice, this feature is not utilized in direct-detection optical communication systems due to the difficulties of polarization demultiplexing in the optical domain.

Only recently, with the advent of digital coherent detection, has polarization-division multiplexing (PDM), which transmits signals on two orthogonal states of polarization (SOPs) at the same wavelength, come into widespread use in optical communication systems.

As the full optical field information is preserved and accessible after coherent detection, optical phase and polarization can be used to encode data, which significantly increases the spectral efficiency and capacity of an optical communication system.

With advances in high-speed electronics, in today’s coherent optical communication systems, signals are first mixed with local oscillators (LOs) in the photodetectors of the coherent receiver and converted to digital form by analog-to-digital converters (ADCs); digital signal processing (DSP) is then used to recover the signals.

For example, carrier phase recovery and polarization alignment and separation, the main obstacles for analog implementations of coherent receivers in earlier years, can be realized in the electrical domain using sophisticated DSP.

Digital coherent detection has revolutionized the design of optical communication systems. First, linear impairments such as chromatic dispersion (CD) and polarization-mode dispersion (PMD), which had to be compensated with optical methods in direct-detection systems, can in principle be completely compensated in the electrical domain using DSP in digital coherent-detection systems.

This does not mean that they cause no performance degradation. For example, the interaction between electronic equalizers and LO phase noise induces amplitude noise and additional phase noise, which cause penalties in coherent systems without optical dispersion compensators.

It has been shown that PMD can still degrade the performance of an optical communication system using coherent detection due to limited complexity of DSP in a real system.

Polarization-dependent loss (PDL) generates power and optical signal-to-noise-ratio (OSNR) fluctuations and repolarizes amplified spontaneous emission (ASE) noise in optical communication systems, and these effects cannot be compensated in coherent receivers either.

Second, digital coherent detection significantly changes the ways to manage nonlinear impairments in optical communication systems. Many nonlinearity mitigation techniques developed for direct-detection systems such as dispersion management are not effective anymore and become suboptimal in coherent-detection systems. PMD can be helpful to reduce fiber nonlinearities, and fiber nonlinear effects can be mitigated with DSP in both transmitters and receivers.

In direct-detection on-off-keyed (OOK) systems, fiber nonlinearities manifest themselves as timing and amplitude jitter induced by intra- and interchannel nonlinearities.

In coherent-detection systems, information is also encoded in phase and polarization. Depending on system parameters such as bit rate and modulation format, phase and polarization distortions caused by fiber nonlinearities can therefore be the dominant nonlinear effects. This makes the impact of fiber nonlinearities on coherent-detection systems significantly different from that on direct-detection systems and largely changes the ways to manage nonlinearities in optical communication systems.

In this tutorial, polarization and nonlinear impairments in single-carrier coherent optical communication systems are discussed. To make the tutorial self-contained, the basics of polarization of light are given first in Section 2. The phenomena of PMD and PDL in optical communication systems are presented in Section 3.

Section 4 shows the modeling of nonlinear transmission in optical fibers. Digital coherent optical communication systems and some electrical equalization techniques are discussed in Section 5.

In Section 6, the PMD and PDL impairments in digital coherent optical communication systems are presented. Section 7 focuses on nonlinear impairments in coherent-detection optical communication systems, and nonlinearities in three different dispersion-managed systems are discussed. The tutorial is summarized in Section 8.

## 2. Polarization of Light

Light is an electromagnetic wave composed of an electrical field vector \(\vec{E}\) and a magnetic field vector \(\vec{H}\). In free space or a homogeneous, isotropic, nonattenuating medium, light is properly described as a transverse wave, meaning that the electric field vector \(\vec{E}\) and the magnetic field vector \(\vec{H}\) are in directions perpendicular to (or “transverse” to) the direction of wave propagation, and \(\vec{E}\) and \(\vec{H}\) are also perpendicular to each other.

In optics, the electrical field vector is usually chosen to represent the electromagnetic field and monochromatic light is mathematically described as

\[\tag{6.1}\vec{E}(z,t)=[\hat{x}E_x+\hat{y}E_ye^{j\varphi}]e^{-j(\omega{t}-\beta{z}+\theta)}\]

where \(\vec{E}(z,t)\) is the electrical field vector, \(\hat{x}\) and \(\hat{y}\) are unit vectors of the \(x\) and \(y\) axes, \(\omega\) is the angular frequency of the light, \(\beta\) is the wave number, \(\theta\) is the initial phase of the light, \(j=\sqrt{-1}\), and \(z\) and \(t\) are the distance along the propagation direction of the light and the time, respectively.

The amplitudes of the electrical field in \(x\) and \(y\) axes and the phase difference, \(\varphi\), between the electrical field in \(x\) and \(y\) axes determine the polarization of light, which describes the temporal evolution of the electrical field vector at a certain distance. Figure 6.1 shows the temporal evolution of the electrical field vector for light with linear, circular, and elliptical polarization.

Polarization of light can be described by many representations, and two widely used representations are Jones vector representation and Stokes vector representation. As shown in Equation 6.1, the amplitudes and phases of the \(x\) and \(y\) components of an electrical field vector provide the full information on completely polarized light.

The amplitude and phase information can be conveniently represented as a two-dimensional complex vector, which is called Jones vector

\[\tag{6.2}|s\rangle=\frac{1}{|E|}\begin{pmatrix}E_xe^{j\varphi_x}\\E_ye^{j\varphi_y}\end{pmatrix}=\begin{pmatrix}s_x\\s_y\end{pmatrix}\]

where \(|E|=\sqrt{E_x^2+E_y^2}\), \(\varphi_x\), and \(\varphi_y\) are the phases of \(x\) and \(y\) components. The Jones vector has unit magnitude and is usually written as

\[\tag{6.3}|s\rangle=\begin{pmatrix}\cos\theta\\\sin\theta{e}^{j\varphi}\end{pmatrix}\]

Linear polarization corresponds to \(\varphi=0\) or \(\pm\pi\), circular polarization to \(\varphi=\pm\pi/2\) with \(\theta=\pi/4\), and all other values represent elliptical polarization.

Another representation to describe polarization of light is a four-dimensional real vector called Stokes vector. Unlike Jones vector, which can only be used to describe completely polarized light, where all the frequency components of the light have the same polarization, Stokes vector can be used to describe partially polarized light.

A Stokes vector is defined as

\[\tag{6.4}\vec{S}=\begin{pmatrix}S_0\\S_1\\S_2\\S_3\end{pmatrix}=\begin{pmatrix}I_x+I_y\\I_x-I_y\\I_{\pi/4}-I_{-\pi/4}\\I_{\text{RHC}}-I_{\text{LHC}}\end{pmatrix}\]

where \(I_x\), \(I_y\), \(I_{\pi/4}\), \(I_{-\pi/4}\), \(I_\text{RHC}\), and \(I_\text{LHC}\) are the intensities of light in the states of horizontal, vertical, 45\(^\circ\) linear, -45\(^\circ\) linear, right-hand circular, and left-hand circular polarization, respectively.

The four components of a Stokes vector satisfy \(S_1^2+S_2^2+S_3^2\le{S}_0^2\), with equality for completely polarized light. How strongly a signal is polarized is quantified by the degree of polarization (DOP), which is the ratio of the power of the polarized component to the total power of the signal, defined as

\[\tag{6.5}\text{DOP}=\frac{\sqrt{S_1^2+S_2^2+S_3^2}}{S_0}\]

\(\text{DOP}=1\) means light is completely polarized. Monochromatic light is totally polarized, but for a signal with a certain bandwidth, it can be completely polarized, completely unpolarized, or partially polarized.
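As a small numerical illustration of Equation 6.5, the sketch below evaluates the DOP of a four-component Stokes vector. The function name `degree_of_polarization` and the sample vectors are illustrative assumptions, not part of the original text.

```python
import numpy as np

def degree_of_polarization(stokes):
    """Return DOP = sqrt(S1^2 + S2^2 + S3^2) / S0 (Equation 6.5)."""
    s0, s1, s2, s3 = np.asarray(stokes, dtype=float)
    return np.sqrt(s1**2 + s2**2 + s3**2) / s0

# Hypothetical examples: fully polarized horizontal light, and the same
# light with an equal amount of unpolarized power added incoherently.
polarized = np.array([1.0, 1.0, 0.0, 0.0])
unpolarized = np.array([1.0, 0.0, 0.0, 0.0])
mixture = polarized + unpolarized   # Stokes vectors add for incoherent sums

dop_polarized = degree_of_polarization(polarized)
dop_mixture = degree_of_polarization(mixture)
```

In this example half of the total power is polarized, so the mixture has a DOP of 0.5.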

In most cases, instead of using a four-dimensional Stokes vector, a three-dimensional normalized standard Stokes vector is used, which is defined as

\[\tag{6.6}\hat{s}=\frac{1}{S_0}\begin{pmatrix}S_1\\S_2\\S_3\end{pmatrix}=\begin{pmatrix}s_1\\s_2\\s_3\end{pmatrix}\]

For completely polarized light, a three-dimensional Stokes vector can be obtained from the corresponding Jones vector

\[\tag{6.7}s_1=s_xs_x^*-s_ys_y^*,\;s_2=s_xs_y^*+s_x^*s_y,\;s_3=j(s_xs_y^*-s_x^*s_y)\]
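Equation 6.7 can be checked numerically with a few lines of code. The following is a minimal sketch; the helper names `jones_to_stokes` and `jones_from_angles` are assumptions introduced here for illustration.

```python
import numpy as np

def jones_to_stokes(jones):
    """Map a Jones vector (s_x, s_y) to (s1, s2, s3) using Equation 6.7."""
    sx, sy = np.asarray(jones, dtype=complex)
    norm = abs(sx)**2 + abs(sy)**2            # equals 1 for a unit Jones vector
    s1 = (sx*np.conj(sx) - sy*np.conj(sy)).real
    s2 = (sx*np.conj(sy) + np.conj(sx)*sy).real
    s3 = (1j*(sx*np.conj(sy) - np.conj(sx)*sy)).real
    return np.array([s1, s2, s3]) / norm

def jones_from_angles(theta, phi):
    """Unit Jones vector in the form of Equation 6.3."""
    return np.array([np.cos(theta), np.sin(theta)*np.exp(1j*phi)])

s_horizontal = jones_to_stokes(jones_from_angles(0.0, 0.0))
s_linear_45 = jones_to_stokes(jones_from_angles(np.pi/4, 0.0))
s_circular = jones_to_stokes(jones_from_angles(np.pi/4, np.pi/2))
```

The three test states map to the \(s_1\), \(s_2\), and \(s_3\) axes, respectively, which is a quick sanity check of the signs in Equation 6.7.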

A three-dimensional Stokes vector can be visualized with the Poincaré sphere in Stokes space, as shown in Figure 6.2. The Poincaré sphere has a radius of 1. On the surface of the Poincaré sphere, DOP = 1, and inside the sphere, DOP < 1. The north and south poles on the Poincaré sphere represent right- and left-hand circular polarization, respectively, and linear polarization resides on the equator of the sphere.

Polarization of light can be measured with a polarimeter. As shown in Figure 6.3, it measures the total power and the powers of three perpendicular polarization states in Stokes space, \(I_x\), \(I_{\pi/4}\), and \(I_{\text{RHC}}\), from which the Stokes vector and DOP of light can be obtained.
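The conversion from the four measured powers to a Stokes vector can be sketched as follows. It relies on the fact that any pair of orthogonal analyzer states shares the total power, so that, for example, \(S_1=I_x-I_y=2I_x-S_0\). The function name and the power readings are hypothetical values chosen for illustration.

```python
import numpy as np

def stokes_from_polarimeter(i_total, i_x, i_45, i_rhc):
    """Recover (S0, S1, S2, S3) from four power readings.

    Uses I_x + I_y = I_45 + I_-45 = I_RHC + I_LHC = S0, so that for
    example S1 = I_x - I_y = 2*I_x - S0 (see Equation 6.4).
    """
    s0 = i_total
    return np.array([s0, 2*i_x - s0, 2*i_45 - s0, 2*i_rhc - s0])

# Hypothetical readings (arbitrary units) for illustration only.
stokes = stokes_from_polarimeter(i_total=1.0, i_x=0.8, i_45=0.5, i_rhc=0.3)
dop = np.sqrt(np.sum(stokes[1:]**2)) / stokes[0]
```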

## 3. PMD and PDL in Optical Communication Systems

### 3.1. PMD

In an ideal optical fiber, the fiber core has a perfectly circular cross-section. In this case, the fundamental mode has two orthogonal polarization states that travel at the same speed. The signal that is transmitted over the fiber is randomly polarized, but that would not matter in an ideal fiber because the two polarization states would propagate identically.

In a realistic fiber, however, there are random imperfections induced during the manufacturing process and by external environment such as stress, which break the circular symmetry. This asymmetry causes local birefringence, which manifests as a difference in the refractive indexes and thus the propagation constants for the two orthogonal polarization states

\[\tag{6.8}\Delta\beta=\beta_s-\beta_f=\frac{\omega}{c}(n_s-n_f)=\frac{\omega\Delta{n}}{c}\]

where \(\beta_s\) and \(\beta_f\) are the propagation constants in slow and fast axes and \(\Delta{n}=n_s-n_f\) is the refractive index difference between the two axes, which causes light on the two polarization states to propagate with different speeds. For standard telecommunication-type fibers, \(\Delta{n}\sim10^{-7}\), and for polarization-maintaining fibers (PMF), \(\Delta{n}\) is much larger and is \(\sim10^{-3}\).

In long transmission fibers, due to localized stress during spooling, cabling, and deployment, there are random variations in the axes of the birefringence along the fiber length, causing polarization-mode coupling where the fast and slow polarization modes from one segment split into both the fast and slow modes in the next segment.

PMD in fibers is the combined effect of local birefringence and random polarization-mode coupling along fibers, which causes random spreading of optical pulses, similar to the effects of other kinds of dispersion.

Due to random mode coupling and pulse splitting, the propagation of a pulse along a long fiber is very complicated but, surprisingly, it can be described by the principal states model.

The principal states model, originally developed by Poole and Wagner, states that even for a long fiber, there exist two special orthogonal polarization states at the fiber input that result in an output pulse that is undistorted to the first order. These two special orthogonal polarization states are called the principal states of polarization (PSPs).

PMD can be characterized in Stokes space by the PMD vector \(\vec{\Omega}\)

\[\tag{6.9}\vec{\Omega}=\Delta\tau\widehat{p}\]

where the magnitude, \(\Delta\tau\), is differential group delay (DGD), the group-delay difference between the slow and fast principal state modes, and the unit vector, \(\widehat{p}\), points in the direction of the slower PSP, whereas the vector \(-\widehat{p}\), is in the direction of the orthogonal faster PSP.

In the time domain, PMD manifests as pulse splitting and broadening, which induces intersymbol interference (ISI) in optical communication systems. In the frequency domain, PMD causes the output SOP of a signal to change with frequency, as described by

\[\tag{6.10}\frac{d\hat{t}}{d\omega}=\vec{\Omega}\times\hat{t}\]

where vector \(\hat{t}\) is the output SOP.

Equation 6.10 indicates when a signal with all its frequency components having the same SOP propagates in a fiber with PMD, at the output, different frequency components of the signal will have different SOPs, which means that the signal is depolarized by PMD.
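To make the depolarization mechanism concrete, the sketch below solves Equation 6.10 under the simplifying assumption (made here only for illustration) that the PMD vector does not depend on frequency. In that case the output SOP simply rotates about \(\widehat{p}\) by the angle \(\Delta\tau\,\Delta\omega\). The function name `output_sop` and all numerical values are assumptions.

```python
import numpy as np

def output_sop(t0, pmd_vector, delta_omega):
    """Rotate Stokes vector t0 about the PMD vector by angle DGD*delta_omega.

    Closed-form solution of Equation 6.10 for a frequency-independent
    PMD vector (Rodrigues rotation formula).
    """
    t0 = np.asarray(t0, dtype=float)
    omega_vec = np.asarray(pmd_vector, dtype=float)
    dgd = np.linalg.norm(omega_vec)
    p = omega_vec / dgd
    angle = dgd * delta_omega
    return (t0*np.cos(angle) + np.cross(p, t0)*np.sin(angle)
            + p*np.dot(p, t0)*(1.0 - np.cos(angle)))

# Assumed values: 10 ps DGD with the slow PSP along s3, input SOP along s1.
pmd = np.array([0.0, 0.0, 10e-12])            # seconds
t_in = np.array([1.0, 0.0, 0.0])
offsets_ghz = np.array([-10.0, 0.0, 10.0])    # frequency offsets from carrier
sops = np.array([output_sop(t_in, pmd, 2*np.pi*f*1e9) for f in offsets_ghz])
mean_sop_length = np.linalg.norm(sops.mean(axis=0))
```

Each frequency component remains fully polarized, but the equally weighted mean of the three SOPs has a length below 1, which is a crude indication of the reduced DOP of the whole signal.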

The PMD vector is not constant but varies randomly with optical angular frequency. A Taylor-series expansion of \(\vec{\Omega}(\omega)\) is typically used for a signal with a large bandwidth

\[\tag{6.11}\vec{\Omega}(\omega_0+\Delta\omega)=\vec{\Omega}(\omega_0)+\vec{\Omega}_{\omega}(\omega_0)\Delta\omega+\frac{1}{2}\vec{\Omega}_{\omega\omega}(\omega_0)\Delta\omega^2+\ldots\]

where \(\vec{\Omega}(\omega_0)\), \(\vec{\Omega}_{\omega}(\omega_0)\), and \(\vec{\Omega}_{\omega\omega}(\omega_0)\) are first-, second-, and third-order PMD, and the subscript \(\omega\) indicates differentiation with respect to angular frequency.

Second-order PMD is described by the derivative of the PMD vector

\[\tag{6.12}\vec{\Omega}_\omega=\frac{d\vec{\Omega}}{d\omega}=\Delta\tau_\omega\hat{p}+\Delta\tau\hat{p}_\omega\]

Equation 6.12 shows that second-order PMD has two components. The first term on the right-hand side of Equation 6.12 is \(\vec{\Omega}_{\omega\parallel}\), the component that is parallel to \(\vec{\Omega}(\omega_0)\), and is called polarization-dependent chromatic dispersion (PCD). The second term \(\vec{\Omega}_{\omega\perp}\) is the component that is perpendicular to \(\vec{\Omega}(\omega_0)\), which describes the change of PSP with frequency and is called depolarization. These two components cause different impairments in an optical communication system. Figure 6.4 shows a vector diagram of first- and second-order PMD.

The characteristics of first- and second-order PMD, including their statistical properties, have been extensively studied and are well understood. For a sufficiently long fiber, the DGD is Maxwellian distributed, with the probability density function (PDF) given by Kogelnik et al.

\[\tag{6.13}p_{\Delta\tau}(x)=\frac{8}{\pi^2\langle\Delta\tau\rangle}\left(\frac{2x}{\langle\Delta\tau\rangle}\right)^2e^{-(2x/\langle\Delta\tau\rangle)^2/\pi};\qquad{x}\ge0\]

with mean and mean-square values of \(E(x)=\langle\Delta\tau\rangle\) and \(E(x^2)=3\pi\langle\Delta\tau\rangle^2/8\), respectively, so that the root mean square (RMS) value is \(\sqrt{3\pi/8}\,\langle\Delta\tau\rangle\).
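The moments of Equation 6.13 can be verified by direct numerical integration, as in the sketch below. The mean DGD of 5 ps and the factor-of-three outage threshold are assumed values chosen for illustration.

```python
import numpy as np

def dgd_pdf(x, mean_dgd):
    """Maxwellian PDF of the DGD (Equation 6.13)."""
    u = 2.0 * x / mean_dgd
    return 8.0 / (np.pi**2 * mean_dgd) * u**2 * np.exp(-u**2 / np.pi)

mean_dgd = 5.0                                # ps, assumed for illustration
x = np.linspace(0.0, 20*mean_dgd, 200001)
dx = x[1] - x[0]
pdf = dgd_pdf(x, mean_dgd)

# Simple Riemann sums; the integrand is negligible at both ends of the grid.
area = np.sum(pdf) * dx
mean = np.sum(x * pdf) * dx
mean_square = np.sum(x**2 * pdf) * dx
# Probability that the instantaneous DGD exceeds three times its mean.
outage = np.sum(pdf[x > 3*mean_dgd]) * dx
```

The long tail of the Maxwellian distribution is what makes PMD a statistical design problem: in this example the DGD exceeds three times its mean with a probability of a few times \(10^{-5}\).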

PMD in fiber can be modeled as a concatenation of randomly rotated waveplates, each having a certain birefringence and certain length, as shown in Figure 6.5. Simulations are typically performed in Jones space and the overall transmission matrix of Figure 6.5 is given by

\[\tag{6.14}T=T_mT_{m-1}\ldots{T_2}T_1\]

where the transmission matrix of \(i\)th waveplate is

\[\tag{6.15}T_i=R(-\theta_i,-\varphi_i)\begin{pmatrix}\exp(-j\omega\Delta\tau_i/2)&0\\0&\exp(j\omega\Delta\tau_i/2)\end{pmatrix}R(\theta_i,\varphi_i)\]

where \(\Delta\tau_i\) is the DGD of the \(i\)th waveplate, and

\[\tag{6.16}R(\theta,\varphi)=\begin{pmatrix}\cos\theta&\sin\theta\exp(j\varphi)\\-\sin\theta\exp(-j\varphi)&\cos\theta\end{pmatrix}\]

There are some variations when one simulates PMD. For example, one can choose the angle between waveplates to be totally random or have some correlation, and the DGD of each waveplate can be constant or random.
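The waveplate model of Equations 6.14 to 6.16 can be sketched as follows. Constant DGD per waveplate and fully random angles are one of the variations mentioned above. The DGD estimate from the eigenvalues of \(T(\omega+\delta\omega)T^{-1}(\omega)\) is a finite-difference technique added here as an assumption; it is not described in the text, and all numerical values are illustrative.

```python
import numpy as np

def rotation(theta, phi):
    """Matrix R(theta, phi) of Equation 6.16."""
    return np.array([[np.cos(theta), np.sin(theta)*np.exp(1j*phi)],
                     [-np.sin(theta)*np.exp(-1j*phi), np.cos(theta)]])

def waveplate(omega, dgd, theta, phi):
    """Transmission matrix T_i of Equation 6.15."""
    delay = np.diag([np.exp(-1j*omega*dgd/2), np.exp(1j*omega*dgd/2)])
    return rotation(-theta, -phi) @ delay @ rotation(theta, phi)

def fiber_matrix(omega, dgds, thetas, phis):
    """Overall matrix T = T_m ... T_2 T_1 of Equation 6.14."""
    total = np.eye(2, dtype=complex)
    for dgd, theta, phi in zip(dgds, thetas, phis):
        total = waveplate(omega, dgd, theta, phi) @ total
    return total

def dgd_estimate(omega, d_omega, dgds, thetas, phis):
    """Finite-difference DGD from the eigenvalues of T(w + dw) T(w)^-1."""
    t0 = fiber_matrix(omega, dgds, thetas, phis)
    t1 = fiber_matrix(omega + d_omega, dgds, thetas, phis)
    eig = np.linalg.eigvals(t1 @ np.linalg.inv(t0))
    return abs(np.angle(eig[0] / eig[1])) / d_omega

rng = np.random.default_rng(0)
m = 100                                       # number of waveplates (assumed)
dgds = np.full(m, 0.1e-12)                    # constant 0.1 ps per waveplate
thetas = rng.uniform(0, 2*np.pi, m)           # fully random orientations
phis = rng.uniform(0, 2*np.pi, m)
omega = 2*np.pi*193.4e12                      # rad/s, roughly 1550 nm
d_omega = 2*np.pi*1e9                         # 1 GHz step for the estimate

T = fiber_matrix(omega, dgds, thetas, phis)
total_dgd = dgd_estimate(omega, d_omega, dgds, thetas, phis)
single_dgd = dgd_estimate(omega, d_omega, dgds[:1], thetas[:1], phis[:1])
```

A single waveplate returns its own DGD, while the concatenation gives a value well below the sum of the individual delays because of the random mode coupling.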

### 3.2. PDL

PDL usually occurs in optical components, such as isolators and couplers, whose insertion loss varies with the SOPs of input signals. Besides common loss, the polarization-dependent component of the power gain of an optical component with PDL can be described as \(1+\vec{\alpha}\cdot\hat{s}\), where \(\hat{s}\) is a unit Stokes vector corresponding to the SOP of the incident optical signal and \(\vec{\alpha}\) is the vector of PDL.

The highest and lowest gains are \(1\pm\alpha\) with \(\alpha=|\vec{\alpha}|\); they are reached when the SOP of the input signal is parallel to \(\vec{\alpha}\) in Stokes space (highest gain) or antiparallel to it (lowest gain). In the most common definition, PDL is the ratio of the highest to the lowest gain of a PDL component, expressed in decibels

\[\tag{6.17}\Gamma=10\log_{10}\frac{1+\alpha}{1-\alpha}\]
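A short numerical example of Equation 6.17 follows. The PDL vector magnitude of 0.1 and the function names are assumed for illustration.

```python
import numpy as np

def pdl_db(alpha):
    """PDL in decibels from the PDL vector magnitude (Equation 6.17)."""
    return 10.0 * np.log10((1.0 + alpha) / (1.0 - alpha))

def relative_gain(alpha_vec, sop):
    """Polarization-dependent gain factor 1 + alpha . s for a unit Stokes vector."""
    return 1.0 + np.dot(alpha_vec, sop)

alpha_vec = np.array([0.1, 0.0, 0.0])         # assumed PDL vector
gain_max = relative_gain(alpha_vec, np.array([1.0, 0.0, 0.0]))    # parallel SOP
gain_min = relative_gain(alpha_vec, np.array([-1.0, 0.0, 0.0]))   # antiparallel SOP
gamma_db = pdl_db(np.linalg.norm(alpha_vec))
```

With \(\alpha=0.1\), the gains are 1.1 and 0.9, which corresponds to a PDL of about 0.87 dB.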

In an optical communication system with many PDL components distributed in the system, the statistical distribution of the overall PDL expressed in decibels is Maxwellian.

Unlike PMD, PDL does not cause signal distortions, but it generates power variations along the optical link and changes of OSNR at the receiver, and repolarizes ASE noise and PDM signals as well.

These effects can significantly degrade the performance of an optical communication system. PDL can also be modeled as the concatenation of many PDL elements with randomly oriented axes. It is found that to accurately simulate the noise variations induced by PDL, ASE noise has to be distributed along the link.

## 4. Modeling of Nonlinear Effects in Optical Fibers

When polarization effects can be neglected and the signal is launched in a single polarization state, the scalar nonlinear Schrödinger equation (NLSE) is a fairly good model to study transmission impairments in optical fibers including nonlinear effects.

However, to consider polarization effects such as PMD and cross-polarization modulation (XPolM) and to study the nonlinear transmission of PDM signals in optical fibers, the coupled nonlinear Schrödinger equation (CNLSE) has to be used

\[\tag{6.18}\frac{\partial\vec{E}}{\partial{z}}-j\Delta\beta\Sigma\vec{E}+\Delta\tau\Sigma\frac{\partial\vec{E}}{\partial{t}}+\frac{j}{2}\beta_2\frac{\partial^2\vec{E}}{\partial{t^2}}=j\gamma\left[|\vec{E}|^2\vec{E}-\frac{1}{3}(\vec{E}^+\sigma_3\vec{E})\sigma_3\vec{E}\right]\]

where \(\vec{E}=[E_x,E_y]^T\) is the electrical field column vector, \(\Delta\beta\) is the birefringence parameter, \(\Delta\tau\) is the DGD parameter related to PMD coefficient, \(\Sigma\) is the local Jones matrix describing polarization changes, \(\beta_2\) is the group velocity dispersion (GVD), \(\gamma\) is the fiber nonlinear coefficient, \(\vec{E}^+=[E_x^*,E_y^*]\) is the transpose conjugate of \(\vec{E}\), and \(\sigma_3\) is one of the Pauli spin matrices

\[\tag{6.19}\sigma_3=\begin{pmatrix}0&-j\\j&0\end{pmatrix}\]

In Equation 6.18, \(z\) is the distance along the fiber axis and \(t\) is the retarded time in a frame moving at the group velocity at the carrier frequency of the signal.

By averaging the nonlinear effects over the Poincaré sphere under the assumption of complete mixing (averaging over the random polarization changes that uniformly cover the Poincaré sphere) and neglecting PMD, the CNLSE can be transformed to the Manakov equation

\[\tag{6.20}\frac{\partial\vec{E}}{\partial{z}}+\frac{j}{2}\beta_2\frac{\partial^2\vec{E}}{\partial{t^2}}-j\frac{8}{9}\gamma|\vec{E}|^2\vec{E}=0\]
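A minimal split-step Fourier sketch for the Manakov equation is given below. It uses a symmetric splitting with fixed step size and ignores fiber loss, as Equation 6.20 does. The function name `manakov_ssfm` and all pulse and fiber parameters are assumptions chosen for illustration; a production simulator would add loss, amplification, and adaptive step sizes.

```python
import numpy as np

def manakov_ssfm(field, dt, length, n_steps, beta2, gamma):
    """Propagate a 2 x N field [E_x; E_y] according to Equation 6.20.

    Symmetric split step: half a dispersion step, a full nonlinear step,
    then another half dispersion step.
    """
    n = field.shape[1]
    omega = 2*np.pi*np.fft.fftfreq(n, d=dt)
    dz = length / n_steps
    half_disp = np.exp(0.5j * beta2 * omega**2 * (dz/2))
    out = field.astype(complex)
    for _ in range(n_steps):
        out = np.fft.ifft(np.fft.fft(out, axis=1) * half_disp, axis=1)
        power = np.sum(np.abs(out)**2, axis=0)          # |E_x|^2 + |E_y|^2
        out = out * np.exp(1j * (8/9) * gamma * power * dz)
        out = np.fft.ifft(np.fft.fft(out, axis=1) * half_disp, axis=1)
    return out

# Assumed example: a Gaussian pulse (10 ps half-width at 1/e intensity, 5 mW
# peak) split equally between the polarizations, beta2 = -21.7 ps^2/km,
# gamma = 1.3 /W/km, 20 km of fiber.
dt = 0.5e-12                                             # s
t = np.arange(-512, 512) * dt
pulse = np.sqrt(5e-3) * np.exp(-t**2 / (2*(10e-12)**2))  # sqrt(W)
e_in = np.vstack([pulse, pulse]) / np.sqrt(2)
e_out = manakov_ssfm(e_in, dt=dt, length=20e3, n_steps=200,
                     beta2=-21.7e-27, gamma=1.3e-3)
energy_in = np.sum(np.abs(e_in)**2)
energy_out = np.sum(np.abs(e_out)**2)
```

Because both sub-steps only multiply by unit-modulus factors, the pulse energy is conserved, which is a convenient sanity check for an implementation.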

Assume that we have a WDM system with two channels, channels \(a\) and \(b\), and the two channels have no overlapping spectra. By neglecting four-wave mixing (FWM) between the two channels, one can separate the equations for channels \(a\) and \(b\) from the Manakov equation as

\[\tag{6.21}\frac{\partial\vec{E}_a}{\partial{z}}+\frac{j}{2}\beta_2\frac{\partial^2\vec{E}_a}{\partial{t^2}}-j\frac{8}{9}\gamma(|\vec{E}_a|^2\vec{E}_a+|\vec{E}_b|^2\vec{E}_a+\vec{E}_b^+\vec{E}_a\vec{E}_b)=0\]

\[\tag{6.22}\frac{\partial\vec{E}_b}{\partial{z}}+\frac{j}{2}\beta_2\frac{\partial^2\vec{E}_b}{\partial{t^2}}-j\frac{8}{9}\gamma(|\vec{E}_b|^2\vec{E}_b+|\vec{E}_a|^2\vec{E}_b+\vec{E}_a^+\vec{E}_b\vec{E}_a)=0\]

Within the parentheses of the two equations, the first term is self-phase modulation (SPM), the second term is polarization-independent cross-phase modulation (XPM), and the third term is polarization-dependent XPM.

SPM does not depend on the polarization, but XPM is polarization dependent. The third nonlinear term is the same as the second nonlinear term when the two channels have the same polarization and it is zero when they are orthogonally polarized, which means that the XPM between two channels with parallel polarizations is two times that with orthogonal polarizations.

The last two terms in each of Equations 6.21 and 6.22 show that XPM between channels also causes XPolM. Due to XPolM, the SOP of one channel can be changed by other channels.
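The factor of two between parallel and orthogonal XPM can be verified by evaluating the second and third terms of Equation 6.21 directly. The sketch below uses unit-power Jones vectors as an illustrative assumption; the function name `xpm_terms` is hypothetical.

```python
import numpy as np

def xpm_terms(e_a, e_b):
    """Second and third nonlinear terms acting on channel a in Equation 6.21."""
    independent = np.vdot(e_b, e_b).real * e_a    # |E_b|^2 E_a
    dependent = np.vdot(e_b, e_a) * e_b           # (E_b^+ E_a) E_b
    return independent, dependent

e_a = np.array([1.0, 0.0], dtype=complex)         # channel a polarized along x
e_b_parallel = np.array([1.0, 0.0], dtype=complex)
e_b_orthogonal = np.array([0.0, 1.0], dtype=complex)

ind_par, dep_par = xpm_terms(e_a, e_b_parallel)
ind_orth, dep_orth = xpm_terms(e_a, e_b_orthogonal)
xpm_parallel = ind_par + dep_par                  # total XPM term, parallel SOPs
xpm_orthogonal = ind_orth + dep_orth              # total XPM term, orthogonal SOPs
```

Note that `np.vdot` conjugates its first argument, which matches the definition of \(\vec{E}^+\).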

Neglecting CD, one can derive the SOP evolution of channel \(a\) induced by channel \(b\) due to XPolM as

\[\tag{6.23}\frac{d\vec{S}_a}{dz}=\frac{8}{9}\gamma(\vec{S}_a\times\vec{S}_b)=\frac{8}{9}\gamma(\vec{S}_a\times\vec{S}_\text{sum})\]

where \(\vec{S}_a=(S_{a1},S_{a2},S_{a3})\) and \(\vec{S}_b=(S_{b1},S_{b2},S_{b3})\) are Stokes vectors for channels \(a\) and \(b\), respectively, and \(\vec{S}_\text{sum}=\vec{S}_a+\vec{S}_b\) is the sum of the two Stokes vectors. The evolution of \(\vec{S}_b\) can be obtained by exchanging the subscripts in Equation 6.23.
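Equation 6.23 describes a precession of both Stokes vectors about their sum. The sketch below integrates the coupled equations for channels \(a\) and \(b\) with a fixed-step fourth-order Runge-Kutta scheme. The integration scheme, the function names, and all numerical values (2 mW per channel, \(\gamma=1.3\) /W/km, 100 km, no loss) are assumptions made for illustration.

```python
import numpy as np

def xpolm_rhs(state, coeff):
    """Right-hand side of Equation 6.23 for both channels stacked as [S_a, S_b]."""
    s_a, s_b = state
    return coeff * np.array([np.cross(s_a, s_b), np.cross(s_b, s_a)])

def propagate(s_a, s_b, gamma, length, n_steps):
    """Integrate the coupled Stokes equations with a fixed-step RK4 scheme."""
    coeff = 8.0 / 9.0 * gamma
    state = np.array([s_a, s_b], dtype=float)
    h = length / n_steps
    for _ in range(n_steps):
        k1 = xpolm_rhs(state, coeff)
        k2 = xpolm_rhs(state + 0.5*h*k1, coeff)
        k3 = xpolm_rhs(state + 0.5*h*k2, coeff)
        k4 = xpolm_rhs(state + h*k3, coeff)
        state = state + h/6.0*(k1 + 2*k2 + 2*k3 + k4)
    return state

# Assumed initial SOPs: channel a along s1, channel b along s2 (powers in W).
s_a0 = 2e-3 * np.array([1.0, 0.0, 0.0])
s_b0 = 2e-3 * np.array([0.0, 1.0, 0.0])
s_a1, s_b1 = propagate(s_a0, s_b0, gamma=1.3e-3, length=100e3, n_steps=1000)
```

The sum \(\vec{S}_\text{sum}\) and the power of each channel stay constant while the individual SOPs rotate, as expected from the cross-product form of Equation 6.23.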

To model nonlinear polarization effects in fiber-optic communication systems, one can directly solve the CNLSE given in Equation 6.18 with the split-step Fourier method.

To increase the speed of the simulations, the CNLSE can be solved with the approach proposed by Marcuse et al. by integrating with steps small enough to follow the detailed polarization evolution and using larger steps for CD and nonlinear effects.

The other widely used method is the coarse-step method, which assumes that within each step the polarization does not change and the signal propagation is described by the following CNLSE

\[\tag{6.24}\frac{\partial{E_x}}{\partial{z}}-\frac{1}{2}\Delta\beta_1\frac{\partial{E_x}}{\partial{t}}+\frac{j}{2}\beta_2\frac{\partial^2E_x}{\partial{t^2}}=j\gamma\left(|E_x|^2+\frac{2}{3}|E_y|^2\right)E_x\]

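\[\tag{6.25}\frac{\partial{E_y}}{\partial{z}}+\frac{1}{2}\Delta\beta_1\frac{\partial{E_y}}{\partial{t}}+\frac{j}{2}\beta_2\frac{\partial^2E_y}{\partial{t^2}}=j\gamma\left(|E_y|^2+\frac{2}{3}|E_x|^2\right)E_y\]

where \(\Delta\beta_1\) is the difference between the inverse group velocities of the two local polarization axes, so that the delay terms of the two components have opposite signs.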

At the interval of the fiber coupling length, which is typically one or a few step sizes, the polarization of the field is randomly rotated to generate complete mixing over the Poincaré sphere. Two scattering matrices have been used to rotate signal polarizations.

One scattering matrix is

\[\tag{6.26}\begin{pmatrix}\cos\alpha\exp(j\phi)&\sin\alpha\exp(j\phi)\\-\sin\alpha&\cos\alpha\end{pmatrix}\]

and the other one is

\[\tag{6.27}\begin{pmatrix}\cos\alpha&\sin\alpha\exp(j\phi)\\-\sin\alpha\exp(-j\phi)&\cos\alpha\end{pmatrix}\]

where \(\cos2\alpha\) and \(\phi\) are randomly chosen from uniform distributions in Equation 6.26, and \(\alpha\) and \(\phi\) are randomly chosen from uniform distributions in Equation 6.27.

As shown by Marcuse et al., although neither matrix introduces a uniform scattering on the Poincaré sphere, concatenating several of these matrices does lead to rapid uniform mixing on the Poincaré sphere.
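The two scattering matrices can be generated as in the sketch below. The text does not state the ranges of the uniform distributions, so the ranges used here (\(\cos2\alpha\) in \([-1,1]\), angles in \([0,2\pi)\)) are assumptions, as are the function names and the number of concatenated scatterers.

```python
import numpy as np

def scatter_a(rng):
    """Matrix of Equation 6.26 with cos(2*alpha) and phi drawn uniformly."""
    alpha = 0.5 * np.arccos(rng.uniform(-1.0, 1.0))
    phi = rng.uniform(0.0, 2*np.pi)
    return np.array([[np.cos(alpha)*np.exp(1j*phi), np.sin(alpha)*np.exp(1j*phi)],
                     [-np.sin(alpha), np.cos(alpha)]])

def scatter_b(rng):
    """Matrix of Equation 6.27 with alpha and phi drawn uniformly."""
    alpha = rng.uniform(0.0, 2*np.pi)
    phi = rng.uniform(0.0, 2*np.pi)
    return np.array([[np.cos(alpha), np.sin(alpha)*np.exp(1j*phi)],
                     [-np.sin(alpha)*np.exp(-1j*phi), np.cos(alpha)]])

def stokes(jones):
    """Three-component Stokes vector of a unit Jones vector (Equation 6.7)."""
    sx, sy = jones
    cross = sx * np.conj(sy)
    return np.array([abs(sx)**2 - abs(sy)**2, 2*cross.real, -2*cross.imag])

rng = np.random.default_rng(1)
start = np.array([1.0, 0.0], dtype=complex)
samples = []
for _trial in range(4000):
    jones = start
    for _step in range(5):                    # five concatenated scatterers
        jones = scatter_b(rng) @ jones
    samples.append(stokes(jones))
mean_stokes = np.mean(samples, axis=0)        # close to zero for uniform coverage
```

A mean Stokes vector near zero is a necessary, though not sufficient, indication of uniform coverage of the Poincaré sphere; a histogram of the individual components would give a more complete picture.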

## 5. Coherent Optical Communication Systems and Signal Equalization

### 5.1. Coherent Optical Communication Systems

The block diagram of a digital coherent optical communication system is illustrated in Figure 6.6. For simplicity, only one channel of a WDM system is shown in the figure. Note that DSP is used extensively in such a system, not only in the receiver, but in the transmitter as well.

Typically, both polarization and phase of lightwave are used to carry information in a coherent optical communication system to increase spectral efficiency and system capacity.

In the transmitter, a continuous wave (CW) from a narrow-linewidth laser such as an external cavity laser (ECL) is split into two parts, one for each polarization, and each part is modulated with an in-phase/quadrature (I/Q) modulator by electrical signals from digital-to-analog converters (DACs) after DSP.

Thanks to the DACs and DSP, many functions can be performed in the transmitter, for example, predistortion and the generation of signals with specific waveforms and spectral shapes for purposes such as improving nonlinear tolerance and/or spectral efficiency.

During the propagation in fibers, signal polarizations are not maintained but randomly rotated. At the polarization and phase diversity receiver, the received signal is split by a polarization beam splitter (PBS).

Each polarization of the signal after the PBS is combined with an LO in a 90\(^\circ\) hybrid. The four tributaries (\(x\) and \(y\) polarizations, I/Q branches) of the combined signal and LO after the hybrids are detected by four detectors (or four pairs of balanced detectors).

After antialias filtering, the signals are sampled and converted to digital form by ADCs. The signals are then processed by DSP to recover the transmitted data, including retiming and resampling, CD compensation (nonlinearity compensation if allowed by the complexity of the DSP), clock recovery, polarization demultiplexing, PMD compensation, carrier frequency and phase estimation, symbol detection, and forward-error correction (FEC).

One of the key distinguishing features of a digital coherent optical communication system is its ability to compensate for most transmission impairments in the electrical domain with DSP.

As CD can be completely compensated with DSP in coherent receivers, it is not necessary to have optical dispersion compensators in greenfield systems. Most existing systems, however, have optical dispersion compensators. It has been shown that fiber nonlinear effects in coherent optical communication systems with optical dispersion compensators differ significantly from those in systems without them.

### 5.2. Signal Equalization

In principle, distortion equalization in a coherent receiver can be realized in a single equalizer, but it is generally beneficial to split it into one static equalizer and one dynamic equalizer. As CD usually changes very slowly, a static equalizer is typically used for CD; it requires long filters, but its coefficients remain virtually fixed. A dynamic equalizer typically requires much shorter adaptive filters and handles time-varying effects, performing polarization demultiplexing and PMD compensation.

According to Equation 6.18, the effect of CD on an optical signal propagating in fibers can be modeled as

\[\tag{6.28}\frac{\partial\vec{E}}{\partial{z}}=-\frac{j}{2}\beta_2\frac{\partial^2\vec{E}}{\partial{t^2}}\]

The conventional approach is to solve this equation in the frequency domain. By taking Fourier transform of Equation 6.28, one can obtain the frequency-domain transfer function of CD as

\[\tag{6.29}F_\text{CD}(z,\omega)=\exp\left(\frac{j}{2}\beta_2\omega^2z\right)\]

To compensate for CD, the CD equalizer in a coherent receiver has to provide the inverse of the transfer function in Equation 6.29.

There are two approaches to realize CD compensation in a coherent receiver. One is to use finite impulse response (FIR) filters. The CD-induced impulse response can be obtained by inverse Fourier transform of Equation 6.29, which is used to set the tap coefficients of the FIR filters. The other approach is to use frequency-domain equalizers.

For a short impulse response, the time-domain FIR filter approach is preferred, but for systems with large CD, the frequency-domain equalizers are more efficient.
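The frequency-domain approach can be sketched in a few lines. Since the transfer function of Equation 6.29 has unit magnitude, its inverse is simply its complex conjugate. The sketch below processes one block with circular convolution; a real-time implementation would typically use overlap-save or overlap-add processing, which is not shown. The symbol rate, fiber length, and \(\beta_2\) are assumed values.

```python
import numpy as np

def cd_transfer(n, dt, beta2, z):
    """Sampled CD transfer function of Equation 6.29 on the FFT frequency grid."""
    omega = 2*np.pi*np.fft.fftfreq(n, d=dt)
    return np.exp(0.5j * beta2 * omega**2 * z)

def apply_filter(signal, transfer):
    """Multiply the spectrum by a transfer function (circular convolution)."""
    return np.fft.ifft(np.fft.fft(signal) * transfer)

# Assumed values: 28 Gbaud QPSK at 2 samples/symbol, 1000 km of fiber with
# beta2 = -21.7 ps^2/km.
rng = np.random.default_rng(2)
symbols = (rng.choice([-1, 1], 512) + 1j*rng.choice([-1, 1], 512)) / np.sqrt(2)
tx = np.repeat(symbols, 2)                    # crude rectangular pulses
dt = 1.0 / (2 * 28e9)
h_cd = cd_transfer(tx.size, dt, beta2=-21.7e-27, z=1000e3)

rx = apply_filter(tx, h_cd)                   # dispersed signal
equalized = apply_filter(rx, np.conj(h_cd))   # inverse filter = conjugate
```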

Due to the existence of two polarizations, the dynamic equalizer has a butterfly structure, which consists of four subequalizers. Each subequalizer can be an FIR filter, as shown in Figure 6.7, where \(\tau\) is the sampling interval, which is half the symbol period for two times oversampling.

The butterfly equalizer is mainly used for polarization demultiplexing, PMD compensation, and PDL mitigation. Apart from these, the butterfly equalizer also mitigates ISI caused by other factors such as residual CD after the CD compensator, filtering, or nonlinearities in a system.

The butterfly equalizer performs multi-input-multi-output (MIMO) processing. The output of the equalizer is given by

\[\tag{6.30a}x'=\vec{h}_{xx}\cdot\vec{x}^T+\vec{h}_{xy}\cdot\vec{y}^T\]

\[\tag{6.30b}y'=\vec{h}_{yx}\cdot\vec{x}^T+\vec{h}_{yy}\cdot\vec{y}^T\]

where \(\vec{h}_{xx}\), \(\vec{h}_{xy}\), \(\vec{h}_{yx}\), and \(\vec{h}_{yy}\) are coefficient vectors for the four adaptive FIR filters, each of which has a length \(N\) taps, as shown in Figure 6.7(b).

\(\vec{h}_{xx}=(h_{xx}^1,h_{xx}^2,\ldots,h_{xx}^N)\) is an FIR filter coefficient vector, and \(\vec{x}=(x^1,x^2,\ldots,x^N)\) and \(\vec{y}=(y^1,y^2,\ldots,y^N)\) are input signal vectors. Superscript \(T\) means transpose. The equalizer is dynamically adjusted according to some criteria, with the goal to achieve the polarization demultiplexing and minimum ISI for the output signal.

Many algorithms can be used to control the butterfly equalizer, including blind equalization algorithms and data-aided algorithms. One widely used blind equalization algorithm is the constant modulus algorithm (CMA) proposed by Godard.

The CMA aims to minimize the mean square errors of the output signal deviation from a unit amplitude \(\epsilon_{x'}^2=(1-|x'|^2)^2\) and \(\epsilon_{y'}^2=(1-|y'|^2)^2\). Using stochastic gradient algorithm, the butterfly equalizer tap coefficients are recursively updated in the following way:

\[\tag{6.31a}\vec{h}_{xx}(n+1)=\vec{h}_{xx}(n)-\mu\frac{\partial(\epsilon_{x'}^2)}{\partial\vec{h}_{xx}}\]

\[\tag{6.31b}\vec{h}_{xy}(n+1)=\vec{h}_{xy}(n)-\mu\frac{\partial(\epsilon_{x'}^2)}{\partial\vec{h}_{xy}}\]

\[\tag{6.31c}\vec{h}_{yx}(n+1)=\vec{h}_{yx}(n)-\mu\frac{\partial(\epsilon_{y'}^2)}{\partial\vec{h}_{yx}}\]

\[\tag{6.31d}\vec{h}_{yy}(n+1)=\vec{h}_{yy}(n)-\mu\frac{\partial(\epsilon_{y'}^2)}{\partial\vec{h}_{yy}}\]

where \(\mu\) is a convergence parameter. After some algebraic manipulation, one can obtain the following update rule for CMA:

\[\tag{6.32a}\vec{h}_{xx}(n+1)=\vec{h}_{xx}(n)+\mu\epsilon_{x'}\vec{x}^*x'\]

\[\tag{6.32b}\vec{h}_{xy}(n+1)=\vec{h}_{xy}(n)+\mu\epsilon_{x'}\vec{y}^*x'\]

\[\tag{6.32c}\vec{h}_{yx}(n+1)=\vec{h}_{yx}(n)+\mu\epsilon_{y'}\vec{x}^*y'\]

\[\tag{6.32d}\vec{h}_{yy}(n+1)=\vec{h}_{yy}(n)+\mu\epsilon_{y'}\vec{y}^*y'\]

where the superscript * means complex conjugate.
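The update rule of Equations 6.32a to 6.32d can be sketched as follows for a noiseless PDM-QPSK signal that has passed through a fixed polarization rotation. The setup is deliberately simplified and entirely assumed: one sample per symbol, three taps, center-spike initialization, a step size of 0.01, and a channel with no PMD, PDL, or noise.

```python
import numpy as np

def cma_butterfly(x_in, y_in, n_taps=3, mu=0.01):
    """Run the CMA updates of Equations 6.32a-6.32d over a block of samples.

    Returns the equalized outputs and the per-sample error 1 - |x'|^2.
    """
    h_xx = np.zeros(n_taps, dtype=complex)
    h_xx[n_taps // 2] = 1.0                   # center-spike initialization
    h_yy = h_xx.copy()
    h_xy = np.zeros(n_taps, dtype=complex)
    h_yx = np.zeros(n_taps, dtype=complex)
    n_out = len(x_in) - n_taps + 1
    x_out = np.zeros(n_out, dtype=complex)
    y_out = np.zeros(n_out, dtype=complex)
    err_x = np.zeros(n_out)
    for k in range(n_out):
        x = x_in[k:k + n_taps]
        y = y_in[k:k + n_taps]
        x_out[k] = h_xx @ x + h_xy @ y        # Equation 6.30a
        y_out[k] = h_yx @ x + h_yy @ y        # Equation 6.30b
        e_x = 1.0 - abs(x_out[k])**2
        e_y = 1.0 - abs(y_out[k])**2
        h_xx = h_xx + mu * e_x * np.conj(x) * x_out[k]
        h_xy = h_xy + mu * e_x * np.conj(y) * x_out[k]
        h_yx = h_yx + mu * e_y * np.conj(x) * y_out[k]
        h_yy = h_yy + mu * e_y * np.conj(y) * y_out[k]
        err_x[k] = e_x
    return x_out, y_out, err_x

def random_qpsk(rng, n):
    """Unit-modulus QPSK symbols."""
    return (rng.choice([-1, 1], n) + 1j*rng.choice([-1, 1], n)) / np.sqrt(2)

rng = np.random.default_rng(3)
a = random_qpsk(rng, 8000)                    # x-polarization tributary
b = random_qpsk(rng, 8000)                    # y-polarization tributary
theta, phi = 0.5, 0.7                         # assumed channel rotation (Eq. 6.16)
x_rx = np.cos(theta)*a + np.sin(theta)*np.exp(1j*phi)*b
y_rx = -np.sin(theta)*np.exp(-1j*phi)*a + np.cos(theta)*b

x_eq, y_eq, err = cma_butterfly(x_rx, y_rx)
initial_error = np.mean(np.abs(err[:200]))
final_error = np.mean(np.abs(err[-200:]))
```

The error shrinks as the equalizer learns the inverse rotation. The sketch does not guard against the known CMA singularity, in which both outputs converge to the same polarization tributary, and it does not remove the residual phase rotation, which is left to carrier phase estimation.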

CMA works well for quadrature-phase-shift-keying (QPSK) signals, and it can also be used for initial convergence for higher-order modulation formats such as 16-ary quadrature-amplitude modulation (16QAM) and 64QAM.

For these higher-order modulation formats, after initial convergence, some other control algorithms such as a multimodulus algorithm (MMA) or a decision-directed least mean square (DD-LMS) algorithm can be used to finely tune the equalizer for further performance improvement.

For MMA, the butterfly equalizer coefficients are updated with the same rule as Equations 6.32a–6.32d; only the error functions differ: \(\epsilon_{x'}=R_{ix}-|x'|^2\) and \(\epsilon_{y'}=R_{iy}-|y'|^2\), where \(R_{ix}\) and \(R_{iy}\) are the moduli closest to \(|x'|^2\) and \(|y'|^2\), respectively. For 16QAM, there are three moduli. For DD-LMS, \(\epsilon_{x'}=d_x-x'\) and \(\epsilon_{y'}=d_y-y'\), with \(d_x\) and \(d_y\) being the constellation symbols closest to \(x'\) and \(y'\), respectively. Because this error is already complex valued, the standard LMS update multiplies it by the conjugated input vector only, without the additional factor \(x'\) or \(y'\) that appears in Equations 6.32a–6.32d.

## 6. PMD and PDL Impairments in Coherent Systems

In principle, the dynamic butterfly equalizer in a digital coherent receiver can generate a Jones matrix inverse to that of the optical channel and all signal distortions induced by PMD and PDL can be completely compensated.

However, PMD and PDL can still degrade the performance of a digital coherent optical communication system for the following reasons. First, in a real system, the complexity of the butterfly equalizer is limited and it is challenging to implement a butterfly equalizer with a large number of taps; second, the butterfly equalizer is not only used for PMD compensation but mitigates any other ISI as well, irrespective of its origin; third, PDL causes signal power and OSNR fluctuations, which result in SNR variations for the received electrical signal, and degraded SNR cannot be brought back in the receiver.

Therefore, it is important for a system designer to understand the PMD and PDL impairments in a digital coherent system and know the amount of PMD and PDL that a system can tolerate. In this section, the PMD- and PDL-induced impairments in a 112-Gb/s PDM-QPSK coherent system are presented. Note that the results can be extended to other modulation formats.

### 6.1. PMD Impairment

For single-polarization (SP) signals, PMD penalties are caused by ISI, but for PDM signals, PMD penalties are mainly caused by crosstalk between two polarization tributaries.

It has been shown that, in a direct-detection system in the absence of PMD compensation, PMD impairments can be well evaluated with only first-order PMD, but in a system with PMD compensation, higher-order PMD has to be considered.

As a coherent receiver in general has a PMD equalizer, to have a clear picture of PMD impairments in a coherent optical communication system, not only first-order, but also second-order and all-order PMDs have to be considered.

Here, a 112-Gb/s PDM-QPSK coherent optical communication system is used as an example to show the PMD impairments. As shown in Figure 6.8, a transmitter generates a 112-Gb/s non-return-to-zero (NRZ) PDM-QPSK signal. A polarization controller is used to adjust the input SOP of the signal to a PMD emulator (PMDE), which can be a first-, second-, or all-order PMDE.

ASE noise is loaded to a coherent receiver to generate required OSNR. In the coherent receiver, two times oversampling is used and the butterfly equalizer is adjusted with CMA. Bit error ratios (BERs) are calculated using a direct error counting method.

#### 6.1.1. First- and Second-Order PMD Impairments

To assess first- and second-order PMD-induced penalties in the coherent system, the PMDE is built in the following way: First-order PMD is in \(S_1\) direction and depolarization is in \(S_2\) direction in Stokes space. The Jones matrix including first- and second-order PMD can be expressed as

\[\tag{6.33}U=\begin{pmatrix}\exp\left(-j\frac{\phi}{2}\right)&-\frac{\hat{p}_\omega\Delta\omega}{2}\sin\left(\frac{\phi}{2}\right)\\\frac{\hat{p}_\omega\Delta\omega}{2}\sin\left(\frac{\phi}{2}\right)&\exp\left(j\frac{\phi}{2}\right)\end{pmatrix}\]

where \(\phi=\Delta\tau\Delta\omega+\Delta\tau_\omega\Delta\omega^2/2\) includes the first-order PMD and PCD terms, and \(\hat{p}_\omega\) is the depolarization term. Equation 6.33 shows that depolarization causes linear changes of the Jones matrix with frequency and PCD induces phase changes similar to CD.
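As a rough numerical illustration of Equation 6.33, the sketch below builds the matrix at a given frequency offset. Several choices here are assumptions made only for the example: the function name `pmd_jones`, the units (rad/ps, ps, ps\(^2\)), and the treatment of the depolarization term \(\hat{p}_\omega\) as a real scalar.

```python
import numpy as np

def pmd_jones(dw, dgd, pcd, p_w):
    """Jones matrix of Equation 6.33 at angular-frequency offset dw.

    dw  : frequency offset in rad/ps
    dgd : first-order PMD (DGD) in ps
    pcd : polarization-dependent chromatic dispersion in ps^2
    p_w : depolarization term, treated as a real scalar in ps (assumption)
    """
    phi = dgd * dw + 0.5 * pcd * dw ** 2
    off = 0.5 * p_w * dw * np.sin(phi / 2)
    return np.array([[np.exp(-1j * phi / 2), -off],
                     [off, np.exp(1j * phi / 2)]])
```

With only PCD present, the diagonal element is \(\exp(-j\Delta\tau_\omega\Delta\omega^2/4)\). If CD is written as a phase \(\exp(-j\beta_2L\Delta\omega^2/2)\) (a sign convention assumed here), this corresponds to an accumulated dispersion of half the PCD value.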

Figure 6.9 shows first-order PMD-induced impairments in the 112-Gb/s PDM-QPSK coherent system, which depicts the dependence of BER on DGD with the butterfly equalizer in the receiver having different numbers of taps. In the figure, the received OSNR is fixed at 14.9 dB, which is 0.5-dB higher than the required OSNR at BER=10\(^{-3}\), and the input SOP of the signal (one polarization of the PDM signal) is set at 45\(^\circ\) to the PSP of the PMDE.

In a direct-detection system using SP signals, the PMD penalties are caused by pulse-broadening-induced ISI and typically gradually increase with the increasing DGD. Figure 6.9 shows that for a coherent system using PDM signals, PMD penalties have some different features.

There is a DGD threshold for the impairments. When DGD is less than the threshold, there is little penalty from PMD, but when DGD is larger than the threshold, BER increases sharply. The reason is that the impairments are mainly caused by crosstalk between the two polarizations, and once DGD is larger than the threshold that the equalizer can handle, PMD-induced crosstalk rapidly degrades the system performance.

Note that the “threshold” is dependent on the length of the butterfly equalizer and the signal launch SOP. A larger “threshold” is anticipated for an equalizer with a larger number of taps, as shown in Figure 6.9, and for a signal whose SOP is not aligned in the worst-case direction. This indicates that, unlike in a direct-detection system, a larger OSNR margin may not significantly increase the system tolerance to PMD in a PDM coherent system.

The effect of PCD and depolarization on the performance of the coherent system is given in Figure 6.10. In this figure, a 7-tap butterfly equalizer is used. The penalty caused by PCD does not depend on the input SOP of the signal and is similar to that caused by CD of half the value, as indicated in Equation 6.33.

For example, the penalty induced by 1500-ps\(^2\) PCD is similar to that caused by 750-ps\(^2\) CD. Figure 6.10(b) shows that depolarization has little impact on the coherent system. This can be explained by Equation 6.33.

Depolarization generates a linear change of the Jones matrix with frequency, and FIR filter equalizers can compensate for linear effects efficiently, independent of the filter length. It is also found that the signal SOP has little impact on the results in Figure 6.10(b).

As the average PCD is only about 1/9 the average second-order PMD, Figure 6.10 indicates that using PCD to emulate second-order PMD effects could significantly overestimate PMD impairments.

#### 6.1.2. PMD-Induced Outage Probabilities

As PMD is a stochastic phenomenon, PMD penalties at specific PMD values and launch SOPs are not sufficient to quantify PMD impairments in an optical communication system. Instead, system designers and engineers often use PMD-induced outage probabilities (OPs), that is, the probabilities that a system is out of service because of PMD.

When designing an optical communications system, a certain OSNR margin, which is an additional OSNR on top of the required OSNR for the system to work properly, is allocated to PMD. When the PMD-induced penalty is larger than the margin, an outage occurs.

PMD tolerance is typically quantified at an OP of 10\(^{-5}\): if the actual PMD value in a system is less than the PMD value that the system can tolerate, the PMD-induced OP will not exceed 10\(^{-5}\).
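To get a feel for what an OP of 10\(^{-5}\) means, the sketch below uses a deliberately simplified first-order picture: it assumes the instantaneous DGD is Maxwellian distributed and that an outage occurs exactly when the DGD exceeds a fixed threshold. Both the hard-threshold outage rule and the function names are assumptions for illustration; a first-order-only model of this kind is not a substitute for an all-order PMDE evaluation.

```python
import math

def maxwellian_tail(x, mean):
    """P(DGD > x) for a Maxwellian DGD with the given mean (same units)."""
    a = mean * math.sqrt(math.pi / 8.0)   # Maxwellian scale parameter
    u = x / a
    return (math.erfc(u / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * u * math.exp(-0.5 * u * u))

def max_mean_dgd(threshold, op_target=1e-5):
    """Largest mean DGD whose exceedance probability equals op_target.

    Found by bisection; relies on the tail growing with the mean DGD.
    """
    lo, hi = 1e-6 * threshold, threshold
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if maxwellian_tail(threshold, mid) > op_target:
            hi = mid
        else:
            lo = mid
    return lo
```

Under these assumptions, the mean DGD has to stay a little more than a factor of three below the DGD threshold for the exceedance probability to remain at 10\(^{-5}\).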

To accurately evaluate PMD-induced OPs, one needs to choose an appropriate PMDE. Figure 6.11 depicts the PMD-induced OPs in the 112-Gb/s NRZ-PDM-QPSK coherent system with three different PMDEs: a first-order PMDE, a PMDE including first-order PMD and PCD, and an all-order PMDE. The all-order PMDE is the concatenation of 100 waveplates, and the first- and second-order PMDEs are built based on Equation 6.33.

In order to have a fair comparison among the three PMDEs, first a Monte Carlo simulation for the all-order PMDE is performed, and for each realization of PMD, the instantaneous PMD values including both first- and second-order PMD are obtained. Then, the parameters of the first- and second-order PMDEs based on Equation 6.33 are adapted such that they generate the same amount of first- and second-order PMD values and same PSPs.

An OSNR margin of 0.5 dB is set at BER of 10\(^{-3}\) in Figure 6.11. The figure shows that only considering first-order PMD underestimates PMD impairments, and the PMD tolerance will be overestimated by 10–20%. As stated earlier, modeling second-order PMD as PCD significantly overestimates PMD impairments. If one considers first-order PMD and takes second-order PMD as PCD, the tolerable PMD will be underestimated by about 30%.

One way to increase the PMD tolerance of a coherent system is to increase the length of the butterfly equalizer, as indicated in Figure 6.12. Figure 6.12(a) plots the PMD-induced OPs in the 112-Gb/s NRZ-PDM-QPSK system with a butterfly equalizer of three different numbers of taps in the coherent receiver. It shows that increasing the butterfly equalizer complexity increases the PMD tolerance of the system, as expected.

In Figure 6.12(b), the dependence of the tolerable PMD in the PDM-QPSK system, measured in RMS DGD normalized with the symbol period, on the length of the butterfly equalizer filter is given. It shows that the tolerable PMD increases linearly with the number of butterfly equalizer taps. With a 12-tap butterfly equalizer, the system can tolerate PMD larger than one symbol period.

#### 6.1.3. Joint Optimization of Static and Dynamic Equalizers

The preceding discussion of PMD impairments assumes that the dynamic butterfly equalizer in a coherent receiver is solely used for PMD compensation, but in fact the equalizer is also used to counteract other ISI impairments such as residual CD, tight optical filtering, transmitter and receiver hardware bandwidth limitations, and fiber nonlinearities.

This is because most control algorithms used to adjust the equalizer filter coefficients, such as CMA and DD-LMS, do not differentiate between physical sources of ISI. Therefore, any ISI impairments in the system consume the resources of the butterfly equalizer and can reduce the capability of the butterfly equalizer to compensate PMD.

Figure 6.13 gives an example of PMD tolerance degradation caused by residual CD, where the residual CD is the CD after the CD equalizer due to inaccurate CD estimation, resulting in a slight CD compensation error.

In the figure, an all-order PMDE composed of 100 concatenated waveplates is used, and a 0.5-dB OSNR margin at a BER of 10\(^{-3}\) is allocated to PMD. It shows that the PMD tolerance of the coherent system can be significantly reduced if CD in the system is not accurately compensated by the static CD equalizer in the receiver.

For the 112-Gb/s PDM-QPSK coherent system with a 7-tap butterfly equalizer in the receiver, a 300-ps/nm residual CD can more than halve the PMD tolerance of the system, from about 22 ps to less than 10 ps at an OP of 10\(^{-5}\).

One way to increase the PMD tolerance of a coherent receiver in the presence of other ISI impairments is to increase the length of the butterfly equalizer, but this increases the complexity and power consumption of the DSP.

Another technique is to jointly optimize the CD and butterfly equalizers. The block diagram of the technique is shown in Figure 6.14. As most ISI impairments such as residual CD and filtering are the same for \(x\) and \(y\) polarizations, one can monitor the common factor of the four subequalizers of the butterfly equalizer and move it into the CD equalizer.

By doing this, the butterfly equalizer is only used to compensate for polarization-dependent impairments such as PMD and PDL, and the PMD compensation capability of the butterfly equalizer is almost fully restored. Since the CD equalizer is much larger than the butterfly equalizer, moving some functions of the butterfly equalizer to the CD equalizer has little impact on the ability of the CD equalizer to compensate for CD.

There are two methods to extract the common factor from the butterfly equalizer. The first method is to use the transfer function of any of the four subequalizers as the common factor.

For example, if \(H_{xx}\) is used for the common factor, one can write the transfer function of the butterfly equalizer as

\[\tag{6.34}H_\text{be}(f)=\begin{pmatrix}H_{xx}(f)&H_{xy}(f)\\H_{yx}(f)&H_{yy}(f)\end{pmatrix}=H_{xx}(f)\begin{pmatrix}1&H_{xy}(f)/H_{xx}(f)\\H_{yx}(f)/H_{xx}(f)&H_{yy}(f)/H_{xx}(f)\end{pmatrix}\]

The second method assumes that the butterfly equalizer is only used to compensate PMD, and thus its transfer function can be treated as a unitary matrix. One then has

\[\tag{6.35}H_\text{be}(f)=\begin{pmatrix}H_{xx}(f)&H_{xy}(f)\\H_{yx}(f)&H_{yy}(f)\end{pmatrix}=H_0(f)\begin{pmatrix}u(f)&v(f)\\-v^*(f)&u^*(f)\end{pmatrix}\]

where \(H_0(f)=\sqrt{H_{xx}(f)H_{yy}(f)-H_{xy}(f)H_{yx}(f)}\) is the common factor. In a real implementation, there is no need to perform the division. One only needs to update the CD equalizer with the common factor and reoptimize the butterfly equalizer with its control algorithms.
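The common factor of Equation 6.35 is the square root of the determinant of the butterfly transfer matrix, which can be sketched as below. The function name `common_factor` and the array-based interface (one complex frequency response per sub-equalizer, sampled on a common grid) are assumptions for the example.

```python
import numpy as np

def common_factor(Hxx, Hxy, Hyx, Hyy):
    """H0(f) of Equation 6.35: square root of det H_be(f) at each frequency.

    Each argument is a complex array holding one sub-equalizer response.
    np.sqrt returns the principal branch, so the sign of H0 may flip from
    one frequency bin to the next; phase unwrapping is not handled here.
    """
    Hxx, Hxy, Hyx, Hyy = (np.asarray(h, dtype=complex) for h in (Hxx, Hxy, Hyx, Hyy))
    det = Hxx * Hyy - Hxy * Hyx
    return np.sqrt(det)
```

For a matrix of the form \(c(f)\) times a unitary matrix with entries \(u\), \(v\), \(-v^*\), \(u^*\), the determinant equals \(c^2(f)\), so the function recovers the polarization-independent part up to a sign.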

### 6.2. PDL Impairment

#### 6.2.1. PDL Effects on SP Signals

For SP signals, PDL impairments manifest as the fluctuations of signal power and OSNR. Consider an SP signal with an arbitrary polarization state and two noise modes passing through a PDL element, as shown in Figure 6.15.

The PDL value of the PDL element in dB is \(\Gamma=10\log_{10}[(1+\alpha)/(1-\alpha)]\), and its lossy axis is the \(x\)-axis. The input signal and the noise fields can be described as

\[\tag{6.36a}\vec{s}_\text{in}(t)=\sqrt{P_\text{in}(t)}(\hat{x}\cos\theta+\hat{y}\sin\theta)\]

\[\tag{6.36b}\vec{n}_\text{in}^\text{par}(t)=\sqrt{N_\text{in}^\text{par}(t)}(\hat{x}\cos\theta+\hat{y}\sin\theta)\]

\[\tag{6.36c}\vec{n}_\text{in}^\text{ort}(t)=\sqrt{N_\text{in}^\text{ort}(t)}(\hat{x}\sin\theta-\hat{y}\cos\theta)\]

where \(\vec{s}_\text{in}(t)\), \(\vec{n}_\text{in}^\text{par}(t)\), and \(\vec{n}_\text{in}^\text{ort}(t)\) are the signal field, the noise field polarized parallel to the signal, and the noise field polarized orthogonally to the signal, respectively, \(\hat{x}\) and \(\hat{y}\) are unit vectors, and \(P_\text{in}(t)\), \(N_\text{in}^\text{par}(t)\), and \(N_\text{in}^\text{ort}(t)\) are powers of signal, parallel noise, and orthogonal noise, respectively.

After passing through the PDL element, the output signal and noise fields are

\[\tag{6.37a}\vec{s}_\text{out}(t)=\sqrt{P_\text{in}(t)}\sqrt{1-\alpha\cos2\theta}\;\hat{p}\]

\[\tag{6.37b}\vec{n}_\text{out}^\text{par}(t)=\sqrt{N_\text{in}^\text{par}(t)}\sqrt{1-\alpha\cos2\theta}\;\hat{p}\]

\[\tag{6.37c}\vec{n}_\text{out}^\text{ort}(t)=\sqrt{N_\text{in}^\text{ort}(t)}(a_p\hat{p}+a_q\hat{q})\]

where \(\hat{p}=(\hat{x}\sqrt{1-\alpha}\cos\theta+\hat{y}\sqrt{1+\alpha}\sin\theta)/\sqrt{1-\alpha\cos2\theta}\) is the polarization direction of the signal after the PDL element, and \(\hat{q}\) is the polarization direction that is orthogonal to the signal after the PDL element, \(a_p=-\alpha\sin2\theta/\sqrt{1-\alpha\cos2\theta}\) and \(a_q=\sqrt{1-\alpha^2}/\sqrt{1-\alpha\cos2\theta}\).

The signal and noise powers after the PDL element are

\[\tag{6.38a}P_\text{out}(t)=P_\text{in}(t)(1-\alpha\cos2\theta)\]

\[\tag{6.38b}N_\text{out}^\text{par}(t)=N_\text{in}^\text{par}(t)(1-\alpha\cos2\theta)+N_\text{in}^\text{ort}(t)\frac{\alpha^2\sin^22\theta}{1-\alpha\cos2\theta}\]

\[\tag{6.38c}N_\text{out}^\text{ort}(t)=N_\text{in}^\text{ort}(t)\frac{1-\alpha^2}{1-\alpha\cos2\theta}\]

This analysis shows that PDL tends to change the polarization state of the signal and causes the orthogonal noise to couple to the parallel noise.
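Equations 6.38a–6.38c are straightforward to evaluate numerically. The following sketch (with hypothetical function names) converts a PDL value in dB to \(\alpha\) by inverting the definition of \(\Gamma\) and then returns the output signal and noise powers.

```python
import math

def alpha_from_db(pdl_db):
    """Invert Gamma = 10*log10[(1 + alpha)/(1 - alpha)]."""
    r = 10.0 ** (pdl_db / 10.0)
    return (r - 1.0) / (r + 1.0)

def pdl_output_powers(p_in, n_par, n_ort, alpha, theta):
    """Signal and noise powers after the PDL element (Eqs. 6.38a-6.38c).

    theta is the input polarization angle in radians.
    """
    g = 1.0 - alpha * math.cos(2 * theta)
    p_out = p_in * g
    # Second term: orthogonal noise coupled into the signal polarization
    n_par_out = n_par * g + n_ort * (alpha * math.sin(2 * theta)) ** 2 / g
    n_ort_out = n_ort * (1.0 - alpha ** 2) / g
    return p_out, n_par_out, n_ort_out
```

A useful sanity check is that the originally orthogonal noise keeps a total power of \(N_\text{in}^\text{ort}(1+\alpha\cos2\theta)\) after the element, split between the parallel and orthogonal directions.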

#### 6.2.2. PDL Effects on PDM Signals

In addition to the power and OSNR variations, PDL has two further effects on a PDM signal. First, PDL induces nonorthogonality between the two originally orthogonal polarizations of a PDM signal, and second, PDL causes power imbalance between the two polarization components of a PDM signal, as illustrated in Figures 6.16(a) and (b), respectively.

Using \(A\) and \(B\) to denote the two components of a PDM signal, the normalized input signal can be expressed as

\[\tag{6.39}\begin{cases}\vec{A}_\text{in}=\sin\theta\hat{x}+\cos\theta\hat{y}\\\vec{B}_\text{in}=-\cos\theta\hat{x}+\sin\theta\hat{y}\end{cases}\]

where \(\theta\) is the angle between the SOP of \(\vec{A}\) and the axis of the PDL element in Jones space. After the PDL element with \(\hat{x}\) being lossy axis, the normalized output signal is

\[\tag{6.40}\begin{cases}\vec{A}_\text{out}=\sin\theta\sqrt{1-\alpha}\;\hat{x}+\cos\theta\sqrt{1+\alpha}\;\hat{y}\\\vec{B}_\text{out}=-\cos\theta\sqrt{1-\alpha}\;\hat{x}+\sin\theta\sqrt{1+\alpha}\;\hat{y}\end{cases}\]

where \(\alpha\) is related to the PDL value in dB as stated above. From Equation 6.40, one can derive the angle between the components of the output as

\[\tag{6.41}\gamma=\arctan\left(10^{-\frac{\Gamma}{20}}\tan\theta\right)+\arctan\left(10^{-\frac{\Gamma}{20}}\cot\theta\right)\]

The change of output angle with input SOP and PDL value is plotted in Figure 6.17. It shows that the output angle decreases with the increase of PDL value and changes with the launch SOP. The largest nonorthogonality (minimum \(\gamma\)) is reached when the launched SOP is at 45\(^\circ\).
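The behavior described here can be reproduced by evaluating Equation 6.41 directly and, as a cross-check, by computing the angle between the two output vectors of Equation 6.40. The function names are hypothetical, and the launch angle is restricted to the open interval between 0\(^\circ\) and 90\(^\circ\) to avoid the singular endpoints of the tangent and cotangent.

```python
import numpy as np

def output_angle_deg(pdl_db, theta_deg):
    """Equation 6.41: angle between the two output tributaries (degrees)."""
    k = 10.0 ** (-pdl_db / 20.0)
    t = np.radians(theta_deg)
    return np.degrees(np.arctan(k * np.tan(t)) + np.arctan(k / np.tan(t)))

def output_angle_from_vectors_deg(pdl_db, theta_deg):
    """Same angle from the output vectors of Equation 6.40 (scalar inputs)."""
    r = 10.0 ** (pdl_db / 10.0)
    alpha = (r - 1.0) / (r + 1.0)
    t = np.radians(theta_deg)
    A = np.array([np.sin(t) * np.sqrt(1 - alpha), np.cos(t) * np.sqrt(1 + alpha)])
    B = np.array([-np.cos(t) * np.sqrt(1 - alpha), np.sin(t) * np.sqrt(1 + alpha)])
    c = A @ B / (np.linalg.norm(A) * np.linalg.norm(B))
    return np.degrees(np.arccos(c))
```

For zero PDL both functions return 90\(^\circ\) for any launch angle, and for a nonzero PDL the minimum over the launch angle occurs at 45\(^\circ\).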

#### 6.2.3. PDL-Induced Penalties in PDM Coherent-Detection Systems

Unlike CD and PMD, PDL effects cannot be well compensated in a coherent receiver due to the non-unitary nature of PDL. As shown earlier, PDL causes signal power and OSNR fluctuations and repolarizes ASE noise.

In addition, it induces loss of orthogonality and power/OSNR imbalance between the two polarizations of a PDM signal. Although the nonorthogonality between the two polarizations can be corrected by electronic equalizers, some penalties will be introduced.

When the polarization of a PDM signal is aligned with the PDL axis at the input, one polarization is improved, but the other polarization is degraded and the overall performance is mainly determined by the degraded polarization tributary.

There are two PDL models to study PDL effects in a coherent system. One is a lumped model and the other is a distributed model, as shown in Figure 6.18. In the lumped model, there is one PDL emulator. ASE noise is loaded at the receiver and a polarization controller or polarization scrambler is inserted before the PDL emulator in the lumped model to get the PDL penalty at a particular input SOP or an average PDL penalty.

For a PDM signal, the worst and best performance degradations occur when a PDM signal is 0\(^\circ\) and 45\(^\circ\) aligned to the axes of a PDL element, respectively. In the worst case, the performance of one polarization tributary is degraded while that of the other tributary is improved; therefore, the overall performance degradation induced by PDL in the worst case for a PDM signal is smaller than that for an SP signal. In the best case, PDL induces the largest nonorthogonality between the two polarization tributaries and the two tributaries have the same PDL penalty.

Figure 6.19 gives the PDL-induced OSNR penalties at BER = 10\(^{-3}\) in the worst and best cases for a 112-Gb/s PDM-QPSK coherent system. The average OSNR penalty is calculated assuming that a polarization scrambler generates a uniform SOP distribution on the Poincaré sphere.

There is a large difference of the penalties among different cases. With 4-dB PDL value, the OSNR penalty is about 0.8 dB and 2.0 dB in the best and worst cases, respectively, while the average penalty is about 1.3 dB.

The lumped model, usually used in lab tests, is simple and helpful to understand some PDL effects, but it does not include all PDL effects such as repolarization of ASE noise. In the distributed model, many PDL emulators and ASE noise sources are distributed along a link, with random polarization rotations between the PDL emulators.

The distributed model is similar to a real system and automatically takes into account all PDL effects. Figure 6.20 depicts the probability distribution of OSNR variations in one polarization calculated with the lumped model and the distributed model at a 3-dB RMS PDL value.

For the distributed model, 20 PDL elements are used, and PDL and ASE noise are equally distributed along the link. These results show that the lumped model generates much larger OSNR variations than the distributed model, which means that the lumped model will significantly overestimate the PDL penalties in an optical communication system.

As PDL is a statistical phenomenon similar to PMD, its impact on an optical communication system also needs to be quantified statistically. Figure 6.21 gives PDL-induced OPs at BER = 10\(^{-3}\) in a 112-Gb/s NRZ-PDM-QPSK system with 1- and 2-dB OSNR margins.

The results of the distributed model are obtained with Monte Carlo simulations and the lumped model results are obtained with the following equation:

\[\tag{6.42}\text{OP}=\int_0^{\infty}\int_0^{\pi}I(\text{PDL},\theta)\cdot{f}(\text{PDL})\cdot{f}(\theta)\cdot{d\theta}\cdot{d\text{PDL}}\]

where \(f(\text{PDL})\) is the PDF of PDL, which is a Maxwellian distribution. Assuming uniform distribution of SOP over the Poincaré sphere, one has \(f(\theta)=\sin\theta/2\), \(0\lt\theta\le\pi\).

\(I(\text{PDL}, \theta)\) is an outage function

\[\tag{6.43}I(\text{PDL},\theta)=\begin{cases}1,\quad\text{BER}(\text{PDL},\theta)\gt\text{BER}_0\\0,\quad\text{else}\end{cases}\]

where \(\text{BER}_0\) is the designated BER threshold. Figure 6.21 clearly shows that the lumped model significantly overestimates PDL penalties.
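Equation 6.42 can be evaluated numerically once an outage function is available. The sketch below uses a simple midpoint rule; the outage function is passed in by the caller because \(\text{BER}(\text{PDL},\theta)\) has to come from simulation or measurement. The function names, the parametrization of the Maxwellian by its mean, and the truncation of the PDL integral at eight times the mean are assumptions made for the example.

```python
import numpy as np

def maxwellian_pdf(x, mean):
    """Maxwellian PDF parametrized by its mean value."""
    a = mean * np.sqrt(np.pi / 8.0)
    return np.sqrt(2.0 / np.pi) * x ** 2 / a ** 3 * np.exp(-x ** 2 / (2 * a ** 2))

def outage_probability(outage_fn, mean_pdl, n_pdl=2000, n_theta=400):
    """Midpoint-rule evaluation of Equation 6.42.

    outage_fn(pdl, theta) returns an array of 0/1 values and plays the
    role of I(PDL, theta) in Equation 6.43.
    """
    d_pdl = 8.0 * mean_pdl / n_pdl          # truncated upper limit (assumption)
    d_th = np.pi / n_theta
    pdl = (np.arange(n_pdl) + 0.5) * d_pdl
    th = (np.arange(n_theta) + 0.5) * d_th
    P, T = np.meshgrid(pdl, th, indexing="ij")
    integrand = outage_fn(P, T) * maxwellian_pdf(P, mean_pdl) * np.sin(T) / 2.0
    return float(integrand.sum() * d_pdl * d_th)
```

As a check on the normalization, an outage function that always returns one yields an OP close to one, since both \(f(\text{PDL})\) and \(f(\theta)\) integrate to unity.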

## 7. Nonlinear Impairments in Coherent Systems

Signals propagating in optical fibers experience rich nonlinear effects. In general, fiber nonlinear effects can be categorized into intra- and inter-channel nonlinearities. Intra-channel nonlinearities can be further categorized into SPM, intra-channel FWM (IFWM), and intra-channel XPM (IXPM).

Inter-channel nonlinearities include FWM, XPM, and XPolM. Depending on system parameters such as CD values, bit rates, and modulation formats, one or a few nonlinear effects are more dominant than the others.

For example, in a system with a low bit rate and CD value, FWM is the dominant nonlinear effect, and in a dispersion-managed homogeneous WDM PDM-QPSK system, XPolM can be the most detrimental nonlinear effect.

In most direct-detection systems, information is encoded in amplitude with OOK modulation and polarization is not used to carry any information. In such systems, it is the amplitude and timing jitter caused by inter- and intra-channel FWM and XPM that severely degrades system performance; phase noise and XPolM do not have significant impacts on the systems.

Dispersion management, which distributes optical dispersion compensators such as dispersion compensation fibers (DCF) along a link with a certain dispersion map, can significantly reduce FWM- and XPM-induced amplitude and timing jitter and is an effective technology to mitigate nonlinearities in a direct-detection system.

With the advent of digital coherent detection, it is found that dispersion management, which has been successfully used in direct-detection optical communication systems to reduce fiber nonlinear impairments, becomes suboptimal in coherent systems.

Fiber nonlinearities are the same for signals propagating in fibers, regardless of direct detection or coherent detection, and the difference is caused by the way that information is carried. In a coherent-detection system, in addition to amplitude, information is also carried by phase, and PDM is generally used to double the spectral efficiency.

As a result, phase and polarization distortions caused by XPM and XPolM, which are neglected in a direct-detection system, can cause severe impairments in a coherent-detection system.

In a coherent system without any optical dispersion compensators, as signals are rapidly spread in time due to large accumulated CD, the dominant nonlinear effects are intra-channel nonlinearities such as IFWM and IXPM.

For dispersion-compensated coherent optical communication systems with inline optical dispersion compensators, the dominant nonlinear effects are inter-channel nonlinearities, including inter-channel XPolM and inter-channel XPM, but whether XPolM or XPM is dominant depends on the actual system configurations.

Nonlinearities in non-dispersion-compensated systems are covered in another chapter in the book. This tutorial focuses on nonlinearities in dispersion-managed coherent systems, including homogeneous PDM-QPSK, hybrid PDM-QPSK and OOK, and homogeneous PDM-16QAM systems. Fiber nonlinearities in these systems are described with numerical simulations.

### 7.1. System Model

The transmission system model is shown in Figure 6.22. The system has seven channels with a channel spacing of 50 GHz. Depending on the system to study, the transmitters can generate either 28-Gbaud QPSK, 28-Gbaud 16QAM, or 10-Gb/s OOK signals.

The transmission line consists of 10 spans of standard single-mode fiber (SSMF) with a CD coefficient of 17.0 ps/(nm km), a nonlinear coefficient of 1.17 (km W)\(^{-1}\), and a loss coefficient of 0.21 dB/km. The span length is 100 km.

Although it has been shown that more than 20 channels are needed to accurately assess the performance of a coherent system with fibers of low CD and at high nonlinear penalties, seven channels are sufficient to show the difference in nonlinearities in these three systems.

In the homogeneous PDM-QPSK and in the hybrid PDM-QPSK and OOK systems, an erbium-doped-fiber amplifier (EDFA) after each span is used to compensate for the transmission loss, while in the homogeneous PDM-16QAM system, transmission loss is compensated for by hybrid Raman/EDFA to improve the delivered OSNR.

Two different dispersion maps are studied and compared, one with legacy dispersion management supporting the 10-Gb/s OOK channels, and the other without optical dispersion compensators. In the dispersion-managed system, the CD in each span is compensated by DCF with a residual dispersion per span (RDPS) of 30 ps/nm, and the dispersion precompensation is optimized for coherent channels, which is about −400 ps/nm.

The net residual CD after transmission is compensated in the electrical domain by DSP in the coherent receiver. The dispersion map for the dispersion-managed system used here is a typical map for a direct-detection optical communication system.
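The two dispersion maps can be traced with a few lines of bookkeeping. The sketch below uses the parameter values quoted above; the function name and the assumption that a DCF follows every span, including the last one, are illustrative choices.

```python
def accumulated_cd(n_spans=10, span_km=100.0, d_fiber=17.0,
                   rdps=30.0, pre_comp=-400.0, inline_dcf=True):
    """Accumulated CD in ps/nm at the end of each span.

    With inline DCF, each span leaves only the RDPS; without it, the full
    fiber dispersion accumulates and no precompensation is applied.
    """
    cd = pre_comp if inline_dcf else 0.0
    trace = []
    for _ in range(n_spans):
        cd += d_fiber * span_km                # dispersion of the SSMF span
        if inline_dcf:
            cd -= d_fiber * span_km - rdps     # DCF leaves only the RDPS
        trace.append(cd)
    return trace
```

Under these assumptions, the dispersion-managed link ends with a net residual CD of −100 ps/nm, while the link without DCF accumulates 17,000 ps/nm that the receiver DSP has to remove.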

In the system without any optical dispersion compensators, the CD is entirely compensated with electrical equalizers in the coherent receiver. The dotted line modules in Figure 6.22 are used in some system configurations.

The nonlinear transmission effects in these systems are studied using the coarse-step method, that is, numerically solving Equations 6.24 and 6.25 with the split-step Fourier method and randomly rotating the field every coupling length, which is set as 500 m.
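The random-rotation part of the coarse-step method can be sketched as follows. The propagation step itself is omitted, and the way the random rotation is drawn (a normalized four-component Gaussian vector mapped to a unit-determinant unitary matrix) is an assumption; the function names are hypothetical.

```python
import numpy as np

def random_su2(rng):
    """Random 2x2 unitary Jones matrix with unit determinant."""
    q = rng.normal(size=4)
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d],
                     [-c + 1j * d, a - 1j * b]])

def scramble(field, n_sections, rng):
    """Apply one random rotation per coupling length to a (2, N) Jones field."""
    for _ in range(n_sections):
        field = random_su2(rng) @ field
        # The split-step propagation over one coupling length would go here.
    return field
```

With a 500-m coupling length, a 100-km span corresponds to 200 sections. Because each rotation is unitary, the total power in every sample is preserved.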

Note that in the simulations, SOPs of all the channels are set in the same direction unless explicitly stated. It has been shown that system performance varies with the SOP changes of channels, but for systems with SSMF and penalties less than 2 dB, the performance variation is small.

### 7.2. Homogeneous PDM-QPSK System

For a homogeneous PDM-QPSK system, all the channels carry PDM-QPSK signals with the same bit rate. The coherent receiver has two times oversampling and the butterfly equalizer has 13 taps and is optimized with CMA.

Carrier phase recovery is performed using the Viterbi-Viterbi phase estimation method with block length of 10, and the BER is calculated with direct error counting. Differential coding and decoding is employed to avoid cycle-slip-induced error propagation.
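A block-wise fourth-power phase estimate of this kind can be sketched as below. This is a minimal illustration rather than the exact receiver implementation: it assumes QPSK symbols at odd multiples of \(\pi/4\), non-overlapping blocks, and a phase error that stays within \(\pm\pi/4\); the function name is hypothetical.

```python
import numpy as np

def viterbi_viterbi(symbols, block=10):
    """Block-wise fourth-power carrier phase recovery for QPSK."""
    symbols = np.asarray(symbols, dtype=complex)
    out = np.empty_like(symbols)
    for i in range(0, len(symbols), block):
        s = symbols[i:i + block]
        # Raising to the 4th power removes the QPSK data; the minus sign
        # accounts for the pi/4 offset of the constellation points.
        phi = np.angle(-np.sum(s ** 4)) / 4.0
        out[i:i + block] = s * np.exp(-1j * phi)
    return out
```

The estimate is only defined modulo \(\pi/2\), so a phase excursion beyond \(\pm\pi/4\) shows up as a cycle slip, which is why differential coding is used alongside it.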

In a homogeneous PDM-QPSK system with DCF, the amplitudes of all the channels are almost constant (neglecting the transients between symbols) and the same after each span, and the dominant nonlinear effect is XPolM-induced nonlinear polarization scattering.

Figure 6.23 shows the required OSNR at a BER of 10\(^{-3}\) after 1000-km transmission versus launch power per channel for both 42.8-Gb/s and 112-Gb/s NRZ-PDM-QPSK coherent systems with and without inline DCF.

To separate the penalty caused by XPolM from that by XPM, we also plot the results of one 42.8-Gb/s and 112-Gb/s NRZ-PDM-QPSK channel surrounded by six 21.4-Gb/s and 56-Gb/s NRZ-SP-QPSK channels with the same symbol rate as the PDM-QPSK channel.

As shown in Figure 6.24, the SOPs of NRZ-PDM-QPSK in different symbols are at \(S_2\), \(S_3\), \(-S_2\), and \(-S_3\) on the Poincaré sphere (in \(x\) and \(y\) polarizations in the Jones space). We set the SOPs of the six surrounding NRZ-SP-QPSK channels at \(S_1\) (in \(x\) polarization in the Jones space).
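The four SOPs of a PDM-QPSK signal can be verified by converting the Jones vector of each symbol pair to a Stokes vector. In the sketch below, the sign convention for \(S_3\) and the function name are assumptions; a different convention swaps \(S_3\) and \(-S_3\) but leaves the set of four points unchanged.

```python
import numpy as np

def stokes(ex, ey):
    """Normalized Stokes vector (S1, S2, S3) of the Jones vector (ex, ey)."""
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    s1 = abs(ex) ** 2 - abs(ey) ** 2
    s2 = 2 * (ex * np.conj(ey)).real
    s3 = 2 * (ex * np.conj(ey)).imag   # sign of S3 depends on convention
    return np.array([s1, s2, s3]) / s0

# All 16 symbol combinations of the two QPSK tributaries
QPSK = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))
pdm_sops = {tuple(np.round(stokes(a, b), 9) + 0.0) for a in QPSK for b in QPSK}
```

The set `pdm_sops` contains only the four points \(\pm S_2\) and \(\pm S_3\), whereas a single-polarization signal in the \(x\) polarization maps to \(S_1\) for every symbol.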

By doing this, at the same launch power, the average XPM on the center NRZ-PDM-QPSK channel from the surrounding NRZ-SP-QPSK and NRZ-PDM-QPSK channels is the same.

As NRZ-SP-QPSK has constant amplitude (not considering the transient between symbols) and its SOP is the same in different symbols, XPolM from the surrounding NRZ-SP-QPSK channels causes little nonlinear polarization scattering in the middle PDM-QPSK channel. Therefore, there is almost no XPolM-induced penalty for the PDM-QPSK channel when it is surrounded by SP-QPSK channels, as shown in Figure 6.23.

Figure 6.23 shows that when all the channels carry NRZ-PDM-QPSK signals, the system with inline DCF performs worse than that without DCF. For 42.8-Gb/s and 112-Gb/s NRZ-PDM-QPSK, the maximum launch powers at 1-dB OSNR penalty for the systems with DCF are about 2-dB and 1.5-dB lower than those without DCF, respectively.

However, when the surrounding channels carry NRZ-SP-QPSK signals, the systems with inline DCF perform better than those without DCF. At 1-dB OSNR penalty, the maximum launch powers for the systems with DCF are about 2.0-dB and 1.0-dB higher than those without DCF for the 42.8-Gb/s and 112-Gb/s NRZ-PDM-QPSK, respectively.

In addition, Figure 6.23 also shows that when the surrounding channels are changed from NRZ-SP-QPSK to NRZ-PDM-QPSK, the allowed launch power is reduced by about 3 dB and 2 dB for the 42.8-Gb/s and 112-Gb/s systems with DCF, respectively, whereas it is increased by about 1 dB for both 42.8-Gb/s and 112-Gb/s systems without DCF.

This indicates that XPolM is the dominant nonlinear effect in the homogeneous NRZ-PDM-QPSK system with DCF, and it is XPolM that makes homogeneous PDM-QPSK systems with DCF perform worse than those without DCF.

The PDM-QPSK channels cause less interchannel penalty than the SP-QPSK channels in the systems without DCF for two reasons: the impact of interchannel XPM is much larger than that of XPolM in these systems, and the peak powers of PDM signals are smaller than those of SP signals for a given average power because the two polarizations carry different data.

Note that the performance difference between the 112-Gb/s PDM-QPSK systems with DCF and without DCF is smaller than that between the 42.8-Gb/s systems because of the increased symbol rate.

This conclusion is further confirmed in Figure 6.25, which shows a reduction in the DOP of 21.4-Gb/s and 56-Gb/s SP-QPSK reference channels caused by XPolM-induced depolarization from six surrounding 42.8-Gb/s and 112-Gb/s PDM-QPSK channels.

For the systems with inline DCF, the DOP decreases more rapidly with the launch power than for those without DCF, indicating that the nonlinear polarization scattering is much larger in systems with DCF than without DCF. We also note that the depolarization in the 112-Gb/s PDM-QPSK systems is smaller than that in the 42.8-Gb/s systems due to the increase of the symbol rate.
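The DOP used as the depolarization measure here is the length of the time-averaged Stokes vector divided by the average power. A minimal sketch, assuming sampled Jones components of the reference channel and a hypothetical function name, is:

```python
import numpy as np

def degree_of_polarization(ex, ey):
    """DOP = |<S>| / <S0> from sampled Jones components ex and ey."""
    ex, ey = np.asarray(ex, dtype=complex), np.asarray(ey, dtype=complex)
    s0 = np.mean(np.abs(ex) ** 2 + np.abs(ey) ** 2)
    s1 = np.mean(np.abs(ex) ** 2 - np.abs(ey) ** 2)
    s2 = np.mean(2 * np.real(ex * np.conj(ey)))
    s3 = np.mean(2 * np.imag(ex * np.conj(ey)))
    return float(np.sqrt(s1 ** 2 + s2 ** 2 + s3 ** 2) / s0)
```

A signal whose SOP is the same in every sample has a DOP of one regardless of its phase modulation, and any symbol-to-symbol scattering of the SOP pulls the DOP below one.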

One technique to suppress XPolM in a PDM-QPSK system is to use iRZ-PDM modulation format, which can reduce or eliminate the dependence of SOP on the data carried by the two polarizations.

This modulation format uses RZ pulses and time interleaves the two polarizations by half a symbol period, as shown in Figure 6.26. One can see that at the center of each symbol, the SOP is either at \(S_1\) or \(-S_1\) on the Poincaré sphere, and it does not depend on data carried by the two polarizations.

In addition, for an iRZ-PDM signal, the SOP at each symbol alternates between \(S_1\) and \(-S_1\) on the Poincaré sphere, which causes opposite nonlinear polarization rotation according to Equation 6.23, and its signal peak power is also reduced compared with that of a time-aligned signal, leading to reduced XPolM between channels.

The transmission performance of 42.8-Gb/s and 112-Gb/s iRZ-PDM-QPSK WDM systems is given in Figure 6.27. The RZ pulses have 50% duty cycle.

For the 42.8-Gb/s system with inline DCF, using iRZ-PDM-QPSK increases the allowed launch power at 1-dB OSNR penalty by 7 dB compared with NRZ-PDM-QPSK (Figure 6.23a), from about 1 dBm per channel to about 8 dBm, and the system then performs better than the one without DCF.

For the 112-Gb/s system with DCF, the improvement obtained by using iRZ-PDM-QPSK is smaller than that for the 42.8-Gb/s system because of the higher symbol rate, but it still increases the launch power tolerance by about 3 dB and achieves performance similar to that of the system without DCF.

### 7.3. Hybrid PDM-QPSK and 10-Gb/s OOK System

When upgrading 10-Gb/s OOK systems to 100-Gb/s PDM-QPSK and higher-bit-rate QAM, PDM-QPSK and QAM channels may copropagate with 10-Gb/s OOK channels. In such hybrid systems, PDM-QPSK and QAM channels can be severely degraded by the copropagating 10-Gb/s OOK channels. In these systems, the penalty is mainly caused by interchannel XPM, not XPolM, because of the nonconstant amplitude of 10-Gb/s OOK signals.

Figure 6.28 gives the performance of 112-Gb/s PDM-QPSK in a hybrid system, where one 112-Gb/s NRZ-PDM-QPSK channel is surrounded by six 10-Gb/s NRZ-OOK channels at 50-GHz channel spacing. The SOPs of the 10-Gb/s OOK channels are set at \(S_1\) in Stokes space (\(x\) polarization in Jones space) to maximize XPolM effects, and the SOP of the PDM-QPSK channel is the same as that shown in Figure 6.24.

As shown by Equation 6.23, the XPolM between two channels is the largest when their SOPs are perpendicular to each other in Stokes space. For comparison, we also plot the performance of 112-Gb/s PDM-QPSK in a homogeneous PDM-QPSK system.
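The geometric statement can be illustrated with a short sketch. It assumes that Equation 6.23 has the usual Stokes-space precession form, in which the rate of change of the probe Stokes vector is proportional to the cross product of the pump and probe Stokes vectors; the proportionality constant is omitted, and the names `rotation_rate` and `unit_stokes` are hypothetical.

```python
import numpy as np

def rotation_rate(s_probe, s_pump):
    """Direction and relative size of dS_probe/dz, assuming a precession form
    dS_probe/dz proportional to (S_pump x S_probe). Constant factors are omitted."""
    return np.cross(s_pump, s_probe)

def unit_stokes(angle_deg):
    """Unit Stokes vector in the S1-S2 plane at the given angle from +S1."""
    a = np.deg2rad(angle_deg)
    return np.array([np.cos(a), np.sin(a), 0.0])

pump = np.array([1.0, 0.0, 0.0])  # pump SOP at +S1
angles = np.arange(0, 181, 15)    # angle between probe and pump in Stokes space
strength = np.array([np.linalg.norm(rotation_rate(unit_stokes(a), pump))
                     for a in angles])
worst_angle = angles[np.argmax(strength)]

# Flipping the pump from +S1 to -S1 reverses the rotation direction.
probe = unit_stokes(90)
forward = rotation_rate(probe, pump)
reverse = rotation_rate(probe, -pump)

print(worst_angle, forward, reverse)
```

Under this assumed form, the rotation vanishes for parallel or antiparallel Stokes vectors and is largest at 90 degrees. The sign flip for a pump at \(-S_1\) is also the mechanism behind the alternating-SOP benefit of iRZ-PDM.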

Due to large interchannel nonlinearities from the OOK channels, the maximum launch powers for the systems with DCF and without DCF at 1-dB penalty are reduced by 5 and 3 dB, respectively.

Figure 6.29 depicts the DOP reduction of a 56-Gb/s SP-QPSK reference channel caused by XPolM-induced signal depolarization from the surrounding six 10-Gb/s OOK channels. Comparing the result with that in Figure 6.25(b) (note the different scale of the \(x\)-axes) shows that XPolM is larger in the hybrid system than in the homogeneous system due to the lower symbol rate (and hence the slower waveform evolution) of the 10-Gb/s OOK channels.

The fact that XPolM is not the main degrading factor in the hybrid system can be seen from Figure 6.28, where in the system with DCF at −1-dBm per-channel launch power, the OOK channels already induce more than a 3-dB penalty on the PDM-QPSK channel, while the depolarization caused by the 10-Gb/s OOK channels at this power level (Figure 6.29) is still very small (DOP above 0.98), which by itself would not cause any noticeable performance degradation for the 112-Gb/s PDM-QPSK channel.

Along the same lines, we see from Figures 6.28 and 6.29 that the penalty caused by XPolM-induced depolarization in the hybrid system without DCF is also small.

XPM effects are further illustrated in Figure 6.30. Neither ASE noise nor laser phase noise is added. The larger phase spread of the PDM-QPSK channel is caused by XPM from the neighboring 10-Gb/s OOK channels. As the OOK channels are aligned with the \(x\)-polarization, XPM on the \(x\)-polarization of the PDM-QPSK signal is twice that on the \(y\)-polarization.
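The factor of two can be reproduced with a few lines. The sketch assumes a Manakov-type interchannel coupling term, in which a pump with Jones vector \(B\) acts on the probe through the operator \(|B|^2 I + B B^\dagger\). This form is an assumption here rather than a quotation of the tutorial's equations, and the common prefactor (nonlinear coefficient and effective length) is left out; `xpm_operator` is a hypothetical name.

```python
import numpy as np

def xpm_operator(pump_jones):
    """2x2 operator acting on the probe Jones vector, assuming a Manakov-type
    coupling |B|^2 * I + B B^dagger (common prefactor omitted)."""
    b = np.asarray(pump_jones, dtype=complex).reshape(2, 1)
    power = float(np.vdot(b, b).real)
    return power * np.eye(2) + b @ b.conj().T

# Pump (OOK channel) aligned with the x polarization, unit power.
op = xpm_operator([1.0, 0.0])
phase_x = op[0, 0].real  # relative nonlinear phase on the probe's x tributary
phase_y = op[1, 1].real  # relative nonlinear phase on the probe's y tributary

print(phase_x, phase_y, phase_x / phase_y)
```

In this picture the average of the two diagonal entries acts as a common XPM phase, and their difference is the polarization-dependent part associated with XPolM.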

### 7.4. Homogeneous PDM-16QAM System

To get sufficient delivered OSNR for PDM-16QAM transmission, hybrid Raman/EDFA amplification is used to compensate for the transmission loss, with 15-dB on/off gain provided by the Raman amplifier.

As iRZ has been shown to have better nonlinear tolerance than NRZ, iRZ-PDM-16QAM is used here. In the PDM-16QAM receiver, polarization demultiplexing and residual distortion equalization are performed with a butterfly equalizer consisting of four 13-tap FIR filters.

These are first optimized with the CMA for preconvergence and then finely tuned with the DD-LMS algorithm. Carrier phase recovery is performed with a decision-directed phase estimation method. The BER is evaluated by direct error counting with Gray coding.
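The CMA preconvergence stage of such a butterfly equalizer can be sketched as follows. This is a minimal illustration under stated assumptions, not the receiver used for the results in this section: it runs at one sample per symbol, uses QPSK test data so that a constant modulus is the exact target, models the channel as a fixed 30-degree polarization rotation without noise, and omits the DD-LMS fine-tuning and carrier phase recovery. The function name `cma_butterfly` and the step size `mu` are hypothetical choices.

```python
import numpy as np

def cma_butterfly(rx, ry, ntaps=13, mu=2e-3):
    """Adapt a 2x2 butterfly of FIR filters with the constant-modulus algorithm.
    rx, ry: received complex samples of the two polarizations (unit power)."""
    c = ntaps // 2
    hxx = np.zeros(ntaps, complex); hxx[c] = 1.0  # center-spike initialization
    hxy = np.zeros(ntaps, complex)
    hyx = np.zeros(ntaps, complex)
    hyy = np.zeros(ntaps, complex); hyy[c] = 1.0
    n_out = len(rx) - ntaps + 1
    zx = np.empty(n_out, complex)
    zy = np.empty(n_out, complex)
    for n in range(n_out):
        ux = rx[n:n + ntaps][::-1]  # tap-delay-line contents, newest sample first
        uy = ry[n:n + ntaps][::-1]
        zx[n] = hxx @ ux + hxy @ uy  # butterfly: each output mixes both inputs
        zy[n] = hyx @ ux + hyy @ uy
        ex = 1.0 - abs(zx[n]) ** 2   # constant-modulus errors (target radius 1)
        ey = 1.0 - abs(zy[n]) ** 2
        hxx += mu * ex * zx[n] * np.conj(ux)  # stochastic-gradient tap updates
        hxy += mu * ex * zx[n] * np.conj(uy)
        hyx += mu * ey * zy[n] * np.conj(ux)
        hyy += mu * ey * zy[n] * np.conj(uy)
    return zx, zy, (hxx, hxy, hyx, hyy)

rng = np.random.default_rng(0)
n_sym = 6000

def qpsk(n):
    """Unit-power QPSK symbols with random data."""
    return (rng.choice([-1, 1], n) + 1j * rng.choice([-1, 1], n)) / np.sqrt(2)

a, b = qpsk(n_sym), qpsk(n_sym)
theta = np.deg2rad(30)  # assumed fixed polarization rotation in the link
rx = np.cos(theta) * a + np.sin(theta) * b
ry = -np.sin(theta) * a + np.cos(theta) * b

zx, zy, taps = cma_butterfly(rx, ry)
err_in = np.mean(np.abs(np.abs(rx) ** 2 - 1))        # modulus error before equalization
err_x = np.mean(np.abs(np.abs(zx[-500:]) ** 2 - 1))  # after convergence, x output
err_y = np.mean(np.abs(np.abs(zy[-500:]) ** 2 - 1))  # after convergence, y output
print(err_in, err_x, err_y)
```

For 16QAM the constant-modulus criterion is only approximate, which is one reason a decision-directed stage such as DD-LMS is run afterwards for fine tuning.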

To separate SPM, XPM and XPolM, a single-channel, 112-Gb/s RZ-SP-16QAM transmission is performed, with the RDPS varying from 30 to 1700 ps/nm by changing the DCF length in each span (RDPS of 1700 ps/nm means no inline DCF).
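The relation between RDPS and DCF length can be sketched as a simple dispersion budget. The 1700-ps/nm value with no DCF comes from the text. The split into a 100-km span with 17 ps/(nm km) dispersion and the DCF dispersion of -100 ps/(nm km) are illustrative assumptions, as are the function names.

```python
def rdps(span_km, d_fiber, dcf_km, d_dcf):
    """Residual dispersion per span in ps/nm: transmission fiber plus inline DCF."""
    return span_km * d_fiber + dcf_km * d_dcf

def dcf_length_for(target_rdps, span_km, d_fiber, d_dcf):
    """DCF length in km that leaves the requested residual dispersion per span."""
    return (target_rdps - span_km * d_fiber) / d_dcf

SPAN_KM = 100.0  # assumed span length
D_FIBER = 17.0   # assumed transmission-fiber dispersion, ps/(nm km)
D_DCF = -100.0   # assumed DCF dispersion, ps/(nm km)

no_dcf = rdps(SPAN_KM, D_FIBER, 0.0, D_DCF)               # no inline DCF
dcf_km_30 = dcf_length_for(30.0, SPAN_KM, D_FIBER, D_DCF)  # DCF length for RDPS = 30 ps/nm
check_30 = rdps(SPAN_KM, D_FIBER, dcf_km_30, D_DCF)

print(no_dcf, dcf_km_30, check_30)
```

With these assumed parameters, sweeping the RDPS from 30 to 1700 ps/nm corresponds to shortening the DCF in each span from about 16.7 km down to zero.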

Figure 6.31 depicts the impact of the RDPS on the transmission system performance. It shows that the penalty peaks at an RDPS of 90 ps/nm, and then decreases with the increase of RDPS.

Moreover, it shows that even for single-channel SP-16QAM, the system with DCF cannot perform better than that without DCF. This is completely different from QPSK systems, where systems with DCF can achieve better performance than those without DCF for SP-QPSK and iRZ-PDM-QPSK.

To exclude XPolM effects, SP signals are then used. Figure 6.32(a) shows the OSNR penalty at a BER of \(10^{-3}\) after 1000-km transmission versus launch power per channel for the 112-Gb/s RZ-SP-16QAM coherent systems with and without DCF, where the RDPS is set to 30 ps/nm.

For the system with DCF, dispersion precompensation is optimized, whereas there is no dispersion precompensation for the system without DCF. The figure shows that for both single-channel and WDM transmission, the system with DCF has less tolerance to fiber nonlinearities than the system without DCF. The constellations in Figure 6.32(b)–(e) show that in the system with DCF, SPM and XPM cause large nonlinear phase distortions, and it is this effect that significantly degrades the system performance, whereas in the system without DCF, there is almost no nonlinear phase distortion.

Nonlinear transmission performance of 224-Gb/s iRZ-PDM-16QAM is given in Figure 6.33. Similar to the SP signals, the PDM signals perform worse in the system with DCF than that without DCF for both single-channel and WDM transmission.

Compared with 112-Gb/s RZ-SP-16QAM, the allowed launch powers per channel of 224-Gb/s iRZ-PDM-16QAM are improved by 1.5–2 dB in all cases (note that the power per polarization is lower).

Constellations in Figure 6.33(b)–(e) show that it is SPM and XPM-induced nonlinear phase distortions that significantly degrade the performance of the 224-Gb/s iRZ-PDM-16QAM system with DCF.

Figure 6.34 gives the XPolM-induced depolarization in the iRZ-PDM-16QAM transmission systems, which is measured by the DOP of a 112-Gb/s RZ-SP-16QAM channel surrounded by six 224-Gb/s iRZ-PDM-16QAM channels.

For comparison, the result of the 112-Gb/s iRZ-PDM-QPSK system with DCF is also plotted in the figure. The XPolM-induced depolarization is similar in the PDM-16QAM system and in the PDM-QPSK system.

The figure again indicates that the dominant nonlinear effect in the WDM PDM-16QAM system with DCF is not XPolM but XPM: the OSNR penalty at 1-dBm per-channel launch power in the WDM system with DCF is more than 3 dB, whereas the XPolM-induced depolarization is very small at this launch power (DOP is 0.93) and will not cause any noticeable penalty.

## 8. Summary

This tutorial reviews recent advances in understanding polarization effects and fiber nonlinearities in coherent optical communication systems. To make the tutorial self-contained, the basics of the polarization of light are presented at the beginning.

First, polarization effects are described. Although in principle, PMD can be entirely compensated in a coherent receiver with DSP, due to the limited complexity of DSP in the coherent receiver, the PMD tolerance of a coherent receiver is limited, and is proportional to the complexity of the butterfly equalizer.

A technique to jointly optimize the CD equalizer and the butterfly equalizer is discussed. Because PDL is non-unitary, it cannot be well compensated in a coherent receiver. Two models to evaluate PDL impairments are presented, and it is shown that the distributed PDL model has to be used to accurately evaluate the PDL penalties in coherent optical communication systems.

Then, the nonlinear effects in coherent optical communication systems are discussed, including nonlinear transmission modeling techniques. While the dominant nonlinear effects in coherent optical communication systems without optical dispersion compensators are mostly intra-channel nonlinearities, the dominant nonlinear effects are inter-channel nonlinearities in coherent optical systems with inline DCF.

In coherent optical communication systems with DCF, when modulation formats of constant amplitude such as QPSK are used, the dominant nonlinear effect is XPolM, which generates nonlinear polarization scattering and induces severe crosstalk between polarization tributaries, whereas when the channels carry nonconstant amplitude modulation formats such as hybrid QPSK and OOK or 16QAM, XPM is the dominant nonlinear effect.

The next tutorial gives a detailed **introduction to gradient optics**.