# Probabilistic Approaches to System Optimization

The works of Wiener and Shannon were the beginning of modern statistical communication theory. Both these investigators applied probabilistic methods to the problem of extracting information-bearing signals from noisy backgrounds, but they worked from different standpoints.

#### 1. Statistical Signal Detection and Estimation Theory

Wiener considered the problem of optimally filtering signals from noise, where “optimum” is used in the sense of minimizing the average squared error between the desired and the actual output. The resulting filter structure is referred to as the *Wiener filter*. This type of approach is most appropriate for analog communication systems in which the demodulated output of the receiver is to be a faithful replica of the message input to the transmitter.
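The minimum-mean-squared-error criterion behind the Wiener filter can be illustrated with a single-tap sketch (the signal and noise powers below are illustrative assumptions, not values from the text). For an observation x = s + n with zero-mean signal power S and noise power N, the gain g that minimizes E[(s − gx)²] is g = S/(S + N):

```python
import math
import random

# Minimal single-tap Wiener filter sketch (S and N are assumed values).
# Observation: x = s + n; the Wiener gain g = S / (S + N) minimizes the
# average squared error between the desired signal s and the estimate g*x.

random.seed(0)
S, N = 4.0, 1.0            # assumed signal and noise variances
g = S / (S + N)            # Wiener gain

def mse(gain, trials=100_000):
    # Monte Carlo estimate of the mean squared error for a given gain.
    total = 0.0
    for _ in range(trials):
        s = random.gauss(0.0, math.sqrt(S))
        x = s + random.gauss(0.0, math.sqrt(N))
        total += (s - gain * x) ** 2
    return total / trials

# The Wiener gain beats passing the observation through unchanged (gain = 1):
print(mse(g) < mse(1.0))   # → True
```

The theoretical minimum error here is SN/(S + N) = 0.8, versus N = 1.0 for the unfiltered observation, which the simulation confirms.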

Wiener’s approach is reasonable for analog communications. However, in the early 1940s, North provided a more fruitful approach to the digital communication problem, in which the receiver must distinguish between a number of discrete signals in background noise. Actually, North was concerned with radar, which requires only the detection of the presence or absence of a pulse. Since fidelity of the detected signal at the receiver is of no consequence in such signal-detection problems, North sought the filter that would maximize the peak-signal-to-root-mean-square (rms) noise ratio at its output. The resulting optimum filter is called the *matched filter*. Later adaptations of the Wiener and matched-filter ideas to time-varying backgrounds resulted in *adaptive filters*.
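North's peak-SNR criterion can be sketched numerically (the pulse shape and noise level below are illustrative assumptions). For a known pulse s in white noise of variance σ², a filter h sampled at the pulse peak has output SNR (s·h)²/(σ²‖h‖²); by the Cauchy-Schwarz inequality this is maximized when h is proportional to s, i.e., matched to the pulse:

```python
import math

# Matched-filter sketch (pulse samples and noise level are assumed values).
# Output SNR at the sampling instant: (s . h)^2 / (sigma^2 * |h|^2).

pulse = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]   # assumed known pulse
sigma = 0.5                                    # assumed rms noise level

def output_snr(h):
    num = sum(a * b for a, b in zip(pulse, h)) ** 2
    den = sigma ** 2 * sum(v * v for v in h)
    return num / den

matched_snr = output_snr(pulse)               # filter matched to the pulse
flat_snr    = output_snr([1.0] * len(pulse))  # arbitrary mismatched filter

# The matched filter attains the Cauchy-Schwarz bound: pulse energy / sigma^2.
assert abs(matched_snr - sum(v * v for v in pulse) / sigma ** 2) < 1e-9
print(matched_snr > flat_snr)   # → True
```

Any filter not proportional to the pulse, such as the flat filter above, yields a strictly smaller peak-signal-to-rms-noise ratio.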

The signal-extraction approaches of Wiener and North, formalized in the language of statistics in the early 1950s by several researchers, were the beginnings of what is today called *statistical signal detection* and *estimation theory*. In considering the design of receivers utilizing all the information available at the channel output, Woodward and Davies determined that this so-called ideal receiver computes the probabilities of the received waveform given the possible transmitted messages. These computed probabilities are known as *a posteriori* probabilities. The ideal receiver then makes the decision that the transmitted message was the one corresponding to the largest *a posteriori* probability. This *maximum a posteriori* principle, as it is called, is one of the cornerstones of detection and estimation theory. Another development that had far-reaching consequences in the development of detection theory was the application of generalized vector space ideas.
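The maximum a posteriori principle can be sketched for binary antipodal signaling in Gaussian noise (the signal levels, noise variance, and prior probabilities below are illustrative assumptions). The receiver computes the posterior probability of each possible message via Bayes' rule and decides in favor of the larger one:

```python
import math

# MAP-receiver sketch (levels, noise variance, and priors are assumed values).
# Received sample: r = s + n, with s in {+1, -1} and n ~ N(0, sigma^2).

sigma = 0.8
priors = {+1: 0.7, -1: 0.3}          # assumed a priori message probabilities

def likelihood(r, s):
    # Gaussian likelihood p(r | s).
    return math.exp(-(r - s) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def map_decide(r):
    # A posteriori probability of each message, via Bayes' rule.
    post = {s: likelihood(r, s) * p for s, p in priors.items()}
    total = sum(post.values())
    post = {s: v / total for s, v in post.items()}
    # Decide on the message with the largest a posteriori probability.
    return max(post, key=post.get), post

decision, posteriors = map_decide(0.2)
print(decision)                      # → 1
```

Note that the unequal priors shift the decision threshold toward the less likely message, which is exactly what distinguishes MAP detection from maximum-likelihood detection.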

#### 2. Information Theory and Coding

The basic problem that Shannon considered is, “Given a message source, how shall the messages produced be represented so as to maximize the information conveyed through a given channel?” Although Shannon formulated his theory for both discrete and analog sources, we will think here in terms of discrete systems. Clearly, a basic consideration in this theory is a measure of information. Once a suitable measure has been defined, the next step is to define the information-carrying capacity, or simply capacity, of a channel as the maximum rate at which information can be conveyed through it. The obvious question that now arises is, “Given a channel, how closely can we approach the capacity of the channel, and what is the quality of the received message?” A most surprising, and the singularly most important, result of Shannon’s theory is that by suitably restructuring the transmitted signal, we can transmit information through a channel *at any rate less than the channel capacity with arbitrarily small error*, despite the presence of noise, provided we have an arbitrarily long time available for transmission. This is the gist of Shannon’s second theorem.

The proof of this theorem proceeds by selecting codewords at random from the set of 2^n possible binary sequences *n* digits long at the channel input. The probability of error in receiving a given *n*-digit sequence, when averaged over all possible code selections, becomes arbitrarily small as *n* becomes arbitrarily large. Thus, many suitable codes exist, *but we are not told how to find these codes*. Indeed, this has been the dilemma of information theory since its inception and is an area of active research. In recent years, great strides have been made in finding good coding and decoding techniques that are implementable with a reasonable amount of hardware and require only a reasonable amount of time to decode.
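For a concrete instance of channel capacity in the discrete setting, consider the binary symmetric channel with crossover probability p, whose capacity is C = 1 − H(p) bits per channel use, where H is the binary entropy function. Shannon's second theorem says any rate R < C is achievable with arbitrarily small error. A minimal sketch (the channel parameters are illustrative):

```python
import math

# Capacity of a binary symmetric channel (BSC) with crossover probability p:
# C = 1 - H(p) bits per channel use, H being the binary entropy function.
# The example values of p below are illustrative.

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    return 1.0 - binary_entropy(p)

print(round(bsc_capacity(0.0), 3))   # → 1.0 (noiseless channel)
print(round(bsc_capacity(0.5), 3))   # → 0.0 (output independent of input)
print(round(bsc_capacity(0.11), 3))  # ≈ 0.5
```

Note that a crossover probability of about 0.11 already halves the capacity, which is why codes operating near capacity on noisy channels are so valuable.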

Perhaps the most astounding development in the recent history of coding was the invention of turbo coding by French researchers, published in 1993. Their results, which were subsequently verified by several researchers, showed performance to within a fraction of a decibel of the Shannon limit.