In my last blog post, I provided a short introduction to multiple-input multiple-output (MIMO) technology. In this blog post, I discuss the fundamental limit on the maximum error-free data rate that can be supported by MIMO channels. The maximum error-free data rate that a channel can support is called the channel capacity. The channel capacity for additive white Gaussian noise (AWGN) channels was first derived by Claude Shannon in 1948 [1]. MIMO channels exhibit fading, encompass a spatial dimension, and are therefore very different from AWGN channels.

In this blog series, I discuss the following aspects of MIMO channel capacity:

  • MIMO channel capacity – Information theoretic derivation
  • MIMO channel capacity – Importance of channel knowledge
  • MIMO channel capacity – Ergodic capacity

MIMO channel capacity – Information theoretic derivation

Consider a MIMO channel with Mt transmit antennas and Mr receive antennas. For simplicity, the channel is assumed to be frequency flat with a bandwidth of 1 Hz. The channel transfer matrix is denoted by H and has dimension Mr × Mt. The input-output relation for the MIMO channel is given as

    y = √(Es/Mt) H s + n        (1)

where y is the Mr×1 received signal vector, s is the transmit signal vector of dimension Mt×1, and n is the Mr×1 spatio-temporally white zero mean circularly symmetric complex Gaussian (ZMCSCG) noise vector with variance No in each dimension. Es is the total average energy available at the transmitter over a symbol period (this is equal to the total average transmit power, since the symbol period is 1 second). The covariance matrix of s, Rss = E{ss^H} (s is assumed to have zero mean), must satisfy Tr(Rss) = Mt in order to constrain the total average energy transmitted over a symbol period.
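The input-output relation above is easy to sketch numerically. The following minimal example (assuming NumPy and an i.i.d. Rayleigh-fading channel; the specific parameter values are purely illustrative) draws one realization of H, s, and n and forms the received vector y:

```python
import numpy as np

rng = np.random.default_rng(0)

Mt, Mr = 2, 4        # transmit / receive antennas (illustrative)
Es, No = 1.0, 0.1    # total transmit energy per symbol, noise variance per dimension

# Channel matrix H (Mr x Mt) with i.i.d. CN(0,1) entries (Rayleigh fading assumption)
H = (rng.standard_normal((Mr, Mt)) + 1j * rng.standard_normal((Mr, Mt))) / np.sqrt(2)

# Transmit vector s with covariance Rss = I_Mt, so Tr(Rss) = Mt as required
s = (rng.standard_normal(Mt) + 1j * rng.standard_normal(Mt)) / np.sqrt(2)

# ZMCSCG noise vector with variance No in each dimension
n = np.sqrt(No) * (rng.standard_normal(Mr) + 1j * rng.standard_normal(Mr)) / np.sqrt(2)

# Input-output relation: y = sqrt(Es/Mt) H s + n
y = np.sqrt(Es / Mt) * H @ s + n
print(y.shape)  # (4,)
```

Note that the √(Es/Mt) scaling, together with Tr(Rss) = Mt, is what keeps the total average transmitted energy per symbol equal to Es regardless of the number of transmit antennas.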

Assuming a deterministic channel, the capacity of the MIMO channel is defined as [2] [3]

    C = max over f(s), Tr(Rss) = Mt of I(s; y)        (2)

where f(s) is the probability distribution of the vector s, and I(s; y) is the mutual information between the vectors s and y. Note that

    I(s; y) = H(y) − H(y|s)        (3)

where H(y) is the differential entropy of the vector y, while H(y|s) is the conditional differential entropy of vector y given knowledge of vector s. Since the vectors s and n are independent, H(y|s) = H(n). Eq. (3) therefore simplifies to

    I(s; y) = H(y) − H(n)        (4)

As we have no control over the noise, maximizing I(s; y) reduces to maximizing H(y). The covariance matrix of y, Ryy = E{yy^H}, satisfies

    Ryy = (Es/Mt) H Rss H^H + No I_Mr        (5)

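This covariance expression can be checked by Monte Carlo simulation. The sketch below (NumPy, with the feasible choice Rss = I_Mt and illustrative parameters) compares the sample covariance of y against the closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
Mt, Mr = 2, 3
Es, No = 2.0, 0.5
N = 200_000  # number of symbol realizations

H = (rng.standard_normal((Mr, Mt)) + 1j * rng.standard_normal((Mr, Mt))) / np.sqrt(2)
Rss = np.eye(Mt)  # satisfies Tr(Rss) = Mt

# N transmit vectors s ~ ZMCSCG with covariance I_Mt, and N noise vectors
S = (rng.standard_normal((Mt, N)) + 1j * rng.standard_normal((Mt, N))) / np.sqrt(2)
Nz = np.sqrt(No) * (rng.standard_normal((Mr, N)) + 1j * rng.standard_normal((Mr, N))) / np.sqrt(2)

Y = np.sqrt(Es / Mt) * H @ S + Nz

# Sample estimate of E{y y^H} vs. the closed form (Es/Mt) H Rss H^H + No I_Mr
Ryy_sample = (Y @ Y.conj().T) / N
Ryy_theory = (Es / Mt) * H @ Rss @ H.conj().T + No * np.eye(Mr)

print(np.max(np.abs(Ryy_sample - Ryy_theory)))  # shrinks as O(1/sqrt(N))
```

The residual is pure estimation noise; increasing N drives the sample covariance toward the analytical expression.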
Among all vectors y with covariance matrix Ryy, the differential entropy H(y) is maximized when y is ZMCSCG [4]. This implies that s must also be a ZMCSCG vector, whose distribution is completely characterized by Rss. The differential entropies of y and n are then given by

    H(y) = log2 det(πe Ryy)        (6)
    H(n) = log2 det(πe No I_Mr)        (7)

Therefore, I(s; y) reduces to

    I(s; y) = log2 det(I_Mr + (Es/(Mt No)) H Rss H^H)        (8)

Thus, the capacity of the MIMO channel is given by [3]

    C = max over Rss, Tr(Rss) = Mt of log2 det(I_Mr + (Es/(Mt No)) H Rss H^H)        (9)

The capacity C in (9) is also referred to as the error-free spectral efficiency, or the data rate per unit bandwidth that can be sustained reliably over the MIMO link. Thus, given a bandwidth of W Hz, the maximum achievable data rate over this bandwidth using the MIMO channel is simply WC bps.
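For a known channel realization, the log-det expression in (9) is straightforward to evaluate. The sketch below (NumPy; the equal-power choice Rss = I_Mt is one feasible input covariance satisfying the trace constraint, not in general the maximizing one) computes the resulting spectral efficiency and the corresponding data rate over an illustrative bandwidth W:

```python
import numpy as np

rng = np.random.default_rng(2)
Mt, Mr = 4, 4
Es, No = 1.0, 0.1   # Es/No = 10 dB (illustrative)
W = 20e6            # bandwidth in Hz (illustrative)

H = (rng.standard_normal((Mr, Mt)) + 1j * rng.standard_normal((Mr, Mt))) / np.sqrt(2)

# Spectral efficiency for Rss = I_Mt (equal power across transmit antennas):
# log2 det(I_Mr + (Es/(Mt*No)) H H^H), in bits/s/Hz
A = np.eye(Mr) + (Es / (Mt * No)) * H @ H.conj().T
C = np.log2(np.linalg.det(A).real)

rate = W * C  # achievable data rate over W Hz, in bits per second
print(C, rate)
```

The determinant of the Hermitian positive-definite matrix A is real and greater than 1, so C is always positive; how close the equal-power choice comes to the true maximum in (9) depends on what the transmitter knows about H, which is the subject of the next part of this series.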


[1] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, 1948.
[2] G. J. Foschini, "Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas," Bell Labs Technical Journal, vol. 1, no. 2, 1996.
[3] E. Telatar, "Capacity of multi-antenna Gaussian channels," European Transactions on Telecommunications, vol. 10, 1999.
[4] F. D. Neeser and J. L. Massey, "Proper complex random processes with applications to information theory," IEEE Transactions on Information Theory, vol. 39, pp. 1293-1302, 1993.