Massive MIMO Has Unlimited Capacity

Emil Björnson, Member, IEEE, Jakob Hoydis, Member, IEEE, Luca Sanguinetti, Senior Member, IEEE

Abstract—The capacity of cellular networks can be improved by the unprecedented array gain and spatial multiplexing offered by Massive MIMO. Since its inception, the coherent interference caused by pilot contamination has been believed to create a finite capacity limit, as the number of antennas goes to infinity. In this paper, we prove that this is incorrect and an artifact from using simplistic channel models and suboptimal precoding/combining schemes. We show that with multicell MMSE precoding/combining and a tiny amount of spatial channel correlation or large-scale fading variations over the array, the capacity increases without bound as the number of antennas increases, even under pilot contamination. More precisely, the result holds when the channel covariance matrices of the contaminating users are asymptotically linearly independent, which is generally the case. If also the diagonals of the covariance matrices are linearly independent, it is sufficient to know these diagonals (and not the full covariance matrices) to achieve an unlimited asymptotic capacity.

Index Terms—Massive MIMO, ergodic capacity, asymptotic analysis, spatial correlation, multi-cell MMSE processing, pilot contamination.

I. INTRODUCTION

The Shannon capacity of a channel manifests the spectral efficiency (SE) that it supports. Massive MIMO (multiple-input multiple-output) improves the sum SE of cellular networks by spatial multiplexing of a large number of user equipments (UEs) per cell [1]. It is therefore considered a key time-division duplex (TDD) technology for the next generation of cellular networks [2]–[4]. The main difference between Massive MIMO and classical multiuser MIMO is the large number of antennas, $M$, at each base station (BS) whose signals are processed by individual radio-frequency chains. By exploiting channel estimates for coherent receive combining, the uplink signal power of a desired UE is reinforced by a factor $M$, while the power of the noise and independent interference does not increase. The same principle holds for the transmit precoding in the downlink. Since the channel estimates are obtained by uplink pilot signaling and the pilot resources are limited by the channel coherence time, the same pilots must be reused in multiple cells. This leads to pilot contamination, which has two main consequences: the channel estimation quality is reduced due to pilot interference and the channel estimate of a desired UE is correlated with the channels to the interfering UEs that use the same pilot. Marzetta showed in his seminal paper [1] that the interference from these UEs during data transmission is also reinforced by a factor $M$, under the assumptions of maximum ratio (MR) combining/precoding and independent and identically distributed (i.i.d.) Rayleigh fading channels. This means that pilot contamination creates a finite SE limit as $M \to \infty$.

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

E. Björnson is with the Department of Electrical Engineering (ISY), Linköping University, 58183 Linköping, Sweden (emil.bjornson@liu.se). J. Hoydis is with Nokia Bell Labs, Paris-Saclay, 91620 Nozay, France (jakob.hoydis@nokia-bell-labs.com). L. Sanguinetti is with the University of Pisa, Dipartimento di Ingegneria dell’Informazione, 56122 Pisa, Italy (luca.sanguinetti@unipi.it) and also with the Large Systems and Networks Group (LANEAS), CentraleSupélec, Université Paris-Saclay, 3 rue Joliot-Curie, 91192 Gif-sur-Yvette, France. The authors have contributed equally to this work and are listed alphabetically.

This research has been supported by ELLIIT, the Swedish Foundation for Strategic Research (SFF), the EU FP7 under ICT-619086 (MAMMOET), and the ERC Starting Grant 305123 MORE.

Parts of this paper were presented at the International Conference on Communications (ICC), 21–25 May, 2017, Paris, France.

The large-antenna limit has also been studied for other combining/precoding schemes, such as the minimum mean squared error (MMSE) scheme. Single-cell MMSE (S-MMSE) was considered in [5]–[7], while multicell MMSE (M-MMSE) was considered in [8], [9]. The difference is that with M-MMSE, the BS makes use of estimates of the channels from the UEs in all cells, while with S-MMSE, the BS only uses channel estimates of the UEs in its own cell. In both cases, the SE was proved to have a finite limit as $M \to \infty$, under the assumption of i.i.d. Rayleigh fading channels (i.e., no spatial correlation). In contrast, there are special cases of spatially correlated fading that give rise to rank-deficient covariance matrices [10]–[12]. If the UEs that share a pilot have rank-deficient covariance matrices with orthogonal support, then pilot contamination vanishes and the SE can increase without bound. The covariance matrices $\mathbf{R}_1$ and $\mathbf{R}_2$ have orthogonal support if $\mathbf{R}_1 \mathbf{R}_2 = \mathbf{0}$. To understand this condition, note that for arbitrary covariance matrices

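$$\mathbf{R}_1 = \begin{bmatrix} a & c \\ c^\star & b \end{bmatrix}, \qquad \mathbf{R}_2 = \begin{bmatrix} d & f \\ f^\star & e \end{bmatrix} \tag{1}$$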

every element of $\mathbf{R}_1 \mathbf{R}_2$ must be zero. The first element is $ad + cf^\star$. If we model the practical covariance matrices of two randomly located UEs as realizations of a random variable with continuous distribution, then $ad + cf^\star = 0$ occurs with zero probability.^1 Hence, orthogonal support is very unlikely in practice, although one can find special cases where it is satisfied. The one-ring model for uniform linear arrays (ULAs) gives orthogonal support if the channels have non-overlapping angular support [10]–[12], but the ULA microwave measurements in [13] show that the angular support of practical channels is highly irregular and does not lead to orthogonal support. In conclusion, practical covariance matrices do not have orthogonal support, at least not at microwave frequencies.

^1 For any continuous random variable $x$, the probability that $x$ takes a particular realization is zero, while the probability that $x$ takes a realization in a certain interval can be non-zero. Hence, if $x = ad + cf^\star$ then $x = 0$ occurs with zero probability.
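For reference, the full product behind this argument can be written out. The worked step below uses only the entries $a, b, c, d, e, f$ of the two covariance matrices defined above; orthogonal support requires all four entries to vanish simultaneously.

```latex
\mathbf{R}_1 \mathbf{R}_2 =
\begin{bmatrix} a & c \\ c^\star & b \end{bmatrix}
\begin{bmatrix} d & f \\ f^\star & e \end{bmatrix}
=
\begin{bmatrix}
ad + cf^\star & af + ce \\
c^\star d + bf^\star & c^\star f + be
\end{bmatrix}
```

Requiring all four entries to be zero is at least as restrictive as requiring $ad + cf^\star = 0$ alone, which is the only entry the argument above needs.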

arXiv:1705.00538v4 [cs.IT] 9 Nov 2017

The literature contains several categories of methods for mitigation of pilot contamination, also known as pilot decontamination. The first category allocates pilots to the UEs in an attempt to find combinations where the covariance matrices have relatively different support [10]–[12], [14]. This method can substantially reduce pilot contamination, but can only remove the finite limit in the unlikely special case when the covariance matrices have orthogonal support. The second category utilizes semi-blind estimation to separate the subspace of desired UE channels from the subspace of interfering channels [15]–[19]. This method can fully remove pilot contamination if $M$ and the size of the channel coherence block go jointly to infinity [18]. Unfortunately, the channel coherence is fixed and finite in practice (this is why we cannot give unique pilots to every cell), thus we cannot approach this limit in practice. The third category uses multiple pilot phases with different pilot sequences to successively eliminate pilot contamination [20], [21], without the need for statistical information. However, the total pilot length is larger than or equal to the total number of UEs, which would allow allocating mutually orthogonal pilots to all UEs and thus trivially avoiding the pilot contamination problem. This is not a scalable solution for networks with many cells. The fourth category is pilot contamination precoding that rejects interference by coherent joint transmission/reception over the entire network [22], [23]. This method appears to achieve an unbounded SE, but this has not been formally proved and requires that the data for all UEs is available at every BS, which might not be feasible in practice. In summary, it appears that pilot contamination is a fundamental issue that manifests a finite SE limit, except in unlikely special cases.
We show in this paper that this is basically a misunderstanding, spurred by the popularity of analyzing suboptimal combining/precoding schemes, such as MR and S-MMSE, and focusing on unrealistic i.i.d. Rayleigh fading channels (as in the prior work [8], [9] on M-MMSE). We prove that the SE increases without bound in the presence of pilot contamination when using M-MMSE combining/precoding, if the pilot-sharing UEs have asymptotically linearly independent covariance matrices. Note that $\mathbf{R}_1$ and $\mathbf{R}_2$ in (1) are linearly independent if $[a\ b\ c]^T$ and $[d\ e\ f]^T$ are non-parallel vectors, which happens almost surely for randomly generated covariance matrices. Hence, our results rely on a condition that is most likely satisfied in practice; it is the general case, while prior works on the asymptotics of Massive MIMO have considered practically unlikely special cases. In contrast to prior work, no multicell cooperation is utilized herein and there is no need for orthogonal support of covariance matrices. In the conference paper [24], we proved the main result in a two-user uplink scenario.^2 In this paper, we prove the result for both uplink and downlink in a general setting. Section II proves and explains the intuition of the results in a two-user setup, while Section III generalizes the results to a multicell setup. The results are demonstrated numerically in Section IV and the main conclusions are summarized in Section V.

^2 After submitting our conference paper [24], the related work [25] appeared. That paper considers the mean squared error in the uplink data detection of a single cell with multiple UEs per pilot sequence. The authors show that the error goes asymptotically to zero when having linearly independent covariance matrices. However, the paper [25] contains no mathematical analysis of the achievable SE.

Notation: The Frobenius and spectral norms of a matrix $\mathbf{X}$ are denoted by $\|\mathbf{X}\|_F$ and $\|\mathbf{X}\|_2$, respectively. The superscripts $^T$, $^\star$ and $^H$ denote transpose, conjugate, and Hermitian transpose, respectively. We use $\triangleq$ to denote definitions, whereas $\mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{R})$ denotes the circularly symmetric complex Gaussian distribution with zero mean and covariance matrix $\mathbf{R}$. The expected value of a random variable $x$ is denoted by $\mathbb{E}\{x\}$ and the variance is denoted by $\mathbb{V}\{x\}$. The $N \times N$ identity matrix is denoted by $\mathbf{I}_N$, while $\mathbf{0}_N$ is an $N \times N$ all-zero matrix and $\mathbf{1}_N$ is an $N \times 1$ all-one vector. We use $a_n \asymp b_n$ to denote $a_n - b_n \to_{n \to \infty} 0$ (almost surely (a.s.)) for two (random) sequences $a_n$, $b_n$.

II. ASYMPTOTIC SPECTRAL EFFICIENCY IN A TWO-USER SCENARIO

In this section, we prove and explain our main result in a two-user scenario, where a BS equipped with $M$ antennas communicates with UE 1 and UE 2 that are using the same pilot. This setup is sufficient to demonstrate why M-MMSE combining and precoding reject the coherent interference caused by pilot contamination. We consider a block-fading model where each channel takes one realization in a coherence block of $\tau_c$ channel uses and independent realizations across blocks. We denote by $\mathbf{h}_k \in \mathbb{C}^M$ the channel from UE $k$ to the BS and consider Rayleigh fading with $\mathbf{h}_k \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{R}_k)$ for $k = 1, 2$, where $\mathbf{R}_k \in \mathbb{C}^{M \times M}$ with^3 $\mathrm{tr}(\mathbf{R}_k) > 0$ is the channel covariance matrix, which is assumed to be known at the BS. The Gaussian distribution models the small-scale fading whereas the covariance matrix $\mathbf{R}_k$ describes the macroscopic effects. The normalized trace $\beta_k = \frac{1}{M}\mathrm{tr}(\mathbf{R}_k)$ determines the average large-scale fading between UE $k$ and the BS, while the eigenstructure of $\mathbf{R}_k$ describes the spatial channel correlation. A special case that is convenient for analysis is i.i.d. Rayleigh fading with $\mathbf{R}_k = \beta_k \mathbf{I}_M$ [26], but it only arises in fully isotropic fading environments. In general, each covariance matrix has spatial correlation and large-scale fading variations over the array, represented by non-zero off-diagonal elements and non-identical diagonal elements, respectively.
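The channel model can be sketched in a few lines of Python. The sketch below is restricted to diagonal covariance matrices so that only the standard library is needed; the function names, the seed and the numerical values are illustrative assumptions, not taken from the paper.

```python
import math
import random

def sample_channel(r_diag):
    """Draw h ~ N_C(0, R) for a diagonal R given by its diagonal entries."""
    h = []
    for r in r_diag:
        s = math.sqrt(r / 2.0)  # real and imaginary parts each carry variance r/2
        h.append(complex(random.gauss(0.0, s), random.gauss(0.0, s)))
    return h

def normalized_trace(r_diag):
    """beta = (1/M) tr(R): the average large-scale fading over the array."""
    return sum(r_diag) / len(r_diag)

random.seed(1)
M = 64
# Hypothetical covariance with large-scale fading variations over the array:
# the first half of the antennas sees twice the average power of the rest.
r_diag = [2.0] * (M // 2) + [1.0] * (M // 2)
beta = normalized_trace(r_diag)

# Averaging ||h||^2 / M over independent coherence blocks approaches beta.
blocks = 2000
avg_power = sum(
    sum(abs(x) ** 2 for x in sample_channel(r_diag)) / M for _ in range(blocks)
) / blocks
print(beta, avg_power)
```

A non-diagonal $\mathbf{R}_k$ would additionally need a matrix square root to color the i.i.d. samples, which is omitted here.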

A. Uplink Channel Estimation

We assume that the BS and UEs are perfectly synchronized and operate according to a TDD protocol wherein the data transmission phase is preceded by an uplink pilot phase for channel estimation. Both UEs use the same $\tau_p$-length pilot sequence $\boldsymbol{\phi} \in \mathbb{C}^{\tau_p}$ with elements such that $\|\boldsymbol{\phi}\|^2 = \boldsymbol{\phi}^H \boldsymbol{\phi} = 1$. The received uplink signal $\mathbf{Y}^p \in \mathbb{C}^{M \times \tau_p}$ at the BS is given by

$$\mathbf{Y}^p = \sqrt{\rho^{\mathrm{tr}}}\,\mathbf{h}_1 \boldsymbol{\phi}^T + \sqrt{\rho^{\mathrm{tr}}}\,\mathbf{h}_2 \boldsymbol{\phi}^T + \mathbf{N}^p \tag{2}$$

where $\rho^{\mathrm{tr}}$ is the normalized pilot power and $\mathbf{N}^p \in \mathbb{C}^{M \times \tau_p}$ is the normalized receiver noise with all elements independently distributed as $\mathcal{N}_{\mathbb{C}}(0, 1)$. The matrix $\mathbf{Y}^p$ is the observation that

^3 This assumption implies that there is non-zero energy received from and transmitted to each UE.

M → ∞. Hence, an unlimited asymptotic SE is simultaneously achievable for both UEs. Since the SE is a lower bound on capacity, we conclude that the asymptotic capacity is also unlimited.

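Observe that if $\mathbf{R}_1$ and $\mathbf{R}_2$ are linearly dependent, i.e., $\mathbf{R}_1 = \eta \mathbf{R}_2$, then Assumption 2 does not hold. Under these circumstances, $\hat{\mathbf{h}}_2 = \frac{1}{\eta}\hat{\mathbf{h}}_1$ and by applying Lemma 5 in Appendix A we obtain

$$\gamma_1^{\mathrm{ul}} = \frac{\hat{\mathbf{h}}_1^H \mathbf{Z}^{-1} \hat{\mathbf{h}}_1}{1 + \frac{1}{\eta^2}\hat{\mathbf{h}}_1^H \mathbf{Z}^{-1} \hat{\mathbf{h}}_1}$$

from which it is straightforward to show that $\gamma_1^{\mathrm{ul}} \asymp \eta^2$ (by dividing and multiplying each term by $M$ and using Lemma 3 in Appendix A). This implies that $\mathrm{SE}_1^{\mathrm{ul}}$ converges to a finite quantity when $M \to \infty$, as Marzetta showed in his seminal paper [1] for the special case of $\mathbf{R}_1 = \eta \mathbf{R}_2 = \mathbf{I}_M$.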
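The saturation at $\eta^2$ can be checked numerically. The sketch below assumes that the SINR with MMSE combining has the form $\hat{\mathbf{h}}_1^H(\hat{\mathbf{h}}_2\hat{\mathbf{h}}_2^H + \mathbf{Z})^{-1}\hat{\mathbf{h}}_1$ (the two-user counterpart of (34)) and that $\mathbf{Z}$ is diagonal, so the inverse can be expanded with the Sherman-Morrison identity. The function name, the choice $\mathbf{Z} = 2\mathbf{I}_M$ and the example vectors are illustrative assumptions.

```python
def mmse_sinr(h1, h2, z_diag):
    """h1^H (h2 h2^H + Z)^{-1} h1 for a diagonal Z, via Sherman-Morrison."""
    def form(x, y):
        # Quadratic form x^H Z^{-1} y for diagonal Z.
        return sum(a.conjugate() * b / z for a, b, z in zip(x, y, z_diag))
    a11, a12, a22 = form(h1, h1), form(h1, h2), form(h2, h2)
    return (a11 - abs(a12) ** 2 / (1 + a22)).real

eta = 2.0
for M in (10, 100, 1000, 10000):
    z = [2.0] * M                       # hypothetical diagonal Z
    h1 = [1.0] * M
    h2_parallel = [x / eta for x in h1]  # the case R1 = eta * R2
    dependent = mmse_sinr(h1, h2_parallel, z)
    # Two estimates that are not parallel (they differ on half of the antennas).
    h1_other = [2.0] * (M // 2) + [1.0] * (M - M // 2)
    independent = mmse_sinr(h1_other, [1.0] * M, z)
    print(M, round(dependent, 3), round(independent, 1))
```

With parallel estimates the value stays below $\eta^2$ and approaches it as $M$ grows, whereas the non-parallel pair gives a value that keeps growing with $M$.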

C. Downlink Data Transmission

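During the downlink data transmission, the BS transmits the signal $\mathbf{x} \in \mathbb{C}^M$. This signal is given by $\mathbf{x} = \sqrt{\rho^{\mathrm{dl}}}\,\mathbf{w}_1 \varsigma_1 + \sqrt{\rho^{\mathrm{dl}}}\,\mathbf{w}_2 \varsigma_2$, where $\varsigma_k \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ is the information-bearing signal transmitted to UE $k$, $\rho^{\mathrm{dl}}$ is the normalized downlink transmit power, and $\mathbf{w}_k$ is the precoding vector associated with UE $k$. This precoding vector satisfies $\mathbb{E}\{\|\mathbf{w}_k\|^2\} = 1$, so that $\mathbb{E}\{\|\sqrt{\rho^{\mathrm{dl}}}\,\mathbf{w}_k \varsigma_k\|^2\} = \rho^{\mathrm{dl}}$ is the downlink transmit power allocated to UE $k$. The received downlink signal $z_1$ at UE 1 is^4

$$\begin{aligned}
z_1 &= \sqrt{\rho^{\mathrm{dl}}}\,\mathbf{h}_1^H \mathbf{w}_1 \varsigma_1 + \sqrt{\rho^{\mathrm{dl}}}\,\mathbf{h}_1^H \mathbf{w}_2 \varsigma_2 + n_1 \\
&= \sqrt{\rho^{\mathrm{dl}}}\,\mathbb{E}\{\mathbf{h}_1^H \mathbf{w}_1\}\,\varsigma_1 + \sqrt{\rho^{\mathrm{dl}}}\,\big(\mathbf{h}_1^H \mathbf{w}_1 - \mathbb{E}\{\mathbf{h}_1^H \mathbf{w}_1\}\big)\varsigma_1 + \sqrt{\rho^{\mathrm{dl}}}\,\mathbf{h}_1^H \mathbf{w}_2 \varsigma_2 + n_1
\end{aligned} \tag{10}$$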

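where $n_1 \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ is the normalized receiver noise. The first term in (10) is the desired signal received over the deterministic average precoded channel $\mathbb{E}\{\mathbf{h}_1^H \mathbf{w}_1\}$, while the remaining terms are random variables with unknown realizations. By treating these terms as noise in the signal detection [5], [26], the downlink ergodic channel capacity of UE 1 can be lower bounded by

$$\mathrm{SE}_1^{\mathrm{dl}} = \left(1 - \frac{\tau_p}{\tau_c}\right) \log_2\!\left(1 + \gamma_1^{\mathrm{dl}}\right) \quad \text{[bit/s/Hz]} \tag{11}$$

with the effective SINR

$$\gamma_1^{\mathrm{dl}} = \frac{|\mathbb{E}\{\mathbf{h}_1^H \mathbf{w}_1\}|^2}{\mathbb{E}\{|\mathbf{h}_1^H \mathbf{w}_2|^2\} + \mathbb{V}\{\mathbf{h}_1^H \mathbf{w}_1\} + \frac{1}{\rho^{\mathrm{dl}}}} \tag{12}$$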

^4 For notational convenience, we treat $\mathbf{h}_1^H$ and $\mathbf{h}_2^H$ as the downlink channels, instead of $\mathbf{h}_1^T$ and $\mathbf{h}_2^T$. This has no impact on the SE since the difference is only in a complex conjugate.

Since UE 1 only needs to know $\mathbb{E}\{\mathbf{h}_1^H \mathbf{w}_1\}$ and the total variance of the second to fourth term in (10), the SE in (11) is achievable in the absence of downlink channel estimation. In contrast to the uplink, there is no precoding that is always optimal [28]. However, motivated by uplink-downlink duality [9], a reasonable suboptimal choice is the so-called MMSE precoding

$$\mathbf{w}_k = \frac{\mathbf{v}_k}{\sqrt{\mathbb{E}\{\|\mathbf{v}_k\|^2\}}} = \sqrt{\vartheta_k}\left(\sum_{i=1}^{2} \hat{\mathbf{h}}_i \hat{\mathbf{h}}_i^H + \mathbf{Z}\right)^{-1} \hat{\mathbf{h}}_k \tag{13}$$

where $\mathbf{v}_k = \big(\sum_{i=1}^{2} \hat{\mathbf{h}}_i \hat{\mathbf{h}}_i^H + \mathbf{Z}\big)^{-1} \hat{\mathbf{h}}_k$ is MMSE combining and $\vartheta_k = (\mathbb{E}\{\|\mathbf{v}_k\|^2\})^{-1}$ is a scaling factor. The following is the second main result of this paper:

Theorem 2. If MMSE precoding is used, then under Assumptions 1 and 2 the effective SINR $\gamma_1^{\mathrm{dl}}$ increases unboundedly as $M \to \infty$. Hence, $\mathrm{SE}_1^{\mathrm{dl}}$ increases unboundedly as $M \to \infty$.

Proof: The proof is given in Appendix D.

This theorem shows that, under the same conditions as in the uplink, the downlink SE (and thus the capacity) increases without bound as $M \to \infty$. The asymptotic SE growth is proportional to $\log_2(M)$, since the proof in Appendix D shows that $\gamma_1^{\mathrm{dl}}/M$ has a non-zero asymptotic limit. UE 2 can simultaneously achieve an unbounded SE, which is proved directly by interchanging the UE indices.
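To illustrate the $\log_2(M)$ growth, the sketch below evaluates an SE of the form in (11), with pre-log factor $1 - \tau_p/\tau_c$, under the assumption that the SINR behaves as $cM$ for large $M$. The constant `c` and the values of `tau_p` and `tau_c` are hypothetical; the text only states that the SINR divided by $M$ has a non-zero limit.

```python
import math

def downlink_se(gamma, tau_p, tau_c):
    """Pre-log factor (1 - tau_p/tau_c) times log2(1 + SINR), in bit/s/Hz."""
    return (1.0 - tau_p / tau_c) * math.log2(1.0 + gamma)

tau_p, tau_c = 10, 200   # hypothetical pilot length and coherence block size
c = 0.1                  # hypothetical asymptotic value of SINR / M

for M in (100, 1000, 10000, 100000):
    print(M, round(downlink_se(c * M, tau_p, tau_c), 2))

# Once c*M >> 1, each doubling of M adds roughly (1 - tau_p/tau_c) bit/s/Hz.
gain = downlink_se(c * 2e6, tau_p, tau_c) - downlink_se(c * 1e6, tau_p, tau_c)
print(gain)
```

The growth is unbounded but slow: every additional fixed increment in SE requires multiplying the number of antennas by a constant factor.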

D. Interpretation and Generality

Theorems 1 and 2 show that the SE (and thus the capacity) under pilot contamination is asymptotically unlimited if Assumption 2 holds. To gain an intuitive interpretation of this underlying assumption, recall from (3) that $\hat{\mathbf{h}}_1 = \mathbf{R}_1 \mathbf{a}$ and $\hat{\mathbf{h}}_2 = \mathbf{R}_2 \mathbf{a}$, where $\mathbf{a} = \frac{1}{\sqrt{\rho^{\mathrm{tr}}}} \mathbf{Q}^{-1} \mathbf{Y}^p \boldsymbol{\phi}^\star$ is the same for both UEs. Hence, $\hat{\mathbf{h}}_1$ and $\hat{\mathbf{h}}_2$ are (asymptotically) linearly independent when $\mathbf{R}_1$ and $\mathbf{R}_2$ are (asymptotically) linearly independent, except for special choices of $\mathbf{a}$. As illustrated in Fig. 1, it is then possible to find a combining vector $\mathbf{v}_1$ (or precoding vector $\mathbf{w}_1$) that is orthogonal to $\hat{\mathbf{h}}_2$, while being non-orthogonal to $\hat{\mathbf{h}}_1$. Similarly, one can find $\mathbf{v}_2$ (and $\mathbf{w}_2$) such that $\mathbf{v}_2^H \hat{\mathbf{h}}_1 = 0$ and $\mathbf{v}_2^H \hat{\mathbf{h}}_2 \neq 0$. For example, if we define $\hat{\mathbf{H}} = [\hat{\mathbf{h}}_1\ \hat{\mathbf{h}}_2] \in \mathbb{C}^{M \times 2}$, then the zero-forcing (ZF) combining vectors

$$[\mathbf{v}_1\ \mathbf{v}_2] = \hat{\mathbf{H}} \big(\hat{\mathbf{H}}^H \hat{\mathbf{H}}\big)^{-1} \tag{14}$$

satisfy these conditions. Note that $\hat{\mathbf{H}}^H \hat{\mathbf{H}}$ is only invertible if the channel estimates (columns in $\hat{\mathbf{H}}$) are linearly independent. Using ZF as defined in (14), we get $\mathbf{v}_1^H \hat{\mathbf{h}}_2 = 0$ and $\mathbf{v}_1^H \hat{\mathbf{h}}_1 = 1$. If the channel estimates are also asymptotically linearly independent, it follows^5 that $\|\mathbf{v}_1\|^2 \to 0$ as $M \to \infty$; that is, we can reject the coherent interference and get unit signal gain, while at the same time using the array gain to make the noise term $\frac{1}{\rho^{\mathrm{ul}}} \mathbf{v}_1^H \mathbf{v}_1 = \frac{1}{\rho^{\mathrm{ul}}} \|\mathbf{v}_1\|^2$ vanish asymptotically. Since optimal MMSE combining (and also MMSE precoding) provides a higher SINR than the heuristic ZF scheme in (14), it also rejects the coherent interference while retaining an array gain that grows with $M$. To further explain the implications of Assumption 2, we provide the following three examples.

^5 Notice that, by applying Lemma 3 in Appendix A, we have $\frac{1}{M}[\hat{\mathbf{H}}^H \hat{\mathbf{H}}]_{nm} \asymp \frac{1}{M}\mathrm{tr}(\mathbf{R}_n \mathbf{Q}^{-1} \mathbf{R}_m)$. If the channel estimates are asymptotically linearly independent, then $\frac{1}{M}\hat{\mathbf{H}}^H \hat{\mathbf{H}}$ is invertible as $M \to \infty$ and thus $\|\mathbf{v}_1\|^2 = \frac{1}{M}\big[\big(\frac{1}{M}\hat{\mathbf{H}}^H \hat{\mathbf{H}}\big)^{-1}\big]_{11} \asymp 0$.
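The ZF construction for two channel estimates can be sketched with the closed-form inverse of the $2 \times 2$ Gram matrix. This is a minimal standard-library illustration, not the authors' implementation; the function names, the example vectors and the dependence threshold are assumptions.

```python
def inner(x, y):
    """Hermitian inner product x^H y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def zf_combiners(h1, h2):
    """Columns of H (H^H H)^{-1} for H = [h1 h2]."""
    g11, g12, g22 = inner(h1, h1).real, inner(h1, h2), inner(h2, h2).real
    det = g11 * g22 - abs(g12) ** 2   # vanishes when h1 and h2 are parallel
    if abs(det) <= 1e-12 * g11 * g22:
        raise ValueError("channel estimates are linearly dependent")
    # (H^H H)^{-1} = (1/det) [[g22, -g12], [-conj(g12), g11]]
    v1 = [(g22 * a - g12.conjugate() * b) / det for a, b in zip(h1, h2)]
    v2 = [(-g12 * a + g11 * b) / det for a, b in zip(h1, h2)]
    return v1, v2

# Hypothetical, non-parallel channel estimates with M = 3 antennas.
h1 = [1 + 1j, 2 + 0j, 0.5j]
h2 = [1 + 0j, 1j, -1 + 0j]
v1, v2 = zf_combiners(h1, h2)
print(abs(inner(v1, h1)), abs(inner(v1, h2)))  # unit signal gain, nulled interferer
```

Calling `zf_combiners` with parallel vectors raises an error, which mirrors the observation that the Gram matrix is only invertible for linearly independent estimates.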

[Figure: the channel estimates $\hat{\mathbf{h}}_1$ and $\hat{\mathbf{h}}_2$ and a combining vector $\mathbf{v}_1$ that is orthogonal only to $\hat{\mathbf{h}}_2$.]

Fig. 1: If the pilot-contaminated channel estimates are linearly independent (i.e., not parallel), there exists a combining vector $\mathbf{v}_1$ that rejects the pilot-contaminated interference from UE 2 in the uplink, while the desired signal remains due to $\mathbf{v}_1^H \hat{\mathbf{h}}_1 \neq 0$. Similarly, if $\mathbf{w}_1 = \mathbf{v}_1 / \sqrt{\mathbb{E}\{\|\mathbf{v}_1\|^2\}}$ is used as precoding vector, then no pilot-contaminated coherent interference is caused to UE 2 in the downlink.

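Example 1. Consider a two-user scenario with

$$\mathbf{R}_1 = \begin{bmatrix} 2\mathbf{I}_N & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{M-N} \end{bmatrix}, \qquad \mathbf{R}_2 = \mathbf{I}_M \tag{15}$$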

where the covariance matrices have full rank and are only different in the first $N$ dimensions. For any given $M$, we notice that the argument of (8) for UE $i = 1$ becomes

$$\inf_{\lambda_2} \frac{1}{M}\|\mathbf{R}_1 + \lambda_2 \mathbf{R}_2\|_F^2 = \inf_{\lambda_2} \frac{N(2 + \lambda_2)^2 + (M - N)(1 + \lambda_2)^2}{M} = \frac{(M - N)N}{M^2} \tag{16}$$

where the infimum is attained by $\lambda_2 = -(M + N)/M$. Note that (16) goes to zero as $M \to \infty$ if $N$ is constant, while it has the non-zero limit $(1 - \alpha)\alpha$ if $N = \alpha M$, for some $0 < \alpha < 1$. In the latter case, the matrices $\{\mathbf{R}_1, \mathbf{R}_2\}$ satisfy (8). Interestingly, although the covariance matrices are diagonal, they are still asymptotically linearly independent and the subspace in which they differ has rank $\min(N, M - N) = M \min(\alpha, 1 - \alpha)$, which is proportional to $M$.

Let us further exemplify the interference rejection by considering ZF combining, which provides lower SINR than MMSE combining, but gives more intuitive expressions. Assume for the sake of simplicity that the channel realizations are such that $\frac{1}{\sqrt{\rho^{\mathrm{tr}}}} \mathbf{Q}^{-1} \mathbf{Y}^p \boldsymbol{\phi}^\star = \mathbf{1}_M$, which gives $\hat{\mathbf{h}}_1 = \mathbf{R}_1 \mathbf{1}_M = [2 \cdot \mathbf{1}_N^T\ \ \mathbf{1}_{M-N}^T]^T$ and $\hat{\mathbf{h}}_2 = \mathbf{R}_2 \mathbf{1}_M = \mathbf{1}_M$. The ZF combining vectors are then given by

$$[\mathbf{v}_1\ \mathbf{v}_2] = \hat{\mathbf{H}} \big(\hat{\mathbf{H}}^H \hat{\mathbf{H}}\big)^{-1} = \begin{bmatrix} \frac{1}{N}\mathbf{1}_N & -\frac{1}{N}\mathbf{1}_N \\ -\frac{1}{M-N}\mathbf{1}_{M-N} & \frac{2}{M-N}\mathbf{1}_{M-N} \end{bmatrix} \tag{17}$$

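If we set $\rho^{\mathrm{ul}} = \rho^{\mathrm{tr}} = 1$ for simplicity, the instantaneous effective SINR in (5) for UE 1 becomes

$$\gamma_1^{\mathrm{ul}} = \frac{|\mathbf{v}_1^H \hat{\mathbf{h}}_1|^2}{|\mathbf{v}_1^H \hat{\mathbf{h}}_2|^2 + \sum_{k=1}^{2} \mathbf{v}_1^H (\mathbf{R}_k - \boldsymbol{\Phi}_k)\mathbf{v}_1 + \|\mathbf{v}_1\|^2} = \frac{1}{0 + \frac{7}{4N} + \frac{4}{3(M-N)} + \frac{M}{N(M-N)}} \tag{18}$$

where the coherent interference from UE 2 is zero. The remaining terms go asymptotically to zero if $N = \alpha M$, for $0 < \alpha < 1$, in which case $\gamma_1^{\mathrm{ul}}$ grows without bound, as expected from Theorem 1.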
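The closed-form expressions in this example can be verified with exact rational arithmetic. The sketch below exploits the fact that every matrix in Example 1 is diagonal with one value on the first $N$ antennas and another on the remaining $M - N$; it assumes $\mathbf{Q} = \mathbf{R}_1 + \mathbf{R}_2 + \frac{1}{\rho^{\mathrm{tr}}}\mathbf{I}_M$ and $\boldsymbol{\Phi}_k = \mathbf{R}_k \mathbf{Q}^{-1} \mathbf{R}_k$ (the two-user form of the multicell definitions) with $\rho^{\mathrm{ul}} = \rho^{\mathrm{tr}} = 1$. The function names are illustrative.

```python
from fractions import Fraction

def frob_metric(M, N, lam):
    """(1/M) ||R1 + lam R2||_F^2 for R1 = diag(2 I_N, I_{M-N}) and R2 = I_M."""
    return (N * (2 + lam) ** 2 + (M - N) * (1 + lam) ** 2) / Fraction(M)

def zf_sinr_ue1(M, N):
    """ZF SINR of UE 1, built term by term from the example's definitions."""
    # Each tuple holds the diagonal value on (first N antennas, last M-N antennas).
    r1 = (Fraction(2), Fraction(1))
    r2 = (Fraction(1), Fraction(1))
    q = tuple(a + b + 1 for a, b in zip(r1, r2))            # Q = R1 + R2 + I
    err = tuple(a - a * a / c + b - b * b / c               # sum_k (R_k - Phi_k)
                for a, b, c in zip(r1, r2, q))
    h1, h2 = r1, r2                                         # h_k = R_k 1_M
    v1 = (Fraction(1, N), Fraction(-1, M - N))              # ZF vector of UE 1
    count = (N, M - N)
    signal = sum(n * v * h for n, v, h in zip(count, v1, h1)) ** 2
    coherent = sum(n * v * h for n, v, h in zip(count, v1, h2)) ** 2
    est_error = sum(n * v * v * e for n, v, e in zip(count, v1, err))
    noise = sum(n * v * v for n, v in zip(count, v1))
    return signal / (coherent + est_error + noise)

M, N = 8, 4
lam_opt = Fraction(-(M + N), M)
print(frob_metric(M, N, lam_opt), Fraction((M - N) * N, M * M))
print(zf_sinr_ue1(M, N))
```

For $N = M/2$ the three denominator terms sum to $\frac{61}{6M}$, so the ZF SINR equals $\frac{6M}{61}$ and grows linearly with $M$.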

In the second example, we consider a scenario where Assumption 2 is not satisfied.

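Example 2. Channels with i.i.d. fading, where the covariance matrices are $\mathbf{R}_1 = \beta_1 \mathbf{I}_M$ and $\mathbf{R}_2 = \beta_2 \mathbf{I}_M$, are a notable case when the covariance matrices are not linearly independent. However, any such case is non-robust to perturbations of the matrix elements. Suppose we replace $\mathbf{R}_1$ with

$$\mathbf{R}_1 = \beta_1 \begin{bmatrix} \epsilon_1 & & 0 \\ & \ddots & \\ 0 & & \epsilon_M \end{bmatrix} \tag{19}$$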

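where $\epsilon_1, \ldots, \epsilon_M$ are i.i.d. positive random variables. This modeling is motivated by the measurement results in [29], which show that there are a few dB of large-scale fading variations over the antennas in a ULA. For UE $i = 1$, we have

$$\begin{aligned}
\liminf_M \inf_{\lambda_2} \frac{1}{M}\|\mathbf{R}_1 + \lambda_2 \mathbf{R}_2\|_F^2
&= \liminf_M \inf_{\lambda_2} \frac{1}{M}\sum_{m=1}^{M} (\beta_1 \epsilon_m + \lambda_2 \beta_2)^2 \\
&\overset{(a)}{=} \liminf_M\ \beta_1^2\, \frac{1}{M}\sum_{m=1}^{M} \Big(\epsilon_m - \frac{1}{M}\sum_{n=1}^{M} \epsilon_n\Big)^2 \\
&\overset{(b)}{=} \beta_1^2\, \mathbb{E}\{(\epsilon_m - \mathbb{E}\{\epsilon_m\})^2\}
\end{aligned} \tag{20}$$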

where (a) is obtained from the fact that $\lambda_2 = -\frac{\beta_1}{\beta_2}\frac{1}{M}\sum_{n=1}^{M} \epsilon_n$ minimizes $\frac{1}{M}\sum_{m=1}^{M}(\beta_1 \epsilon_m + \lambda_2 \beta_2)^2$ and (b) follows from the strong law of large numbers. Note that $\mathbb{E}\{(\epsilon_m - \mathbb{E}\{\epsilon_m\})^2\}$ in the last expression is the variance of $\epsilon_m$. Since every non-deterministic random variable has non-zero variance and $\beta_1 > 0$, we conclude that $\{\mathbf{R}_1, \mathbf{R}_2\}$ satisfy (8) and thus Assumption 2 holds.
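Step (a) can be reproduced numerically for a finite $M$. The sketch below draws the perturbations from a log-normal distribution with about 2 dB standard deviation, which is an assumption made only for illustration (the text requires just i.i.d. positive variables); the values of `beta1`, `beta2` and the function names are likewise hypothetical.

```python
import random
import statistics

def metric(eps, beta1, beta2, lam):
    """(1/M) sum_m (beta1*eps_m + lam*beta2)^2, the finite-M argument of (8)."""
    return sum((beta1 * e + lam * beta2) ** 2 for e in eps) / len(eps)

def best_lambda(eps, beta1, beta2):
    """Minimizer used in step (a): -(beta1/beta2) times the sample mean."""
    return -(beta1 / beta2) * statistics.fmean(eps)

random.seed(0)
beta1, beta2 = 1.5, 0.7          # hypothetical large-scale fading coefficients
M = 2000
# Assumption: log-normal fading variations over the array (2 dB standard deviation).
eps = [10 ** (random.gauss(0.0, 2.0) / 10) for _ in range(M)]

lam = best_lambda(eps, beta1, beta2)
value = metric(eps, beta1, beta2, lam)
# The minimum equals beta1^2 times the (population) sample variance of eps.
print(value, beta1 ** 2 * statistics.pvariance(eps))
```

Because the sample variance of non-constant perturbations is strictly positive, the minimum stays away from zero, which is the finite-$M$ counterpart of the conclusion drawn from (20).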

The key implication from Example 2 is that all cases where $\mathbf{R}_1$ and $\mathbf{R}_2$ are equal (up to a scaling factor) are non-robust to random perturbations and are thus anomalies. Since practical propagation environments are irregular and behave randomly (see the measurements reported in [13], [29]), linearly dependent covariance matrices do not appear in practice and Assumption 2 is generally satisfied. In other words, it is fair to say that the uplink and downlink SEs grow without bound as $M \to \infty$ in general, while the special cases when this does not occur are of no practical importance. We end this subsection with a comparison with related work and a remark regarding acquisition of channel statistics.

Example 3 (Comparison with [22], [23]). Consider a BS with two distributed arrays of $M' = M/2$ antennas that serve two UEs having the covariance matrices

$$\mathbf{R}_1 = \begin{bmatrix} b_{11}\mathbf{I}_{M'} & \mathbf{0} \\ \mathbf{0} & b_{12}\mathbf{I}_{M'} \end{bmatrix}, \qquad \mathbf{R}_2 = \begin{bmatrix} b_{21}\mathbf{I}_{M'} & \mathbf{0} \\ \mathbf{0} & b_{22}\mathbf{I}_{M'} \end{bmatrix}$$

with $b_{11}, b_{12}, b_{21}, b_{22} > 0$. These covariance matrices are (asymptotically) linearly independent if $b_{11} b_{22} \neq b_{12} b_{21}$, in which case the uplink and downlink SEs grow without bound with MMSE or ZF.
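To see where the condition on the coefficients comes from, the argument of (8) for UE 1 can be evaluated in closed form for these block-diagonal matrices. The derivation below is a short sketch that is not taken from the paper; it assumes the same form of the argument of (8) that is used in Examples 1 and 2 and minimizes a quadratic in $\lambda_2$.

```latex
\begin{aligned}
\frac{1}{M}\|\mathbf{R}_1 + \lambda_2 \mathbf{R}_2\|_F^2
&= \frac{M'}{M}\left[(b_{11} + \lambda_2 b_{21})^2 + (b_{12} + \lambda_2 b_{22})^2\right]
 = \frac{1}{2}\left[(b_{11} + \lambda_2 b_{21})^2 + (b_{12} + \lambda_2 b_{22})^2\right], \\
\lambda_2^{\min} &= -\frac{b_{11} b_{21} + b_{12} b_{22}}{b_{21}^2 + b_{22}^2}, \\
\inf_{\lambda_2} \frac{1}{M}\|\mathbf{R}_1 + \lambda_2 \mathbf{R}_2\|_F^2
&= \frac{1}{2}\,\frac{(b_{11} b_{22} - b_{12} b_{21})^2}{b_{21}^2 + b_{22}^2}.
\end{aligned}
```

The result does not depend on $M$, so it is strictly positive in the limit exactly when $b_{11} b_{22} \neq b_{12} b_{21}$.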

UE in each cell uses the same pilot. Following the notation from [5], the received signal $\mathbf{y}_j \in \mathbb{C}^M$ at BS $j$ is

$$\mathbf{y}_j = \sum_{l=1}^{L} \sum_{i=1}^{K} \sqrt{\rho}\,\mathbf{h}_{jli}\, x_{li} + \mathbf{n}_j \tag{28}$$

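where $\rho$ is the normalized transmit power, $x_{li}$ is the unit-power signal from UE $i$ in cell $l$, $\mathbf{h}_{jli} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{R}_{jli})$ is the channel from this UE to BS $j$, $\mathbf{R}_{jli} \in \mathbb{C}^{M \times M}$ is the channel covariance matrix, and $\mathbf{n}_j \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{I}_M)$ is the independent receiver noise at BS $j$. Using a total uplink pilot power of $\rho^{\mathrm{tr}}$ per UE and standard MMSE estimation techniques [5], BS $j$ obtains the estimate of $\mathbf{h}_{jli}$ as

$$\hat{\mathbf{h}}_{jli} = \mathbf{R}_{jli} \mathbf{Q}_{ji}^{-1} \left( \sum_{l'=1}^{L} \mathbf{h}_{jl'i} + \frac{1}{\sqrt{\rho^{\mathrm{tr}}}}\,\mathbf{n}_{ji} \right) \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \boldsymbol{\Phi}_{jli}) \tag{29}$$

where $\mathbf{n}_{ji} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{I}_M)$ is noise, $\mathbf{Q}_{ji} = \sum_{l'=1}^{L} \mathbf{R}_{jl'i} + \frac{1}{\rho^{\mathrm{tr}}}\mathbf{I}_M$, and $\boldsymbol{\Phi}_{jli} = \mathbf{R}_{jli} \mathbf{Q}_{ji}^{-1} \mathbf{R}_{jli}$. The estimation error $\tilde{\mathbf{h}}_{jli} = \mathbf{h}_{jli} - \hat{\mathbf{h}}_{jli} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{R}_{jli} - \boldsymbol{\Phi}_{jli})$ is independent of $\hat{\mathbf{h}}_{jli}$. However, the estimates $\hat{\mathbf{h}}_{j1i}, \ldots, \hat{\mathbf{h}}_{jLi}$ of the UEs with the same pilot are correlated as $\mathbb{E}\{\hat{\mathbf{h}}_{jni} \hat{\mathbf{h}}_{jmi}^H\} = \mathbf{R}_{jni} \mathbf{Q}_{ji}^{-1} \mathbf{R}_{jmi}$.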
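The correlation between the estimates of pilot-sharing UEs can be checked with a small Monte Carlo experiment. The sketch below looks at a single antenna (equivalently, one diagonal entry of diagonal covariance matrices), so that all matrices in (29) reduce to scalars; the function names, variances and trial count are illustrative assumptions.

```python
import math
import random

def cgauss(var):
    """One sample of a circularly symmetric complex Gaussian with variance var."""
    s = math.sqrt(var / 2.0)
    return complex(random.gauss(0.0, s), random.gauss(0.0, s))

def simulate(r, rho_tr, trials):
    """Empirical E{h_n_hat * conj(h_m_hat)} for pilot-sharing UEs on one antenna.

    r holds the channel variance of each pilot-sharing UE (one per cell).
    """
    q = sum(r) + 1.0 / rho_tr                 # scalar version of Q_ji
    acc = [[0j] * len(r) for _ in r]
    for _ in range(trials):
        # Processed pilot signal: sum of all pilot-sharing channels plus noise.
        obs = sum(cgauss(v) for v in r) + cgauss(1.0 / rho_tr)
        est = [v / q * obs for v in r]        # each estimate scales the same observation
        for n in range(len(r)):
            for m in range(len(r)):
                acc[n][m] += est[n] * est[m].conjugate()
    return [[a / trials for a in row] for row in acc], q

random.seed(2)
r = [1.0, 0.5]                                # hypothetical variances, L = 2 cells
corr, q = simulate(r, rho_tr=1.0, trials=20000)
for n in range(2):
    print([round(abs(corr[n][m]), 3) for m in range(2)],
          [round(r[n] * r[m] / q, 3) for m in range(2)])
```

All estimates are scaled copies of the same processed pilot signal, which is why pilot contamination makes them fully correlated on each antenna.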

A. Uplink Data Transmission

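We denote by $\mathbf{v}_{jk} \in \mathbb{C}^M$ the receive combining vector associated with UE $k$ in cell $j$. Using the same technique as in [5], [26], the uplink ergodic capacity is lower bounded by

$$\mathrm{SE}_{jk}^{\mathrm{ul}} = \left(1 - \frac{\tau_p}{\tau_c}\right) \mathbb{E}\left\{\log_2\!\left(1 + \gamma_{jk}^{\mathrm{ul}}\right)\right\} \quad \text{[bit/s/Hz]} \tag{30}$$

with the instantaneous effective SINR

$$\gamma_{jk}^{\mathrm{ul}} = \frac{|\mathbf{v}_{jk}^H \hat{\mathbf{h}}_{jjk}|^2}{\mathbb{E}\left\{ \sum\limits_{(l,i) \neq (j,k)} |\mathbf{v}_{jk}^H \mathbf{h}_{jli}|^2 + |\mathbf{v}_{jk}^H \tilde{\mathbf{h}}_{jjk}|^2 + \frac{1}{\rho^{\mathrm{ul}}}\mathbf{v}_{jk}^H \mathbf{v}_{jk} \,\Big|\, \hat{\mathbf{h}}_{(j)} \right\}} = \frac{|\mathbf{v}_{jk}^H \hat{\mathbf{h}}_{jjk}|^2}{\mathbf{v}_{jk}^H \Big( \sum\limits_{(l,i) \neq (j,k)} \hat{\mathbf{h}}_{jli} \hat{\mathbf{h}}_{jli}^H + \mathbf{Z}_j \Big) \mathbf{v}_{jk}} \tag{31}$$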

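where $\mathbb{E}\{\cdot \,|\, \hat{\mathbf{h}}_{(j)}\}$ denotes the conditional expectation given the MMSE channel estimates available at BS $j$ and $\mathbf{Z}_j = \sum_{l=1}^{L} \sum_{i=1}^{K} (\mathbf{R}_{jli} - \boldsymbol{\Phi}_{jli}) + \frac{1}{\rho^{\mathrm{ul}}}\mathbf{I}_M$. As shown in [8], [9], the instantaneous effective SINR in (31) for UE $k$ in cell $j$ is maximized by

$$\mathbf{v}_{jk} = \left( \sum_{l=1}^{L} \sum_{i=1}^{K} \hat{\mathbf{h}}_{jli} \hat{\mathbf{h}}_{jli}^H + \mathbf{Z}_j \right)^{-1} \hat{\mathbf{h}}_{jjk}. \tag{32}$$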

We refer to this “optimal” receive combining scheme as multicell MMSE (M-MMSE) combining. The “multicell” notion is used to differentiate it from the single-cell MMSE (S-MMSE) combining scheme [5]–[7], which is widely used in the literature and defined as

$$\bar{\mathbf{v}}_{jk} = \left( \sum_{i=1}^{K} \hat{\mathbf{h}}_{jji} \hat{\mathbf{h}}_{jji}^{H} + \bar{\mathbf{Z}}_j \right)^{-1} \hat{\mathbf{h}}_{jjk} \tag{33}$$

with $\bar{\mathbf{Z}}_j = \sum_{i=1}^{K} (\mathbf{R}_{jji} - \boldsymbol{\Phi}_{jji}) + \sum_{l=1, l \neq j}^{L} \sum_{i=1}^{K} \mathbf{R}_{jli} + \frac{1}{\rho^{\mathrm{ul}}} \mathbf{I}_M$. The main difference from (32) is that only channel estimates in the own cell are computed in S-MMSE, while $\hat{\mathbf{h}}_{jli} \hat{\mathbf{h}}_{jli}^{H} - \boldsymbol{\Phi}_{jli}$ is replaced with its average (i.e., zero) for all $l \neq j$. The computational complexity of S-MMSE is thus slightly lower than that of M-MMSE (see [9] for a detailed discussion). However, both schemes only utilize channel estimates that can be computed locally at the BS, and the pilot overhead is identical since the same pilots are used to estimate both intra-cell and inter-cell channels. The S-MMSE scheme coincides with M-MMSE when there is only one isolated cell, but it is generally different and does not suppress interference from interfering UEs in other cells. Plugging (32) into (31) yields

$$\gamma^{\mathrm{ul}}_{jk} = \hat{\mathbf{h}}_{jjk}^{H} \left( \sum_{(l,i) \neq (j,k)} \hat{\mathbf{h}}_{jli} \hat{\mathbf{h}}_{jli}^{H} + \mathbf{Z}_j \right)^{-1} \hat{\mathbf{h}}_{jjk}. \tag{34}$$
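The step from (31) and (32) to (34) can be illustrated numerically. The sketch below is only a sanity check under stated assumptions: the channel estimates are random vectors, `Z` is an arbitrary Hermitian positive definite stand-in for $\mathbf{Z}_j$, and the names `sinr`, `H_hat`, `gamma_31`, `gamma_34`, and `gamma_mr` are hypothetical. It evaluates the SINR expression (31) with the M-MMSE combiner and compares it with the closed form (34), and it also evaluates (31) for MR combining ($\mathbf{v}_{jk} = \hat{\mathbf{h}}_{jjk}$) as a reference.

```python
import numpy as np

rng = np.random.default_rng(1)
M, n_ue = 16, 6  # n_ue plays the role of L*K estimates seen at BS j (assumed)

def crandn(rng, *shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_hat = crandn(rng, M, n_ue)          # column 0 is the desired UE's estimate
B = crandn(rng, M, M)
Z = B @ B.conj().T / M + 0.1 * np.eye(M)  # stand-in for Z_j (Hermitian, PD)

h = H_hat[:, 0]
H_int = H_hat[:, 1:]                   # all estimates with (l,i) != (j,k)
A = H_int @ H_int.conj().T + Z         # interference-plus-noise matrix

def sinr(v):
    """Right-hand side of (31) for a given combining vector v."""
    return abs(v.conj() @ h) ** 2 / np.real(v.conj() @ A @ v)

# M-MMSE combining (32): inverse of the sum over ALL estimates plus Z.
v_mmmse = np.linalg.solve(H_hat @ H_hat.conj().T + Z, h)

gamma_31 = sinr(v_mmmse)
gamma_34 = np.real(h.conj() @ np.linalg.solve(A, h))  # closed form (34)
gamma_mr = sinr(h)                                    # MR combining
```

Since (31) is a generalized Rayleigh quotient, `gamma_31` should coincide with `gamma_34` up to rounding, and no other combining vector (here MR) can exceed that value.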

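We want to analyze $\gamma^{\mathrm{ul}}_{jk}$ when $M \to \infty$. To this end, we make the following two assumptions.

Assumption 4. As $M \to \infty$, $\forall j, l, i$: $\liminf_M \frac{1}{M} \mathrm{tr}(\mathbf{R}_{jli}) > 0$ and $\limsup_M \|\mathbf{R}_{jli}\|_2 < \infty$.

Assumption 5. For any UE $k$ in cell $j$ with $\boldsymbol{\lambda}_{jk} = [\lambda_{j1k}, \ldots, \lambda_{jLk}]^{T} \in \mathbb{R}^{L}$ and $l' = 1, \ldots, L$,

$$\liminf_{M} \inf_{\{\boldsymbol{\lambda}_{jk} : \lambda_{jl'k} = 1\}} \frac{1}{M} \left\| \sum_{l=1}^{L} \lambda_{jlk} \mathbf{R}_{jlk} \right\|_F^2 > 0. \tag{35}$$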

The following is the fourth main result of the paper:

Theorem 4. If M-MMSE combining is used, then under Assumptions 4 and 5 the SINR $\gamma^{\mathrm{ul}}_{jk}$ increases a.s. unboundedly as $M \to \infty$. Hence, $\mathrm{SE}^{\mathrm{ul}}_{jk}$ increases unboundedly as $M \to \infty$.

Proof: The proof is given in Appendix F.

This theorem proves the remarkable result that, under Assumptions 4 and 5, the uplink SE of a multicell Massive MIMO network increases without bound as $M \to \infty$, despite pilot contamination. This is in sharp contrast to the finite limit in the case of MR combining [1] or any other single-cell combining scheme [5]–[7], and it is due to the fact that M-MMSE rejects the coherent interference caused by pilot contamination when Assumptions 4 and 5 hold. Note that these are the natural multicell generalizations of Assumptions 1 and 2, respectively. In particular, the condition (35) says that the covariance matrices $\{\mathbf{R}_{jlk} : l = 1, \ldots, L\}$ of the channels from the pilot-sharing UEs to BS $j$ are asymptotically linearly independent, which implies the same condition for the estimated channels $\{\hat{\mathbf{h}}_{jlk} : l = 1, \ldots, L\}$. This condition is used in Appendix F to prove Theorem 4 in a fairly simple way.

However, we stress that Theorem 4 is valid also in a more general setting in which $\hat{\mathbf{h}}_{jjk}$ is asymptotically linearly independent of the estimates of all pilot-interfering UEs' channels, but some of the interfering channel estimates can be written as linear combinations of other interfering channels. Let $\mathcal{S}_{jk} \subseteq \{\hat{\mathbf{h}}_{jlk} : \forall l \neq j\}$ denote a subset of the estimated interfering channels that form a basis for all interfering channels. Under these circumstances, we only need to take the estimates in $\mathcal{S}_{jk}$ into account in the computation of the combining vector $\mathbf{v}_{jk}$ in (32) and the same result follows.

To gain further insights into this, we notice (as done for the two-user case in Section II-D) that one can find a receive combining vector that is orthogonal to the subspace spanned by $\mathcal{S}_{jk}$. This scheme exhibits an unbounded SE when $M \to \infty$ as it rejects the interference from all pilot-contaminating UEs (not only from those in $\mathcal{S}_{jk}$), while retaining an array gain that grows with $M$. We call this scheme multicell ZF (M-ZF) and define it as $\mathbf{v}_{jk} = \hat{\mathbf{H}}_{jk} \left( \hat{\mathbf{H}}_{jk}^{H} \hat{\mathbf{H}}_{jk} \right)^{-1} \mathbf{e}_1$, where $\mathbf{e}_1$ is the first column of $\mathbf{I}_{|\mathcal{S}_{jk}|+1}$ (with $|\mathcal{S}_{jk}|$ being the cardinality of $\mathcal{S}_{jk}$) and $\hat{\mathbf{H}}_{jk} \in \mathbb{C}^{M \times (|\mathcal{S}_{jk}|+1)}$ is the matrix with $\hat{\mathbf{h}}_{jjk}$ in the first column and the channel estimates in $\mathcal{S}_{jk}$ in the remaining columns. Since M-MMSE combining is the optimal scheme, it has to exhibit an unbounded SE if this is the case with M-ZF.
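The defining property of the M-ZF vector, namely unit response toward the desired estimate and zero response toward the estimates in $\mathcal{S}_{jk}$, can be seen in a few lines. This is an illustrative sketch with randomly generated estimates; the sizes `M` and `n_cols` and the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
M, n_cols = 32, 4   # n_cols = |S_jk| + 1 (assumed toy size)

# Columns: desired estimate first, then the interfering estimates in S_jk.
H_hat = (rng.standard_normal((M, n_cols))
         + 1j * rng.standard_normal((M, n_cols))) / np.sqrt(2)

e1 = np.zeros(n_cols)
e1[0] = 1.0

# M-ZF combining: v = H (H^H H)^{-1} e1, computed with a linear solve.
gram = H_hat.conj().T @ H_hat
v_zf = H_hat @ np.linalg.solve(gram, e1)

# Response of the combiner toward every column of H_hat.
response = H_hat.conj().T @ v_zf
```

Here `response` should equal `e1` up to rounding: the combiner passes the desired estimate with unit gain and nulls the interfering estimates.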

B. Downlink Data Transmission

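During downlink data transmission, the BS in cell $l$ transmits $\mathbf{x}_l = \sqrt{\rho^{\mathrm{dl}}} \sum_{i=1}^{K} \mathbf{w}_{li} \varsigma_{li}$, where $\varsigma_{li} \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ is the data signal intended for UE $i$ in the cell and $\rho^{\mathrm{dl}}$ is the normalized transmit power. This signal is assigned to a transmit precoding vector $\mathbf{w}_{li} \in \mathbb{C}^{M}$, which satisfies $\mathbb{E}\{\|\mathbf{w}_{li}\|^2\} = 1$, such that $\mathbb{E}\{\|\sqrt{\rho^{\mathrm{dl}}}\, \mathbf{w}_{li} \varsigma_{li}\|^2\} = \rho^{\mathrm{dl}}$ is the transmit power allocated to this UE. Using the same technique as in [5], [26], the downlink ergodic channel capacity of UE $k$ in cell $j$ can be lower bounded by $\mathrm{SE}^{\mathrm{dl}}_{jk} = \left(1 - \frac{\tau_p}{\tau_c}\right) \log_2(1 + \gamma^{\mathrm{dl}}_{jk})$ [bit/s/Hz] with

$$\gamma^{\mathrm{dl}}_{jk} = \frac{|\mathbb{E}\{\mathbf{h}_{jjk}^{H} \mathbf{w}_{jk}\}|^2}{\sum\limits_{l=1}^{L} \sum\limits_{i=1}^{K} \mathbb{E}\{|\mathbf{h}_{ljk}^{H} \mathbf{w}_{li}|^2\} - |\mathbb{E}\{\mathbf{h}_{jjk}^{H} \mathbf{w}_{jk}\}|^2 + \frac{1}{\rho^{\mathrm{dl}}}}.$$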

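Unlike $\gamma^{\mathrm{ul}}_{jk}$ in (31), which only depends on the own combining vector $\mathbf{v}_{jk}$, $\gamma^{\mathrm{dl}}_{jk}$ depends on all precoding vectors $\{\mathbf{w}_{li}\}$. The precoding should ideally be selected jointly across the cells, which makes precoding optimization difficult in practice. Motivated by the uplink-downlink duality [9], it is reasonable to select $\{\mathbf{w}_{li}\}$ based on the M-MMSE combining vectors $\{\mathbf{v}_{jk}\}$ given by (32). This leads to M-MMSE precoding

$$\mathbf{w}_{jk} = \sqrt{\vartheta_{jk}}\, \mathbf{v}_{jk} = \sqrt{\vartheta_{jk}} \left( \sum_{l=1}^{L} \sum_{i=1}^{K} \hat{\mathbf{h}}_{jli} \hat{\mathbf{h}}_{jli}^{H} + \mathbf{Z}_j \right)^{-1} \hat{\mathbf{h}}_{jjk} \tag{37}$$

with the normalization factor $\vartheta_{jk} = \left( \mathbb{E}\{\|\mathbf{v}_{jk}\|^2\} \right)^{-1}$. This is the fifth main result of the paper: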

Theorem 5. If M-MMSE precoding is used, then under Assumptions 4 and 5 the SINR γ jkdl grows unboundedly as M → ∞. Hence, SEdl jk grows unboundedly as M → ∞.

Proof: Despite being much more involved, the proof basically unfolds from the same arguments used for proving Theorem 2 and by exploiting the results of Appendix F for Theorem 4.

This theorem shows that an asymptotically unbounded downlink SE is achieved by all UEs in the network, despite the suboptimal assumptions of M-MMSE precoding, equal power allocation, and no estimation of the instantaneous realization of the precoded channels. The only important requirement is that the channel estimates to the desired UEs are asymptotically linearly independent from the channel estimates of pilot-contaminating UEs in other cells. Section IV demonstrates numerically that the DL SE grows without bound as $M \to \infty$.

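C. Approximate M-MMSE Combining and Precoding

In Section II-E, we have shown that the SE with the approximate M-MMSE scheme (that only utilizes the diagonals of the covariance matrices) grows without bound as $M \to \infty$, in a two-user scenario. This result can be generalized to a multicell Massive MIMO network. Due to space limitations, we concentrate on the uplink. In particular, we assume that the signal of UE $k$ in cell $j$ is detected by using the approximate M-MMSE combining vector

$$\mathbf{v}_{jk} = \left( \sum_{l=1}^{L} \sum_{i=1}^{K} \hat{\mathbf{h}}_{jli} \hat{\mathbf{h}}_{jli}^{H} + \mathbf{S}_j \right)^{-1} \hat{\mathbf{h}}_{jjk} \tag{38}$$

where $\mathbf{S}_j = \sum_{l=1}^{L} \sum_{i=1}^{K} \left( \mathbf{D}_{jli} - \mathbf{D}_{jli} \boldsymbol{\Lambda}_{ji}^{-1} \mathbf{D}_{jli} \right) + \frac{1}{\rho^{\mathrm{ul}}} \mathbf{I}_M$ is a diagonal matrix and the EW-MMSE estimate of $\mathbf{h}_{jli}$ is

$$\hat{\mathbf{h}}_{jli} = \mathbf{D}_{jli} \boldsymbol{\Lambda}_{ji}^{-1} \left( \sum_{l'=1}^{L} \mathbf{h}_{jl'i} + \frac{1}{\sqrt{\rho^{\mathrm{tr}}}} \mathbf{n}_{ji} \right)$$

where $\mathbf{n}_{ji} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{I}_M)$ is noise and $\mathbf{D}_{jli} \in \mathbb{R}^{M \times M}$ and $\boldsymbol{\Lambda}_{ji} \in \mathbb{R}^{M \times M}$ are diagonal with elements $\{[\mathbf{R}_{jli}]_{nn} : n = 1, \ldots, M\}$ and $\{\sum_{l'=1}^{L} [\mathbf{R}_{jl'i}]_{nn} + \frac{1}{\rho^{\mathrm{tr}}} : n = 1, \ldots, M\}$, respectively. Since $\mathbf{D}_{jli}$ and $\boldsymbol{\Lambda}_{ji}$ are diagonal, the computational complexity of EW-MMSE estimation is substantially lower than for MMSE estimation; see [30] for details. Notice that the combining scheme in (38) can be applied without knowing the full channel covariance matrices, as it depends only on the diagonal elements of $\{\mathbf{R}_{jli} : l = 1, \ldots, L\}$. This is because the elements of $\hat{\mathbf{h}}_{jli}$ are estimated separately, without exploiting the spatial channel correlation. By using the use-and-then-forget SE bound [26], the uplink ergodic capacity of UE $k$ in cell $j$ can be lower bounded by $\mathrm{SE}^{\mathrm{ul}}_{jk} = \left(1 - \frac{\tau_p}{\tau_c}\right) \log_2(1 + \gamma^{\mathrm{ul}}_{jk})$ [bit/s/Hz] with

$$\gamma^{\mathrm{ul}}_{jk} = \frac{|\mathbb{E}\{\mathbf{v}_{jk}^{H} \mathbf{h}_{jjk}\}|^2}{\sum\limits_{l=1}^{L} \sum\limits_{i=1}^{K} \mathbb{E}\{|\mathbf{v}_{jk}^{H} \mathbf{h}_{jli}|^2\} - |\mathbb{E}\{\mathbf{v}_{jk}^{H} \mathbf{h}_{jjk}\}|^2 + \frac{1}{\rho^{\mathrm{ul}}} \mathbb{E}\{\|\mathbf{v}_{jk}\|^2\}}.$$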
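Because $\mathbf{D}_{jli}$ and $\boldsymbol{\Lambda}_{ji}$ are diagonal, the EW-MMSE estimate reduces to a per-antenna scaling of the processed pilot signal. The sketch below illustrates this under assumed toy parameters; the covariance generator `toy_cov`, the fixed fading offsets, and all variable names are hypothetical and only serve to produce covariance matrices with non-constant diagonals.

```python
import numpy as np

def toy_cov(M, r, theta, f_db):
    """Assumed toy covariance with non-constant diagonal (f_db in dB)."""
    idx = np.arange(M)
    d = idx[None, :] - idx[:, None]
    gain = 10.0 ** ((f_db[:, None] + f_db[None, :]) / 20.0)
    return (r ** np.abs(d)) * np.exp(1j * d * theta) * gain

rng = np.random.default_rng(4)
M, rho_tr = 6, 10.0
R = [toy_cov(M, 0.5, th, 4.0 * rng.standard_normal(M)) for th in (0.3, 1.1)]

d = [np.real(np.diag(Rl)) for Rl in R]   # diagonals of D_l
lam = sum(d) + 1.0 / rho_tr              # diagonal of Lambda

# Any processed pilot vector y (sum of channels plus noise in the paper).
y = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Element-wise form: each antenna is scaled independently, O(M) operations.
h_hat_elementwise = d[0] / lam * y
# Matrix form D Lambda^{-1} y, for comparison only.
h_hat_matrix = np.diag(d[0]) @ np.linalg.inv(np.diag(lam)) @ y

# Covariance of the estimate, D Lambda^{-1} Q Lambda^{-1} D, where
# Q = sum of the covariances + I/rho_tr has the same diagonal as Lambda.
Q = sum(R) + np.eye(M) / rho_tr
Sigma = np.diag(d[0] / lam) @ Q @ np.diag(d[0] / lam)
sigma_diag_expected = d[0] ** 2 / lam
```

The two forms of the estimate agree, and the diagonal of the estimate covariance depends only on the diagonals of the covariance matrices, which is why full covariance knowledge is not needed for this scheme.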

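We now want to understand how $\gamma^{\mathrm{ul}}_{jk}$ behaves when $M \to \infty$ under the following assumption, which is the extension of Assumption 5 to the case where only the diagonals of covariance matrices are used for channel estimation and receive combining:

Assumption 6. For any UE $k$ in cell $j$ with $\boldsymbol{\lambda}_{jk} = [\lambda_{j1k}, \ldots, \lambda_{jLk}]^{T} \in \mathbb{R}^{L}$ and $l' = 1, \ldots, L$,

$$\liminf_{M} \inf_{\{\boldsymbol{\lambda}_{jk} : \lambda_{jl'k} = 1\}} \frac{1}{M} \left\| \sum_{l=1}^{L} \lambda_{jlk} \mathbf{D}_{jlk} \right\|_F^2 > 0.$$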

The following is the last main result of the paper:

[Figure 4. Horizontal axis: number of antennas ($M$), logarithmic scale from $10^1$ to $10^3$. Vertical axis: spectral efficiency [bit/s/Hz/user]. Curves: M-MMSE, S-MMSE, MRC, M-ZF, time splitting.]

Fig. 4: Uplink SE as a function of $M$, for covariance matrices based on the exponential correlation model ($r = 0.5$).

bound. The instantaneous effective SINR grows linearly with $M$, which is in line with Theorem 4, as seen from the fact that the SE grows linearly when the horizontal scale is logarithmic. M-ZF performs poorly because the channel estimates are so similar that full interference suppression removes most of the desired signal. In contrast, M-MMSE finds a non-trivial tradeoff between interference suppression and coherent combining of the desired signal, leading to superior SE. The reference curve “time splitting” considers the case when the 4 cells are active in different coherence blocks, to remove pilot contamination. MMSE combining is used and the SE grows without bound, but at a slower pace than with M-MMSE, due to the extra pre-log factor of $1/4$. Hence, even for a small system with $L = 4$, it is inefficient to avoid pilot contamination by time splitting.

Next, we consider the uncorrelated Rayleigh fading model in (42) with independent large-scale fading variations over the array. The uplink SE with $M = 200$ antennas and varying standard deviation $\sigma$ from 0 to 5 is shown in Fig. 5(a). M-MMSE provides no benefit over S-MMSE or MR in the special case of $\sigma = 0$, where the covariance matrices are linearly dependent (i.e., scaled identity matrices). This is a special case that has received massive attention in academic literature, mainly because it simplifies the mathematical analysis. However, M-MMSE provides substantial performance gains over S-MMSE and MR as soon as we depart from the scaled-identity model by adding small variations in the large-scale fading over the array, which make the covariance matrices linearly independent. This is in line with what we demonstrated in Example 2. As the variations increase, the SE with M-ZF improves particularly fast and approaches the SE with M-MMSE. M-ZF will never be the better scheme since M-MMSE is optimal.

The motivation behind this simulation is the measurement results reported in [29], which show large-scale variations of around 4 dB over a massive MIMO array; this corresponds to $\sigma \approx 4$ in our setup. Fig. 5(b) shows the received power (normalized by the noise power) after receive combining for an arbitrary UE when $\sigma = 4$. It is divided into the desired signal power, the interference from UEs using the same pilot, and the interference from UEs using a different pilot. The figure shows that MR and S-MMSE suffer from strong interference from the UEs that use the same pilot, since these schemes are unable to mitigate the coherent interference caused by pilot contamination. In contrast, M-MMSE and M-ZF mitigate all types of interference and receive roughly the same amount of interference from UEs with the same or different pilots. Note that the price to pay for the interference rejection is a reduction in desired signal power when using M-MMSE and M-ZF.

[Figure 5. (a) Uplink SE. Horizontal axis: standard deviation of fading variations over the array, from 0 to 5. Vertical axis: spectral efficiency [bit/s/Hz/user]. Curves: M-MMSE, M-ZF, S-MMSE, MR. (b) Received signal power after receive combining. Categories: desired signal, interference from same pilot, interference from different pilot. Vertical axis: received power over noise power [dB]. Schemes: MR, S-MMSE, M-MMSE, M-ZF.]

Fig. 5: Uplink with covariance matrices modeled by (42) for $M = 200$ and $K = 2$. (a) The SE as a function of the standard deviation $\sigma$ of the large-scale fading variations. (b) The received power after receive combining with $\sigma = 4$ is separated into desired signal power and interference from UEs with the same or different pilot than the desired UE.

B. Downlink

The setup in Fig. 3 is also used in the downlink, wherein we set $\rho^{\mathrm{dl}} = \rho^{\mathrm{ul}}$ to get the same SNRs as in the uplink. We consider a setup with both spatial channel correlation and large-scale fading variations over the array, such that the EW-MMSE estimator is suboptimal but Assumption 6 is satisfied. More precisely, we consider a combination of the exponential correlation model and (42): $[\mathbf{R}]_{m,n} = \beta r^{|n-m|} e^{\imath (n-m)\theta} 10^{(f_m + f_n)/20}$, where $\theta$ is the AoA, $r = 0.5$ is used as correlation factor, and $f_1, \ldots, f_M \sim \mathcal{N}(0, \sigma^2)$ give independent large-scale fading variations over the array with $\sigma = 4$. The downlink SE is shown in Fig. 6 as a function of $M$, where Fig. 6(a) shows results with the MMSE estimator that uses the full channel covariance matrices and Fig. 6(b) shows results with the EW-MMSE estimator that only uses the diagonals of the covariance matrices. When using the EW-MMSE estimator, we consider the approximate M-MMSE scheme in (38) and a corresponding approximation of S-MMSE, while M-ZF and MR are as before.

[Figure 6. Two panels: (a) MMSE estimation, (b) EW-MMSE estimation. Horizontal axis: number of antennas ($M$), logarithmic scale from $10^1$ to $10^3$. Vertical axis: spectral efficiency [bit/s/Hz/user]. Curves in (a): M-MMSE, M-ZF, S-MMSE, MRT. Curves in (b): approximate M-MMSE, M-ZF, approximate S-MMSE, MRT.]

Fig. 6: Downlink SE as a function of $M$ for $K = 2$, when using either the MMSE estimator (with full covariance knowledge) or the EW-MMSE estimator (with known diagonals of the covariance matrices). The exponential correlation model with $r = 0.5$ is used, but with large-scale fading variations over the array with $\sigma = 4$.

The results in Fig. 6(a) with the MMSE estimator are similar to the uplink in Fig. 5(a): M-MMSE and M-ZF provide SEs that grow without bound, while the SEs with S-MMSE and MR converge to finite limits. In contrast to the uplink, M-MMSE and M-ZF precoding are both suboptimal in the downlink, but they can be shown to be asymptotically equal.^7 Interestingly, the same behaviors are observed in Fig. 6(b) when using the EW-MMSE estimator, which is a suboptimal estimator that neglects the off-diagonal elements of the covariance matrices. This result is in line with Theorem 6. There is a small SE loss (2%–4% for M-MMSE) compared to Fig. 6(a), but this is a minor price to pay for the greatly simplified acquisition of covariance information (estimating the entire diagonal is as simple as estimating a single parameter [30], [32]).

We now increase the number of UEs per cell to $K = 10$, which leads to more interference but the same pilot contamination per UE. The UEs are uniformly and independently distributed in the cell-edge area, which is the shaded area in Fig. 3. The channel model is the same as in the previous figure. The downlink SE per UE is shown in Fig. 7 when using either MMSE or EW-MMSE estimation. The results resemble the ones for $K = 2$, but the curves are basically shifted to the right due to the additional interference. M-MMSE and M-ZF provide SEs that grow without bound, while the SEs with S-MMSE and MR saturate, but more antennas are needed before reaching saturation.

[Figure 7. Two panels: (a) MMSE estimation, (b) EW-MMSE estimation. Horizontal axis: number of antennas ($M$), logarithmic scale from $10^1$ to $10^3$. Vertical axis: spectral efficiency [bit/s/Hz/user]. Curves in (a): M-MMSE, M-ZF, S-MMSE, MRT. Curves in (b): approximate M-MMSE, M-ZF, approximate S-MMSE, MRT.]

Fig. 7: Downlink SE as a function of $M$ for $K = 10$ UEs that are uniformly distributed in the shaded cell edge area. The setup and covariance model are otherwise the same as in Fig. 6.

(^7) For M-MMSE precoding in (37), $\mathbf{Z}_j$ has bounded spectral norm while $\sum_{l} \sum_{i} \hat{\mathbf{h}}_{jli} \hat{\mathbf{h}}_{jli}^{H}$ has $LK$ eigenvalues that grow unboundedly as $M \to \infty$. As the impact of $\mathbf{Z}_j$ vanishes, the approach in [28] can be used to prove that M-MMSE approaches M-ZF asymptotically.
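The covariance model used in this subsection, $[\mathbf{R}]_{m,n} = \beta r^{|n-m|} e^{\imath (n-m)\theta} 10^{(f_m + f_n)/20}$, can be generated as in the following sketch. It is not the authors' simulation code: the function names, the values of $\beta$ and $\theta$, and the choice of drawing an independent set of $f_1, \ldots, f_M$ for each UE are assumptions. The sketch also performs a finite-$M$ check related to Assumption 6, namely whether the diagonals of the covariance matrices are linearly independent.

```python
import numpy as np

def cov_model(M, beta, r, theta, f):
    """[R]_{m,n} = beta * r^{|n-m|} * exp(1j*(n-m)*theta) * 10^{(f_m+f_n)/20}."""
    idx = np.arange(M)
    d = idx[None, :] - idx[:, None]                      # n - m
    gain = 10.0 ** ((f[:, None] + f[None, :]) / 20.0)    # fading variations
    return beta * (r ** np.abs(d)) * np.exp(1j * d * theta) * gain

def diag_rank(sigma, M=32, seed=0):
    """Rank of the stacked covariance diagonals of three pilot-sharing UEs."""
    rng = np.random.default_rng(seed)
    betas, thetas = (1.0, 0.5, 0.2), (0.3, 1.1, 2.0)     # assumed values
    diags = []
    for beta, theta in zip(betas, thetas):
        f = sigma * rng.standard_normal(M)               # f_m ~ N(0, sigma^2)
        R = cov_model(M, beta, 0.5, theta, f)
        diags.append(np.real(np.diag(R)))
    return np.linalg.matrix_rank(np.array(diags))

rank_flat = diag_rank(sigma=0.0)    # no variations: diagonals are scaled copies
rank_varied = diag_rank(sigma=4.0)  # variations make the diagonals independent
```

Without fading variations the diagonals are scaled versions of each other (rank one), so the EW-MMSE-based scheme cannot separate the pilot-sharing UEs. With $\sigma = 4$ the stacked diagonals have full row rank in this toy instance.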

V. CONCLUSIONS AND PRACTICAL IMPLICATIONS

We proved that the capacity of Massive MIMO systems increases without bound as $M \to \infty$ in the presence of pilot contamination, despite the previous results that pointed toward the existence of a finite limit. This was achieved by showing that the conventional lower bounds on the capacity increase without bound when using M-MMSE precoding/combining. These schemes exploit the fact that the MMSE channel estimates of UEs that use the same pilot are linearly independent, due to their generally linearly independent covariance matrices. For our results to hold, the covariance matrices can have full rank and minor eigenvalue variations are sufficient. There are special cases where the channel covariance matrices are linearly dependent, but these are not robust to minor perturbations of the covariance matrices. Hence, they are anomalies that will never appear in practice or be drawn from a random distribution, although they have frequently been studied in the academic literature. Since the SE of MR (also known as

APPENDIX C – PROOF OF COROLLARY 1 IN APPENDIX B

Consider $i = 1$ and notice that the argument on the left-hand side of (48) is lower bounded as

$$\frac{\frac{1}{M} \|\mathbf{R}_1 + \lambda_2 \mathbf{R}_2\|_F^2}{\left( \frac{1}{\rho^{\mathrm{tr}}} + \|\mathbf{R}_1 + \mathbf{R}_2\|_2 \right) \left( \frac{1}{\rho^{\mathrm{ul}}} + \left\| \sum_{k=1}^{2} (\mathbf{R}_k - \boldsymbol{\Phi}_k) \right\|_2 \right)} \tag{51}$$

by applying Lemma 4 twice. The denominator of (51) is bounded from above due to Assumption 1 and independent of $\lambda_2$. This proves that Assumption 2 is sufficient for (48) to hold for $i = 1$. The result for $i = 2$ follows by interchanging the indices in the proof.

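APPENDIX D – PROOF OF THEOREM 2

We begin by plugging (13) into (12) to obtain

$$\gamma^{\mathrm{dl}}_1 = \frac{|\mathbb{E}\{\mathbf{h}_1^{H} \mathbf{v}_1\}|^2}{\frac{\vartheta_2}{\vartheta_1} \mathbb{E}\{|\mathbf{h}_1^{H} \mathbf{v}_2|^2\} + \mathbb{V}\{\mathbf{h}_1^{H} \mathbf{v}_1\} + \frac{1}{\rho^{\mathrm{dl}} \vartheta_1}}. \tag{52}$$

We need to characterize all the terms in (52) and begin with $\mathbb{E}\{\mathbf{h}_1^{H} \mathbf{v}_1\}$. Notice that $\mathbb{E}\{\mathbf{h}_1^{H} \mathbf{v}_1\} = \mathbb{E}\{\hat{\mathbf{h}}_1^{H} \mathbf{v}_1\}$ since $\mathbf{v}_1$ is independent of the zero-mean error $\tilde{\mathbf{h}}_1$. Then, we can express $\hat{\mathbf{h}}_1^{H} \mathbf{v}_1$ as

$$\hat{\mathbf{h}}_1^{H} \mathbf{v}_1 = \frac{\hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{Z} \right)^{-1} \hat{\mathbf{h}}_1}{1 + \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{Z} \right)^{-1} \hat{\mathbf{h}}_1} = \frac{\gamma^{\mathrm{ul}}_1}{1 + \gamma^{\mathrm{ul}}_1} \tag{53}$$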

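by first applying Lemma 5 and then identifying $\gamma^{\mathrm{ul}}_1$ in (7) in the numerator and denominator. Theorem 1 proves that $\frac{\gamma^{\mathrm{ul}}_1}{M} \asymp \delta_1$ and applying this result to (53) yields $\hat{\mathbf{h}}_1^{H} \mathbf{v}_1 \asymp 1$. By the dominated convergence theorem and the continuous mapping theorem [35], we then have that $|\mathbb{E}\{\mathbf{h}_1^{H} \mathbf{v}_1\}|^2 \asymp 1$.

Consider now the noise term $\frac{1}{\rho^{\mathrm{dl}} \vartheta_1} = \frac{\mathbb{E}\{\|\mathbf{v}_1\|^2\}}{\rho^{\mathrm{dl}}}$ where $\vartheta_1 = \left( \mathbb{E}\{\|\mathbf{v}_1\|^2\} \right)^{-1}$. By applying Lemma 5 twice, we may rewrite $\|\mathbf{v}_1\|^2$ as

$$\|\mathbf{v}_1\|^2 = \frac{\hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{Z} \right)^{-2} \hat{\mathbf{h}}_1}{\left( 1 + \gamma^{\mathrm{ul}}_1 \right)^2} = \frac{1}{M} \frac{\frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{Z} \right)^{-2} \hat{\mathbf{h}}_1}{\left( \frac{1}{M} + \frac{1}{M} \gamma^{\mathrm{ul}}_1 \right)^2}. \tag{54}$$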
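The two rewritings in (53) and (54) follow from a rank-one update of the matrix inverse and can be confirmed numerically. The sketch below assumes random vectors in place of the channel estimates and an arbitrary Hermitian positive definite matrix in place of $\mathbf{Z}$; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
M = 8

def crandn(rng, *shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

h1, h2 = crandn(rng, M), crandn(rng, M)      # stand-ins for the two estimates
B = crandn(rng, M, M)
Z = B @ B.conj().T / M + 0.5 * np.eye(M)     # stand-in for Z (Hermitian, PD)

A = np.outer(h2, h2.conj()) + Z              # h2 h2^H + Z
# Two-user M-MMSE vector: v1 = (h1 h1^H + h2 h2^H + Z)^{-1} h1.
v1 = np.linalg.solve(np.outer(h1, h1.conj()) + A, h1)

A_inv_h1 = np.linalg.solve(A, h1)
gamma = np.real(h1.conj() @ A_inv_h1)        # h1^H A^{-1} h1, as in (7)

lhs_53 = h1.conj() @ v1
rhs_53 = gamma / (1.0 + gamma)

lhs_54 = np.linalg.norm(v1) ** 2
# h1^H A^{-2} h1 equals ||A^{-1} h1||^2 because A is Hermitian.
rhs_54 = np.real(A_inv_h1.conj() @ A_inv_h1) / (1.0 + gamma) ** 2
```

Both pairs agree to numerical precision. Note that `rhs_53` always lies in $[0, 1)$, which is the boundedness exploited when the dominated convergence theorem is applied.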

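Let $\mathrm{Re}(\cdot)$ denote the real-valued part of a scalar. The numerator in (54) can be expressed as

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{Z}^{-2} \hat{\mathbf{h}}_1 - 2\, \mathrm{Re}\left( \frac{\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_2 \, \frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{Z}^{-2} \hat{\mathbf{h}}_1}{\frac{1}{M} + \frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_2} \right) + \frac{\frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{Z}^{-2} \hat{\mathbf{h}}_2 \, \left| \frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_1 \right|^2}{\left( \frac{1}{M} + \frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_2 \right)^2}$$

by applying again Lemma 5 twice. Under Assumption 1 and by applying Lemma 3,

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{Z}^{-2} \hat{\mathbf{h}}_1 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Phi}_1 \mathbf{Z}^{-2}) \triangleq \beta'_{11} \tag{56}$$

$$\frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{Z}^{-2} \hat{\mathbf{h}}_2 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Phi}_2 \mathbf{Z}^{-2}) \triangleq \beta'_{22} \tag{57}$$

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{Z}^{-2} \hat{\mathbf{h}}_2 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Upsilon}_{12} \mathbf{Z}^{-2}) \triangleq \beta'_{12} \tag{58}$$

where $\beta'_{11}$, $\beta'_{22}$, and $\beta'_{12}$ are non-negative real-valued scalars, since the trace of a product of positive semi-definite matrices is always non-negative. Therefore, we obtain

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{Z} \right)^{-2} \hat{\mathbf{h}}_1 \asymp \beta'_{11} - 2 \frac{\beta_{12} \beta'_{12}}{\beta_{22}} + \frac{\beta_{12}^2 \beta'_{22}}{(\beta_{22})^2} \triangleq \delta'_1. \tag{59}$$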

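Plugging (59) into (54) and using $\frac{\gamma^{\mathrm{ul}}_1}{M} \asymp \delta_1$ yields $M \|\mathbf{v}_1\|^2 \asymp \frac{\delta'_1}{\delta_1^2}$ such that

$$\frac{1}{\rho^{\mathrm{dl}} \vartheta_1} = \frac{\mathbb{E}\{\|\mathbf{v}_1\|^2\}}{\rho^{\mathrm{dl}}} \asymp \frac{1}{M \rho^{\mathrm{dl}}} \frac{\delta'_1}{\delta_1^2}.$$

Consider now the two terms $\mathbb{V}\{\mathbf{h}_1^{H} \mathbf{v}_1\}$ and $\frac{\vartheta_2}{\vartheta_1} \mathbb{E}\{|\mathbf{h}_1^{H} \mathbf{v}_2|^2\}$. Similar to [5, Eq. (47)], we can upper bound $\mathbb{V}\{\mathbf{h}_1^{H} \mathbf{v}_1\}$ as $\mathbb{V}\{\mathbf{h}_1^{H} \mathbf{v}_1\} \leq 2\, \mathbb{E}\{|\hat{\mathbf{h}}_1^{H} \mathbf{v}_1 - \mathbb{E}\{\hat{\mathbf{h}}_1^{H} \mathbf{v}_1\}|\} + \mathbb{E}\{|\tilde{\mathbf{h}}_1^{H} \mathbf{v}_1|^2\}$. Notice that (by using $\hat{\mathbf{h}}_1^{H} \mathbf{v}_1 \asymp 1$ and the dominated convergence theorem) $\mathbb{E}\{|\hat{\mathbf{h}}_1^{H} \mathbf{v}_1 - \mathbb{E}\{\hat{\mathbf{h}}_1^{H} \mathbf{v}_1\}|\} \asymp 0$ and

$$\mathbb{E}\{|\tilde{\mathbf{h}}_1^{H} \mathbf{v}_1|^2\} = \mathbb{E}\{\mathbf{v}_1^{H} (\mathbf{R}_1 - \boldsymbol{\Phi}_1) \mathbf{v}_1\} \overset{(a)}{\leq} \|\mathbf{R}_1 - \boldsymbol{\Phi}_1\|_2 \, \mathbb{E}\{\|\mathbf{v}_1\|^2\} \overset{(b)}{\asymp} 0 \tag{61}$$

where (a) and (b) follow from Lemma 4 and $\mathbb{E}\{\|\mathbf{v}_1\|^2\} \asymp 0$ (since, as shown above, $\|\mathbf{v}_1\|^2 \asymp \frac{1}{M} \frac{\delta'_1}{\delta_1^2} \asymp 0$), respectively. Therefore, we have that $\mathbb{V}\{\mathbf{h}_1^{H} \mathbf{v}_1\} \asymp 0$. Finally, we consider $\frac{\vartheta_2}{\vartheta_1} \mathbb{E}\{|\mathbf{h}_1^{H} \mathbf{v}_2|^2\}$. By using (45), (46), and $\liminf_M \beta_{11} > 0$ (as follows from Assumption 1), we have that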

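$$\mathbf{h}_1^{H} \mathbf{v}_2 \overset{(a)}{=} \frac{\mathbf{h}_1^{H} \left( \hat{\mathbf{h}}_1 \hat{\mathbf{h}}_1^{H} + \mathbf{Z} \right)^{-1} \hat{\mathbf{h}}_2}{1 + \hat{\mathbf{h}}_2^{H} \left( \hat{\mathbf{h}}_1 \hat{\mathbf{h}}_1^{H} + \mathbf{Z} \right)^{-1} \hat{\mathbf{h}}_2} \overset{(b)}{=} \frac{\frac{1}{M} \mathbf{h}_1^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_2 - \frac{\frac{1}{M} \mathbf{h}_1^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_1 \, \frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_2}{\frac{1}{M} + \frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{Z}^{-1} \hat{\mathbf{h}}_1}}{\frac{1}{M} + \frac{1}{M} \gamma^{\mathrm{ul}}_2} \overset{(c)}{\asymp} \frac{\beta_{12} - \frac{\beta_{11} \beta_{12}}{\beta_{11}}}{\delta_2} = 0$$

where (a) and (b) follow from Lemma 5 after identifying^10 $\hat{\mathbf{h}}_2^{H} \left( \hat{\mathbf{h}}_1 \hat{\mathbf{h}}_1^{H} + \mathbf{Z} \right)^{-1} \hat{\mathbf{h}}_2$ as $\gamma^{\mathrm{ul}}_2$ (by also dividing and multiplying by $M$), and (c) follows by using (44), (46) and the fact that

$$\frac{\gamma^{\mathrm{ul}}_2}{M} \asymp \delta_2 \triangleq \beta_{22} - \frac{\beta_{21}^2}{\beta_{11}} \tag{63}$$

with $\liminf_M \delta_2 > 0$ (which follows from the proof of Theorem 1 by interchanging UE indices). By applying Lemma 3, this implies $\mathbb{E}\{|\mathbf{h}_1^{H} \mathbf{v}_2|^2\} \asymp 0$. Observe now that $\frac{\vartheta_2}{\vartheta_1} \asymp \frac{\delta'_1}{\delta_1^2} \frac{\delta_2^2}{\delta'_2}$ where $\delta'_2$ is obtained from $\delta'_1$ by interchanging UE indices. Since all the quantities in $\delta'_1$ are uniformly bounded (due to Assumption 1), $\liminf_M \delta_1 > 0$ (as proved in Appendix B) and $\liminf_M \delta_2 < \infty$ (since from (63) $\delta_2 < \beta_{22}$ and $\liminf_M \beta_{22} < \infty$ due to Assumption 1), we eventually have that $\frac{\vartheta_2}{\vartheta_1} \mathbb{E}\{|\mathbf{h}_1^{H} \mathbf{v}_2|^2\} \asymp 0$. Combining all the above results yields

$$\frac{\gamma^{\mathrm{dl}}_1}{M} \asymp \rho^{\mathrm{dl}} \frac{\delta_1^2}{\delta'_1}.$$

(^10) The uplink SINR $\gamma^{\mathrm{ul}}_2$ of UE 2 is obtained from (7) by interchanging UE indices.

Since all the quantities in $\delta'_1$ are uniformly bounded and $\liminf_M \delta_1 > 0$, it follows that $\gamma^{\mathrm{dl}}_1$ grows unboundedly as $M \to \infty$. This implies that also $\mathrm{SE}^{\mathrm{dl}}_1$ grows unboundedly as $M \to \infty$, which can be proved by the same arguments as in the last paragraph of Appendix B.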

APPENDIX E – PROOF OF THEOREM 3

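The EW-MMSE estimate $\hat{\mathbf{h}}_k$ and the estimation error $\tilde{\mathbf{h}}_k = \mathbf{h}_k - \hat{\mathbf{h}}_k$ are random vectors distributed as $\hat{\mathbf{h}}_k \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \boldsymbol{\Sigma}_k)$ and $\tilde{\mathbf{h}}_k \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \tilde{\boldsymbol{\Sigma}}_k)$ with $\boldsymbol{\Sigma}_k = \mathbf{D}_k \boldsymbol{\Lambda}^{-1} \mathbf{Q} \boldsymbol{\Lambda}^{-1} \mathbf{D}_k$ and $\tilde{\boldsymbol{\Sigma}}_k = \mathbf{R}_k - \mathbf{D}_k \boldsymbol{\Lambda}^{-1} \mathbf{R}_k - \mathbf{R}_k \boldsymbol{\Lambda}^{-1} \mathbf{D}_k + \boldsymbol{\Sigma}_k$. Unlike with MMSE estimation, the vectors $\hat{\mathbf{h}}_k$ and $\tilde{\mathbf{h}}_k$ are correlated with $\mathbb{E}\{\hat{\mathbf{h}}_k \tilde{\mathbf{h}}_k^{H}\} = \mathbb{E}\{\hat{\mathbf{h}}_k (\mathbf{h}_k - \hat{\mathbf{h}}_k)^{H}\} = \mathbf{D}_k \boldsymbol{\Lambda}^{-1} \mathbf{R}_k - \boldsymbol{\Sigma}_k$. Hence, $\mathbf{v}_1$ and $\tilde{\mathbf{h}}_1$ are also correlated. For later convenience, we also notice that $\mathbb{E}\{\mathbf{h}_1 \hat{\mathbf{h}}_1^{H}\} = \mathbf{R}_1 \boldsymbol{\Lambda}^{-1} \mathbf{D}_1$, $\mathbb{E}\{\mathbf{h}_1 \hat{\mathbf{h}}_2^{H}\} = \mathbf{R}_1 \boldsymbol{\Lambda}^{-1} \mathbf{D}_2$, and $\mathbb{E}\{\hat{\mathbf{h}}_2 \hat{\mathbf{h}}_1^{H}\} = \mathbf{D}_2 \boldsymbol{\Lambda}^{-1} \mathbf{Q} \boldsymbol{\Lambda}^{-1} \mathbf{D}_1 = \boldsymbol{\Theta}_{21}$. We need to characterize all the terms in (25) and begin with $\mathbb{E}\{\mathbf{v}_1^{H} \mathbf{h}_1\}$. By applying Lemma 5 and by dividing and multiplying by $M$, we can express $\mathbf{v}_1^{H} \mathbf{h}_1$ as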

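$$\mathbf{v}_1^{H} \mathbf{h}_1 = \frac{\frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{S} \right)^{-1} \mathbf{h}_1}{\frac{1}{M} + \frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{S} \right)^{-1} \hat{\mathbf{h}}_1} = \frac{\frac{1}{M} \tilde{\mu}^{\mathrm{ul}}_1}{\frac{1}{M} + \frac{1}{M} \mu^{\mathrm{ul}}_1}. \tag{65}$$

Notice that $\mu^{\mathrm{ul}}_1$ has the same form as $\gamma^{\mathrm{ul}}_1$ in (7), but with $\{\hat{\mathbf{h}}_k : k = 1, 2\}$ now given by (24). Under Assumption 1 and by Lemma 3,^11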

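$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{S}^{-1} \hat{\mathbf{h}}_1 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Sigma}_1 \mathbf{S}^{-1}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_1]_{i,i}^2}{[\mathbf{S}]_{i,i} [\boldsymbol{\Lambda}]_{i,i}} \triangleq \alpha_{11} \tag{66}$$

$$\frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{S}^{-1} \hat{\mathbf{h}}_2 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Sigma}_2 \mathbf{S}^{-1}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_2]_{i,i}^2}{[\mathbf{S}]_{i,i} [\boldsymbol{\Lambda}]_{i,i}} \triangleq \alpha_{22} \tag{67}$$

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{S}^{-1} \hat{\mathbf{h}}_2 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Theta}_{21} \mathbf{S}^{-1}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_1]_{i,i} [\mathbf{R}_2]_{i,i}}{[\mathbf{S}]_{i,i} [\boldsymbol{\Lambda}]_{i,i}} \triangleq \alpha_{12}. \tag{68}$$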
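The trace simplifications in (66) and (68) rely only on the diagonal structure of $\mathbf{D}_k$, $\boldsymbol{\Lambda}$, and $\mathbf{S}$, together with the fact that $\mathbf{Q}$ and $\boldsymbol{\Lambda}$ share the same diagonal. A quick numerical confirmation is sketched below; the random covariance matrices, the power values, and the two-user form of $\mathbf{S}$ (written by analogy with $\mathbf{S}_j$ in Section III-C) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
M, rho_tr, rho_ul = 6, 10.0, 5.0   # assumed toy values

def rand_cov(rng, M):
    """Random Hermitian positive semi-definite matrix (illustrative only)."""
    A = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2)
    return A @ A.conj().T / M

R1, R2 = rand_cov(rng, M), rand_cov(rng, M)
d1, d2 = np.real(np.diag(R1)), np.real(np.diag(R2))

Q = R1 + R2 + np.eye(M) / rho_tr
lam = d1 + d2 + 1.0 / rho_tr                 # diagonal of Lambda (= diag of Q)
# Assumed two-user analogue of S_j: sum_k (D_k - D_k Lambda^{-1} D_k) + I/rho_ul.
s = (d1 - d1 ** 2 / lam) + (d2 - d2 ** 2 / lam) + 1.0 / rho_ul

D1, D2 = np.diag(d1), np.diag(d2)
Lam_inv, S_inv = np.diag(1.0 / lam), np.diag(1.0 / s)
Sigma1 = D1 @ Lam_inv @ Q @ Lam_inv @ D1
Theta21 = D2 @ Lam_inv @ Q @ Lam_inv @ D1

alpha11_trace = np.real(np.trace(Sigma1 @ S_inv)) / M
alpha11_sum = np.sum(d1 ** 2 / (s * lam)) / M
alpha12_trace = np.real(np.trace(Theta21 @ S_inv)) / M
alpha12_sum = np.sum(d1 * d2 / (s * lam)) / M
```

The trace form and the element-wise sum agree, so $\alpha_{11}$ and $\alpha_{12}$ can be evaluated from the covariance diagonals alone.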

By applying the same line of reasoning as when analyzing $\gamma^{\mathrm{ul}}_1$ in Appendix B and exploiting the fact that $\liminf_M \alpha_{22} > 0$ (which follows from Assumption 1), we obtain $\frac{\mu^{\mathrm{ul}}_1}{M} = \frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{S} \right)^{-1} \hat{\mathbf{h}}_1 \asymp \upsilon_1 \triangleq \alpha_{11} - \frac{\alpha_{12}^2}{\alpha_{22}}$. Note that $\liminf_M \upsilon_1 > 0$ under Assumption 3. This can be proved, as done in Appendix B for $\delta_1$, by expanding the condition reported in the corollary below (the proof unfolds from the same arguments as in Appendix C).

(^11) The expressions in (66)–(68) have been simplified by utilizing the fact that Q and Λ have the same diagonal elements and Rk and Dk have the same diagonal elements, for k = 1, 2.

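Corollary 2. If Assumption 3 holds, then for $\boldsymbol{\lambda} = [\lambda_1, \lambda_2]^{T} \in \mathbb{R}^2$ and $i = 1, 2$,

$$\liminf_{M} \inf_{\{\boldsymbol{\lambda} : \lambda_i = 1\}} \frac{1}{M} \mathrm{tr}\left( \boldsymbol{\Lambda}^{-1} \mathbf{Q} \boldsymbol{\Lambda}^{-1} \left( \lambda_1 \mathbf{D}_1 + \lambda_2 \mathbf{D}_2 \right) \mathbf{S}^{-1} \left( \lambda_1 \mathbf{D}_1 + \lambda_2 \mathbf{D}_2 \right) \right) > 0.$$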

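As for $\tilde{\mu}^{\mathrm{ul}}_1$ in (65), we have that

$$\frac{1}{M} \tilde{\mu}^{\mathrm{ul}}_1 = \frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{S} \right)^{-1} \mathbf{h}_1 = \frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{S}^{-1} \mathbf{h}_1 - \frac{\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{S}^{-1} \hat{\mathbf{h}}_2 \, \frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{S}^{-1} \mathbf{h}_1}{\frac{1}{M} + \frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{S}^{-1} \hat{\mathbf{h}}_2} \asymp \upsilon_1 \tag{70}$$

since the diagonal structure of the matrices $\boldsymbol{\Lambda}$, $\mathbf{D}_1$, $\mathbf{D}_2$, and $\mathbf{S}$ implies that

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{S}^{-1} \mathbf{h}_1 \asymp \frac{1}{M} \mathrm{tr}(\mathbf{R}_1 \boldsymbol{\Lambda}^{-1} \mathbf{D}_1 \mathbf{S}^{-1}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_1]_{i,i}^2}{[\mathbf{S}]_{i,i} [\boldsymbol{\Lambda}]_{i,i}} = \alpha_{11} \tag{71}$$

$$\frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{S}^{-1} \mathbf{h}_1 \asymp \frac{1}{M} \mathrm{tr}(\mathbf{R}_1 \boldsymbol{\Lambda}^{-1} \mathbf{D}_2 \mathbf{S}^{-1}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_1]_{i,i} [\mathbf{R}_2]_{i,i}}{[\mathbf{S}]_{i,i} [\boldsymbol{\Lambda}]_{i,i}} = \alpha_{12}. \tag{72}$$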

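Applying these results to (65) yields $\mathbf{v}_1^{H} \mathbf{h}_1 \asymp 1$ from which it follows that $|\mathbb{E}\{\mathbf{v}_1^{H} \mathbf{h}_1\}|^2 \asymp 1$. Next, consider the noise term $\frac{1}{\rho^{\mathrm{ul}}} \mathbb{E}\{\|\mathbf{v}_1\|^2\}$ for which

$$\|\mathbf{v}_1\|^2 = \frac{\hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{S} \right)^{-2} \hat{\mathbf{h}}_1}{\left( 1 + \mu^{\mathrm{ul}}_1 \right)^2} = \frac{1}{M} \frac{\frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{S} \right)^{-2} \hat{\mathbf{h}}_1}{\left( \frac{1}{M} + \frac{1}{M} \mu^{\mathrm{ul}}_1 \right)^2}. \tag{77}$$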

Under Assumption 1 and by Lemma 3,

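$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{S}^{-2} \hat{\mathbf{h}}_1 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Sigma}_1 \mathbf{S}^{-2}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_1]_{i,i}^2}{[\mathbf{S}]_{i,i}^2 [\boldsymbol{\Lambda}]_{i,i}} \triangleq \alpha'_{11} \tag{78}$$

$$\frac{1}{M} \hat{\mathbf{h}}_2^{H} \mathbf{S}^{-2} \hat{\mathbf{h}}_2 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Sigma}_2 \mathbf{S}^{-2}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_2]_{i,i}^2}{[\mathbf{S}]_{i,i}^2 [\boldsymbol{\Lambda}]_{i,i}} \triangleq \alpha'_{22} \tag{79}$$

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \mathbf{S}^{-2} \hat{\mathbf{h}}_2 \asymp \frac{1}{M} \mathrm{tr}(\boldsymbol{\Theta}_{21} \mathbf{S}^{-2}) = \frac{1}{M} \sum_{i=1}^{M} \frac{[\mathbf{R}_1]_{i,i} [\mathbf{R}_2]_{i,i}}{[\mathbf{S}]_{i,i}^2 [\boldsymbol{\Lambda}]_{i,i}} \triangleq \alpha'_{12} \tag{80}$$

where $\alpha'_{11}$, $\alpha'_{22}$, and $\alpha'_{12}$ are non-negative real-valued scalars. By applying Lemma 5 twice to the numerator in (77) and by using the above results, we obtain

$$\frac{1}{M} \hat{\mathbf{h}}_1^{H} \left( \hat{\mathbf{h}}_2 \hat{\mathbf{h}}_2^{H} + \mathbf{S} \right)^{-2} \hat{\mathbf{h}}_1 \asymp \alpha'_{11} - 2 \frac{\alpha_{12} \alpha'_{12}}{\alpha_{22}} + \frac{\alpha_{12}^2 \alpha'_{22}}{(\alpha_{22})^2} \triangleq \upsilon'_1. \tag{81}$$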

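Corollary 3. If Assumption 5 holds, then for any UE $k$ in cell $j$ with $\boldsymbol{\lambda}_{jk} = [\lambda_{j1k}, \ldots, \lambda_{jLk}]^{T} \in \mathbb{R}^{L}$ and $l' = 1, \ldots, L$,

$$\liminf_{M} \inf_{\{\boldsymbol{\lambda}_{jk} : \lambda_{jl'k} = 1\}} \frac{1}{M} \mathrm{tr}\left( \mathbf{Q}_{jk}^{-1} \left( \sum_{l=1}^{L} \lambda_{jlk} \mathbf{R}_{jlk} \right) \mathbf{Z}_j^{-1} \left( \sum_{l=1}^{L} \lambda_{jlk} \mathbf{R}_{jlk} \right) \right) > 0 \tag{89}$$

and the matrix $\mathbf{C}_{jk}$ is invertible as $M \to \infty$.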

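Since $\mathbf{C}_{jk}$ is invertible as $M \to \infty$ under Assumption 5, we have that $\frac{\gamma^{\mathrm{ul}}_{jk}}{M}$ in (74) is such that

$$\frac{\gamma^{\mathrm{ul}}_{jk}}{M} \asymp \delta_{jk} \triangleq \beta_{jj,jk} - \mathbf{b}_{jk}^{H} \mathbf{C}_{jk}^{-1} \mathbf{b}_{jk}. \tag{90}$$

Expanding condition (89) in Corollary 3 for $l' = j$ and using the definitions of $\mathbf{b}_{jk}$ and $\mathbf{C}_{jk}$ yield

$$\liminf_{M} \inf_{\boldsymbol{\lambda}_{jk}} \left( \beta_{jj,jk} + 2 \boldsymbol{\lambda}_{jk}^{T} \mathbf{b}_{jk} + \boldsymbol{\lambda}_{jk}^{T} \mathbf{C}_{jk} \boldsymbol{\lambda}_{jk} \right) > 0 \tag{91}$$

with $\boldsymbol{\lambda}_{jk} = [\lambda_{j1k}, \ldots, \lambda_{j(j-1)k}, \lambda_{j(j+1)k}, \ldots, \lambda_{jLk}]^{T} \in \mathbb{R}^{L-1}$. The invertibility of $\mathbf{C}_{jk}$ as $M \to \infty$ ensures that the infimum exists for sufficiently large $M$ and that it is given by

$$\inf_{\boldsymbol{\lambda}_{jk}} \left( \beta_{jj,jk} + 2 \boldsymbol{\lambda}_{jk}^{T} \mathbf{b}_{jk} + \boldsymbol{\lambda}_{jk}^{T} \mathbf{C}_{jk} \boldsymbol{\lambda}_{jk} \right) = \beta_{jj,jk} - \mathbf{b}_{jk}^{T} \mathbf{C}_{jk}^{-1} \mathbf{b}_{jk} = \delta_{jk} \tag{92}$$

where the infimum is attained by $\boldsymbol{\lambda}_{jk} = -\mathbf{C}_{jk}^{-1} \mathbf{b}_{jk}$. Substituting (92) into (91) implies that $\liminf_M \delta_{jk} > 0$. Therefore, $\gamma^{\mathrm{ul}}_{jk}$ grows a.s. unboundedly and this implies that $\mathrm{SE}^{\mathrm{ul}}_{jk}$ grows unboundedly as $M \to \infty$, which can be proved as done in the last paragraph of Appendix B.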
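The minimization behind (92) is that of a convex quadratic form. As a small numerical illustration (with an arbitrary real positive definite matrix standing in for $\mathbf{C}_{jk}$, a random vector for $\mathbf{b}_{jk}$, and an arbitrary scalar for $\beta_{jj,jk}$, all of which are assumptions), the stationary point is found by setting the gradient $2\mathbf{b} + 2\mathbf{C}\boldsymbol{\lambda}$ to zero:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3                                   # plays the role of L - 1 (assumed)

B = rng.standard_normal((n, n))
C = B @ B.T + np.eye(n)                 # real symmetric positive definite
b = rng.standard_normal(n)
beta = 5.0                              # arbitrary scalar for illustration

def objective(lam):
    """beta + 2 lam^T b + lam^T C lam."""
    return beta + 2.0 * lam @ b + lam @ C @ lam

lam_star = -np.linalg.solve(C, b)       # stationary point: C lam = -b
min_value = beta - b @ np.linalg.solve(C, b)

# Random perturbations around the minimizer never decrease the objective.
perturbed = [objective(lam_star + 0.5 * rng.standard_normal(n)) for _ in range(100)]
```

Evaluating `objective(lam_star)` should reproduce `min_value`, i.e. $\beta - \mathbf{b}^{T} \mathbf{C}^{-1} \mathbf{b}$, and every perturbed point should give a value at least as large.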

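APPENDIX G – PROOF OF COROLLARY 3 IN APPENDIX F

The argument of the left-hand side of (89) can be lower bounded by

$$\frac{\frac{1}{M} \left\| \sum_{l=1}^{L} \lambda_{jlk} \mathbf{R}_{jlk} \right\|_F^2}{\left( \frac{1}{\rho^{\mathrm{tr}}} + \left\| \sum_{l=1}^{L} \mathbf{R}_{jlk} \right\|_2 \right) \left( \frac{1}{\rho^{\mathrm{ul}}} + \left\| \sum_{l=1}^{L} \sum_{i=1}^{K} (\mathbf{R}_{jli} - \boldsymbol{\Phi}_{jli}) \right\|_2 \right)} \tag{93}$$

by applying Lemma 4 twice. Notice that the denominator is bounded due to Assumption 4 and independent of $\{\lambda_{jlk}\}$. Therefore, if (35) holds, it follows from (93) that (89) also holds. We now exploit (89) to prove that $\mathbf{C}_{jk}$ is invertible for sufficiently large $M$. To this end, observe that $\mathbf{C}_{jk}$ with entries given by (88) is a Gramian matrix obtained as the inner products of the vectors $\{\mathbf{u}_{jlk} = \mathrm{vec}\left( \frac{1}{\sqrt{M}} \mathbf{Z}_j^{-1/2} \mathbf{R}_{jlk} \mathbf{Q}_{jk}^{-1/2} \right) : \forall l \neq j\}$. Therefore, as $M$ grows large the matrix $\mathbf{C}_{jk}$ is invertible if and only if the vectors $\{\mathbf{u}_{jlk} : \forall l \neq j\}$ are asymptotically linearly independent. Notice that the condition (89) in Corollary 3 for $l' = j$ can be rewritten in compact form as

$$\liminf_{M} \inf_{\{\boldsymbol{\lambda}_{jk} : \lambda_{jjk} = 1\}} \left( \mathbf{u}_{jjk} + \sum_{l \neq j} \lambda_{jlk} \mathbf{u}_{jlk} \right)^{H} \left( \mathbf{u}_{jjk} + \sum_{l \neq j} \lambda_{jlk} \mathbf{u}_{jlk} \right) > 0 \tag{94}$$

which implies that the vectors $\{\mathbf{u}_{jlk} : \forall l\}$ are asymptotically linearly independent. Since any subset of a finite set with linearly independent vectors is also linearly independent, (94) ensures that $\{\mathbf{u}_{jlk} : \forall l \neq j\}$ are also asymptotically linearly independent. This proves that, under Assumption 5, the Gramian matrix $\mathbf{C}_{jk}$ is invertible as $M \to \infty$ and this completes the proof.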
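The link between linear independence and invertibility of a Gramian matrix can be illustrated at finite $M$. The sketch below simplifies the setting by omitting the weighting matrices $\mathbf{Z}_j^{-1/2}$ and $\mathbf{Q}_{jk}^{-1/2}$ and vectorizing the covariance matrices directly, which is an assumption made purely for illustration; the exponential correlation matrices and the helper names are likewise hypothetical.

```python
import numpy as np

def exp_corr(M, r, theta):
    """Illustrative covariance: [R]_{m,n} = r^{|n-m|} * exp(1j*(n-m)*theta)."""
    idx = np.arange(M)
    d = idx[None, :] - idx[:, None]
    return (r ** np.abs(d)) * np.exp(1j * d * theta)

def gram(mats):
    """Gramian of the vectors vec(A)/sqrt(M) for the given M x M matrices."""
    M = mats[0].shape[0]
    U = np.stack([A.reshape(-1) / np.sqrt(M) for A in mats], axis=1)
    return U.conj().T @ U

M = 6
R1, R2, R3 = (exp_corr(M, 0.5, th) for th in (0.3, 1.1, 2.0))

C_indep = gram([R1, R2, R3])             # linearly independent matrices
C_dep = gram([R1, R2, R1 + 2.0 * R2])    # third matrix is a linear combination

min_eig_indep = np.linalg.eigvalsh(C_indep).min()
min_eig_dep = np.linalg.eigvalsh(C_dep).min()
```

For the independent set the smallest eigenvalue of the Gramian stays well away from zero, so the matrix is invertible; for the dependent set it collapses to zero up to rounding.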

REFERENCES

[1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, Nov. 2010.
[2] E. G. Larsson, F. Tufvesson, O. Edfors, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Magazine, vol. 52, no. 2, pp. 186–195, Feb. 2014.
[3] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong, and J. C. Zhang, “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[4] E. G. Larsson and L. V. der Perre, “Massive MIMO for 5G,” IEEE 5G Tech Focus, vol. 1, no. 1, 2017.
[5] J. Hoydis, S. Ten Brink, and M. Debbah, “Massive MIMO in the UL/DL of cellular networks: How many antennas do we need?” IEEE J. Sel. Areas Commun., vol. 31, no. 2, pp. 160–171, Feb. 2013.
[6] K. Guo, Y. Guo, G. Fodor, and G. Ascheid, “Uplink power control with MMSE receiver in multi-cell MU-massive-MIMO systems,” in Proc. IEEE ICC, 2014, pp. 5184–5190.
[7] N. Krishnan, R. D. Yates, and N. B. Mandayam, “Uplink linear receivers for multi-cell multiuser MIMO with pilot contamination: large system analysis,” IEEE Trans. Wireless Commun., vol. 13, no. 8, pp. 4360–4373, Aug. 2014.
[8] H. Ngo, M. Matthaiou, and E. G. Larsson, “Performance analysis of large scale MU-MIMO with optimal linear receivers,” in Proc. IEEE Swe-CTW, 2012, pp. 59–64.
[9] X. Li, E. Björnson, E. G. Larsson, S. Zhou, and J. Wang, “Massive MIMO with multi-cell MMSE processing: Exploiting all pilots for interference suppression,” EURASIP Journal on Wireless Communications and Networking, no. 117, Jun. 2017.
[10] H. Yin, D. Gesbert, M. Filippou, and Y. Liu, “A coordinated approach to channel estimation in large-scale multiple-antenna systems,” IEEE J. Sel. Areas Commun., vol. 31, no. 2, pp. 264–273, Feb. 2013.
[11] A. Adhikary, J. Nam, J.-Y. Ahn, and G. Caire, “Joint spatial division and multiplexing—the large-scale array regime,” IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6441–6463, Oct. 2013.
[12] L. You, X. Gao, X.-G. Xia, N. Ma, and Y. Peng, “Pilot reuse for massive MIMO transmission over spatially correlated Rayleigh fading channels,” IEEE Trans. Wireless Commun., vol. 14, no. 6, pp. 3352–3366, Jun.
[13] X. Gao, O. Edfors, F. Rusek, and F. Tufvesson, “Massive MIMO performance evaluation based on measured propagation data,” IEEE Trans. Wireless Commun., vol. 14, no. 7, pp. 3899–3911, Jul. 2015.
[14] M. Li, S. Jin, and X. Gao, “Spatial orthogonality-based pilot reuse for multi-cell massive MIMO transmission,” in Proc. WCSP, 2013.
[15] H. Q. Ngo and E. Larsson, “EVD-based channel estimations for multicell multiuser MIMO with very large antenna arrays,” in Proc. IEEE ICASSP,
[16] R. Müller, L. Cottatellucci, and M. Vehkaperä, “Blind pilot decontamination,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 773–786, Oct. 2014.
[17] D. Hu, L. He, and X. Wang, “Semi-blind pilot decontamination for massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 15, no. 1, pp. 525–536, Jan. 2016.
[18] H. Yin, L. Cottatellucci, D. Gesbert, R. R. Müller, and G. He, “Robust pilot decontamination based on joint angle and power domain discrimination,” IEEE Trans. Signal Process., vol. 64, no. 11, pp. 2990–3003, Jun. 2016.
[19] J. Vinogradova, E. Björnson, and E. G. Larsson, “On the separability of signal and interference-plus-noise subspaces in blind pilot decontamination,” in Proc. IEEE ICASSP, 2016.
[20] J. Zhang, B. Zhang, S. Chen, X. Mu, M. El-Hajjar, and L. Hanzo, “Pilot contamination elimination for large-scale multiple-antenna aided OFDM systems,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 759–772, Oct. 2014.
[21] T. X. Vu, T. A. Vu, and T. Q. S. Quek, “Successive pilot contamination elimination in multiantenna multicell networks,” IEEE Wireless Commun. Lett., vol. 3, no. 6, pp. 617–620, Dec. 2014.
[22] A. Ashikhmin and T. Marzetta, “Pilot contamination precoding in multi-cell large scale antenna systems,” in IEEE International Symposium on Information Theory Proceedings (ISIT), 2012, pp. 1137–1141.
[23] L. Li, A. Ashikhmin, and T. Marzetta, “Pilot contamination precoding for interference reduction in large scale antenna systems,” in Allerton, 2013, pp. 226–232.
[24] E. Björnson, J. Hoydis, and L. Sanguinetti, “Pilot contamination is not a fundamental asymptotic limitation in massive MIMO,” in Proc. IEEE ICC, 2017.
[25] D. Neumann, M. Joham, and W. Utschick, “On MSE based receiver design for massive MIMO,” in Proc. SCC, 2017.
[26] T. L. Marzetta, E. G. Larsson, H. Yang, and H. Q. Ngo, Fundamentals of Massive MIMO. Cambridge University Press, 2016.
[27] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1993.
[28] E. Björnson, M. Bengtsson, and B. Ottersten, “Optimal multiuser transmit beamforming: A difficult problem with a simple solution structure,” IEEE Signal Process. Mag., vol. 31, no. 4, pp. 142–148, Jul. 2014.
[29] X. Gao, O. Edfors, F. Tufvesson, and E. G. Larsson, “Massive MIMO in real propagation environments: Do all antennas contribute equally?” IEEE Trans. Commun., vol. 63, no. 11, pp. 3917–3928, Nov. 2015.
[30] N. Shariati, E. Björnson, M. Bengtsson, and M. Debbah, “Low-complexity polynomial channel estimation in large-scale MIMO with arbitrary statistics,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 815–830, Oct. 2014.
[31] C. Sun, X. Gao, S. Jin, M. Matthaiou, Z. Ding, and C. Xiao, “Beam division multiple access transmission for massive MIMO communications,” IEEE Trans. Commun., vol. 63, no. 6, pp. 2170–2184, Jun. 2015.
[32] E. Björnson, L. Sanguinetti, and M. Debbah, “Massive MIMO with imperfect channel covariance information,” in Proc. ASILOMAR, 2016.
[33] S. Haghighatshoar and G. Caire, “Massive MIMO pilot decontamination and channel interpolation via wideband sparse channel estimation,” CoRR, vol. abs/1702.07207, 2017.
[34] S. Loyka, “Channel capacity of MIMO architecture using the exponential correlation matrix,” IEEE Commun. Lett., vol. 5, no. 9, pp. 369–371, Sep. 2001.
[35] R. Couillet and M. Debbah, Random matrix methods for wireless communications. Cambridge University Press, 2011.
[36] A. W. Marshall, I. Olkin, and B. C. Arnold, Inequalities: theory of majorization and its applications, ser. Springer series in statistics. New York: Springer, 2011.

Emil Björnson (S’07, M’12) received the M.S. degree in Engineering Mathematics from Lund University, Sweden, in 2007. He received the Ph.D. degree in Telecommunications from KTH Royal Institute of Technology, Sweden, in 2011. From 2012 to mid 2014, he was a joint postdoc at the Alcatel-Lucent Chair on Flexible Radio, SUPELEC, France, and at KTH. He joined Linköping University, Sweden, in 2014 and is currently Senior Lecturer and Docent at the Division of Communication Systems. He performs research on multi-antenna communications, Massive MIMO, radio resource allocation, energy-efficient communications, and network design. He is on the editorial board of the IEEE TRANSACTIONS ON COMMUNICATIONS and the IEEE TRANSACTIONS ON GREEN COMMUNICATIONS AND NETWORKING. He is the first author of the textbooks Massive MIMO Networks: Spectral, Energy, and Hardware Efficiency (2017) and Optimal Resource Allocation in Coordinated Multi-Cell Systems (2013). He is dedicated to reproducible research and has made a large amount of simulation code publicly available. Dr. Björnson has performed MIMO research for more than ten years and has filed more than ten related patent applications. He received the 2016 Best PhD Award from EURASIP, the 2015 Ingvar Carlsson Award, and the 2014 Outstanding Young Researcher Award from IEEE ComSoc EMEA. He has co-authored papers that received best paper awards at WCSP 2017, IEEE ICC 2015, IEEE WCNC 2014, IEEE SAM 2014, IEEE CAMSAP 2011, and WCSP 2009.

Jakob Hoydis (S’08–M’12) received the diploma degree (Dipl.-Ing.) in electrical engineering and information technology from RWTH Aachen University, Germany, and the Ph.D. degree from Supélec, Gif-sur-Yvette, France, in 2008 and 2012, respectively. He is a member of technical staff at Nokia Bell Labs, France, where he is investigating applications of deep learning for the physical layer. Prior to this position, he was co-founder and CTO of the social network SPRAED and worked for Alcatel-Lucent Bell Labs in Stuttgart, Germany. His research interests are in the areas of machine learning, cloud computing, SDR, large random matrix theory, information theory, signal processing, and their applications to wireless communications. He is a co-author of the textbook Massive MIMO Networks: Spectral, Energy, and Hardware Efficiency (2017). He is the recipient of the 2012 Publication Prize of the Supélec Foundation, the 2013 VDE ITG Förderpreis, and the 2015 Leonard G. Abraham Prize of the IEEE COMSOC. He received the IEEE WCNC 2014 best paper award and was nominated as an Exemplary Reviewer 2012 for the IEEE COMMUNICATIONS LETTERS.

Luca Sanguinetti (SM’15) received the Laurea Telecommunications Engineer degree (cum laude) and the Ph.D. degree in information engineering from the University of Pisa, Italy, in 2002 and 2005, respectively. Since 2005 he has been with the Dipartimento di Ingegneria dell’Informazione of the University of Pisa. In 2004, he was a visiting Ph.D. student at the German Aerospace Center (DLR), Oberpfaffenhofen, Germany. During the period June 2007 - June 2008, he was a postdoctoral associate in the Dept. Electrical Engineering at Princeton. During the period June 2010 - Sept. 2010, he was selected for a research assistantship at the Technische Universität München. From July 2013 to October 2017 he was with Large Systems and Networks Group (LANEAS), CentraleSupélec, Gif-sur-Yvette, France. Dr. Sanguinetti is currently serving as an Associate Editor for the IEEE SIGNAL PROCESSING LETTERS. He served as an Associate Editor for IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, and as Lead Guest Editor of IEEE JOURNAL ON SELECTED AREAS OF COMMUNICATIONS Special Issue on “Game Theory for Networks” and as an Associate Editor for IEEE JOURNAL ON SELECTED AREAS OF COMMUNICATIONS (series on Green Communications and Networking). Dr. Sanguinetti served as Exhibit Chair of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) and as the general co-chair of the 2016 Tyrrhenian Workshop on 5G&Beyond. He is a co-author of the textbook Massive MIMO Networks: Spectral, Energy, and Hardware Efficiency (2017). His expertise and general interests span the areas of communications and signal processing, game theory and random matrix theory for wireless communications. He was the co-recipient of two best paper awards: IEEE Wireless Commun. and Networking Conference (WCNC) 2013 and IEEE Wireless Commun. and Networking Conference (WCNC) 2014. He was also the recipient of the FP7 Marie Curie IEF 2013 “Dense deployments for green cellular networks”.