Typesetting math: 89%

Sync-Rank: Robust Ranking, Constrained Ranking and Rank Aggregation via Eigenvector and SDP Synchronization

Publisher: IEEE

Abstract:
We consider the classical problem of establishing a statistical ranking of a set of n items given a set of inconsistent and incomplete pairwise comparisons between such items. Instantiations of this problem occur in numerous applications in data analysis (e.g., ranking teams in sports data), computer vision, and machine learning. We formulate the above problem of ranking with incomplete noisy information as an instance of the group synchronization problem over the group SO(2) of planar rotations, whose usefulness has been demonstrated in numerous applications in recent years in computer vision and graphics, sensor network localization and structural biology. Its least squares solution can be approximated by either a spectral or a semidefinite programming (SDP) relaxation, followed by a rounding procedure. We perform extensive numerical simulations on both synthetic and real-world data sets, which show that our proposed method compares favorably to other ranking methods from the recent literature. Existing theoretical guarantees on the group synchronization problem imply lower bounds on the largest amount of noise permissible in the data while still achieving an approximate recovery of the ground truth ranking. We propose a similar synchronization-based algorithm for the rank-aggregation problem, which integrates in a globally consistent ranking many pairwise rank-offsets or partial rankings, given by different rating systems on the same set of items, an approach which yields significantly more accurate results than other aggregation methods, including Rank-Centrality, a recent state-of-the-art algorithm. Furthermore, we discuss the problem of semi-supervised ranking when there is available information on the ground truth rank of a subset of players, and propose an algorithm based on SDP which is able to recover the ranking of the remaining players, subject to such hard constraints. Finally, synchronization-based ranking, combined with a spectral technique for the de...
Published in: IEEE Transactions on Network Science and Engineering ( Volume: 3, Issue: 1, Jan.-March 1 2016)
Page(s): 58 - 79
Date of Publication: 29 January 2016
ISSN Information:
INSPEC Accession Number: 15823856
Publisher: IEEE
Funding Agency:

SECTION 1

Introduction

The problem of ranking a set of n players from incomplete and noisy cardinal or ordinal pairwise measurements has received a great deal of attention in the recent literature. Most often, the data is incomplete, especially when n is large, but also very noisy, with a large fraction of the pairwise measurements being both incorrect and inconsistent with respect to the existence of an underlying total ranking. The goal is to recover a global ranking of the players that is consistent with the given data as best as possible.

The analysis of many modern large-scale data sets implicitly requires various forms of ranking to allow for the identification of the most important entries or features. A myriad of such instances appear throughout various disciplines, including internet-related applications such as the famous Google search engine [1], eBay's feedback-based reputation mechanism [2], Amazon's Mechanical Turk system, or the Netflix recommendation system [3].

Traditional ranking methods, most often coming from the social choice theory literature, have proven less efficient in dealing with nowadays data. Most of this literature has been developed with ordinal comparisons in mind, while much of the current data deals with cardinal (numerical) scores for the pairwise comparisons, such as sports data, where the outcome of a match is often a score.

There exists a very rich literature on ranking, and it would be impossible to give a fair account of it here. Instead, we will mention some of the methods that relate to our work, and especially methods we compare against. A very common approach in the ranking literature is to treat the input rankings as data generated from a probabilistic model, and then learn the Maximum Likelihood Estimator of the input data, an idea which has been explored in depth in the machine learning community, and includes the Bradley-Terry-Luce model (BTL) [4] or the Plackett-Luce (PL) model [5]. Soufiani et al. propose an efficient Generalized Method-of-Moments algorithm for computing parameters of the PL model, by breaking the full rankings into pairwise comparisons, and then computing the parameters that satisfy a set of generalized moment conditions [6]. The recent work of [7] for ranking from a random sample of ordinal comparisons, also accounts for whether one seeks an approximately uniform quality across the ranking, or more accuracy near the top or the bottom.

The general idea of angular embedding, which we exploit in this paper, is not new, and aside from recent work by Singer in the context of the angular synchronization problem (ASP) [8], has also been explored by Yu [9], who observed that embedding in the angular space is significantly more robust to outliers when compared to embedding in the usual linear space. In the latter case, the traditional least squares formulation, or its l1 norm formulation, cannot match the performance of the angular embedding approach, thus suggesting that the key to overcoming noise comes not with imposing additional constraints on the solution, but by penalizing the inconsistencies between the measurements. Yu's proposed spectral method returns very satisfactory results in terms of robustness to noise for an image reconstruction problem. Recently, Osting et al. [10] propose an l1-norm formulation for ranking with cardinal data, as an alternative to a previous l2-norm formulation [11].

Contribution of our paper. The contribution of our proposed Sync-Rank algorithm can be summarized as follows.

  • We make an explicit connection between ranking and the angular synchronization problem, and use existing spectral and SDP relaxations for the latter problem to compute robust global rankings.

  • We perform a very extensive set of numerical simulations comparing our proposed method with existing state-of-the-art algorithms from the ranking literature, across a variety of synthetic measurement graphs and noise models, both for numerical (cardinal) and binary (ordinal) pairwise comparisons. In addition, we compare the algorithms on two real data sets: scores of soccer games in the English Premier League, and NCAA College Basketball games. Overall, we compare (in most instances, favorably) to the two recently proposed state-of-the-art algorithms, Serial-Rank [12], and Rank-Centrality [13].

  • Furthermore, we propose and compare to a very simple ranking method based on Singular Value Decomposition (SVD) (that we currently investigate in a separate ongoing work1), which may be of independent interest as its performance is comparable to that of the recent Serial-Rank algorithm, and superior to it in the setting where only the magnitude of the rank offset is available, but not its sign.

  • We propose a method for ranking in the semi-supervised setting where a subset of the players have a prescribed rank to be enforced as a hard constraint.

  • We show that our approach is applicable to the rank aggregation problem of integrating ranking information from multiple rating systems that provide independent, incomplete and inconsistent pairwise comparisons for the same set of players.

  • Finally, we show that by combining Sync-Rank with recent algorithms for the densest subgraph problem , we are able to identify planted partial rankings, which other methods are not able to extract.

The advantage of Sync-Rank stems from the fact that it is a computationally simple, algorithm that is model independent and relies exclusively on the available (ordinal or cardinal) data. Existing theoretical guarantees from the recent literature on the group synchronization problem [8], [14] trivially translate to lower bounds for the largest amount of noise permissible in the measurements that still allows for an exact or almost exact recovery. 2

The remainder of this paper is organized as follows. Section 2 summarizes related methods against which we compare. Section 3 is a review of the angular synchronization problem and existing results from the literature. Section 4 describes the Sync-Rank algorithm for ranking via eigenvector and SDP-based synchronization. Section 5 provides an extensive numerical comparison of Sync-Rank with other recent algorithms, across a variety of synthetic and real data sets. Section 6 considers the rank aggregation problem, which we solve efficiently via the same spectral and SDP relaxations of the angular synchronization problem. In Section 7 we consider the constrained ranking problem and propose a modified SDP-based synchronization algorithm for it. Section 8 summarizes several open problems related to ranking, on which we expand upon in Appendix D. Section 9 is a summary and discussion. In Appendix A, we proposes an algorithm for extracting partial rankings from comparison data. Appendix B summarizes the recent Serial-Rank algorithm. Appendix C summarizes the Rank Centrality algorithm, and our proposed version for a single rating system. In Appendix E we show recovered rankings for the English Premier League 2013-2014 season.

SECTION 2

Related Methods

2.1 Serial Rank and Generalized Linear Models

In very recent work [12], Fogel et al. propose a seriation algorithm for ranking a set of players given noisy incomplete pairwise comparisons. The gist of their approach is to assign similar rankings to players that compare similarly with all other players. They do so by constructing a similarity matrix from the available pairwise comparisons, relying on existing seriation methods to reorder the similarity matrix and thus recover the final rankings. The authors make an explicit connection between the ranking problem and another related classical ordering problem, namely seriation, where one is given a similarity matrix between a set of n items and assumes that the items have an underlying ordering on the line such that the similarity between items decreases with their distance. In other words, the more similar two items are, the closer they should be in the proposed solution. By and large, the goal of the seriation problem is to recover the underlying linear ordering based on unsorted, inconsistent and incomplete pairwise similarity information. We briefly summarize their approach in Appendix B, and note that it closely resembles our proposed SVD-Rank algorithm.

2.2 The Rank-Centrality Algorithm

Negahban et al. [13] recently proposed an iterative algorithm for the rank aggregation problem of integrating ranking information from multiple ranking systems, by estimating scores for the items from the stationary distribution of a certain random walk on the graph of items, where each edge encodes the outcome of pairwise comparisons. We summarize their approach in Appendix C, and compare against it in the setting of the rank aggregation problem. However, for the case of a single rating system, we propose and compare to a variant of their algorithm, which we also detail in the same Appendix C.

2.3 Ranking via Singular Value Decomposition

An additional ranking method we propose, and compare against, is based on the traditional Singular Value Decomposition method. The applicability of the SVD-Rank approach stems from the observation that, in the case of cardinal measurements (25), the noiseless matrix of rank offsets C is a skew-symmetric matrix of even rank 2 since

R=reTerT,(1)
View Source where e denotes the all-ones column vector. In the noisy case, C is a random perturbation of a rank-2 matrix. We consider the top two singular vectors of C, order their entries by their size, extract the resulting rankings, and choose between the first and second singular vector based on whichever one minimizes the number of upsets. Note that since the singular vectors are obtained up to a global sign, we choose the ordering which minimizes the number of upsets. Though a rather naive approach, SVD-Rank returns, under the multiplicative uniform noise model, results that are comparable to those of very recent algorithms such as Serial-Rank [12] and Rank-Centrality [13] . A previous application of SVD to ranking has been explored in Gleich [15] , for studying relational data as well as developing a method for interactive refinement of the search results. To the best of our knowledge, we are not aware of other work that considers SVD-based ranking for the setting considered in this paper. An interesting research direction we discuss in Appendix D.1, is to analyze the performance of SVD-Rank using tools from the random matrix theory literature on rank-2 deformations of random matrices.

The SVD-Rank approach is similar to that of the Serial-Rank algorithm detailed in Appendix B, since the singular vectors of C are nothing but eigenvectors of CCT, the only differences being the addition of a multiple of the all ones matrix as in (42) and working with the Fiedler vector of the resulting Laplacian matrix. Finally, we mention that in ongoing work1 we use SVD-Rank in the context of the seriation problem, where one aims to recover the global ordering of the items (up to a global sign) using only information on the absolute value of the rank offsets (see Section D.3 for additional details), and note that SVD-Rank performs significantly better than its Serial-Rank counterpart.

2.4 Ranking via Least Squares

We also compare our proposed ranking method with the more traditional least-squares approach. Assuming the number of edges in G is given by m=|E(G)|, we denote by B the edge-vertex incidence matrix of size m×n

Bij=110if(i,j)E(G),if(i,j)E(G),if(i,j)E(G),andi>jandi<j(2)
View Source and by y the vector of length m×1 which contains the pairwise rank measurements y(e)=Cij, for all edges e=(i,j)E(G). We obtain the least-squares solution to the ranking problem by solving the following minimization
minimizexRn||Bxy||22.(3)
View Source
We point out here the work of Hirani et al. [16], who show that the problem of least-squares ranking on graphs has far-reaching rich connections with various other research areas, including spectral graph theory and multilevel methods for graph Laplacian systems, and Hodge decomposition theory.

SECTION 3

The Group Synchronization Problem

Finding group elements from noisy measurements of their ratios is known as the group synchronization problem. For example, the synchronization problem over the special orthogonal group SO(d) consists of estimating a set of n unknown d×d matrices R1,,RnSO(d) from noisy measurements Rij of a subset of their pairwise ratios R1iRj. The least squares solution to synchronization aims to minimize the sum of squared deviations

minimizeR1,,RnSO(d)(i,j)EwijR1iRjRij2F,(4)
View Source where |||| denotes the Frobenius norm, and wij are non-negative weights representing the confidence in the noisy pairwise measurements Rij . Spectral and semidefinite programming (SDP) relaxations for solving an instance of the above synchronization problem were originally introduced and analyzed by Singer [8] in the context of angular synchronization, over the group SO(2) of planar rotations, where one is asked to estimate n unknown angles
θ1,,θn[0,2π),(5)
View Source
given m noisy measurements Θij of their offsets
Θij=θiθjmod2π.(6)
View Source
The difficulty of the problem is amplified on one hand by the amount of noise in the offset measurements, and on the other hand by the fact that m(n2), i.e., only a very small subset of all possible pairwise offsets are measured. In general, one may consider other groups G (such as SO(d), O(d)) for which there are available noisy measurements gij of ratios between the group elements gij=gig1j,gi,gjG. The set E of pairs (i,j) for which a ratio of group elements is available can be realized as the edge set of a graph G=(V,E), |V|=n,|E|=m, with vertices corresponding to the group elements g1,,gn, and edges to the available pairwise measurements gij=gig1j.

In [8], Singer analyzed the following noise model, where each edge in the measurement graph G is present with probability p, and each available measurement is either correct with probably 1η or a random measurement with probability η. For such a noise model with outliers, the available measurement matrix Θ is given by the mixture

Θij=θiθjU(S1)0for a correct edge,for an incorrect edge,for a missing edge,w.p.p(1η)w.p.pηw.p.1p.(7)
View Source where U(S1) denotes the uniform distribution over the unit circle. Using tools from random matrix theory, in particular rank-1 deformations of large random matrices [17], Singer showed in [8] that for the complete graph (p=1), the spectral relaxation for angular synchronization given by (12) and summarized in the next Section 3.1, undergoes a phase transition phenomenon, with the top eigenvector of the Hermitian matrix H in (9) exhibiting above random correlations with the ground truth as soon as
1η>1n.(8)
View Source
In other words, even for very small values of 1η (thus a large noise level), the eigenvector synchronization method yields a solution which (prior to the normalization in ()), is shown to exhibit above random correlations with the ground truth angles if there are enough pairwise measurements available, i.e., whenever n(1η)2 is large enough. For the general case of Erdös-Rényi graphs the above threshold probability can be extended to O(1np). We remark that while there exist no theoretical guarantees on the exactness of the eigenvector method for angular synchronization, in practice this relaxation has been observed to be nearly tight for small enough levels of noise η. On the other hand, for the SDP relaxation of the Least Unsquared Deviation (LUD) formulation [18], there do exist exact recovery guarantees under the above Erdös-Rényi graph measurement model.

3.1 Spectral Relaxation

Following the approach introduced in [8], we build the n×n sparse Hermitian matrix H=(Hij) whose elements are either zero or points on the unit circle in the complex plane

Hij={eıθij0if(i,j)Eif(i,j)E.(9)
View Source In an attempt to preserve the angle offsets as best as possible, Singer considers the following maximization problem
maximizeθ1,,θn[0,2π)i,j=1neιθiHijeιθj(10)
View Source
which gets incremented by +1 whenever an assignment of angles θi and θj perfectly satisfies the given edge constraint Θij=θiθjmod2π (i.e., for a good edge), while the contribution of an incorrect assignment (i.e., of a bad edge) will be uniformly distributed on the unit circle in the complex plane. Note that (10) is equivalent to the formulation in (4) by exploiting properties of the Frobenius norm, and relying on the fact that it is possible to represent group elements for the special case of SO(2) as complex-valued numbers. Since the non-convex optimization problem in (10) is difficult to solve computationally [19], Singer introduced the following spectral relaxation
maximizez1,,znC;ni=1|zi|2=ni,j=1nzi¯Hijzj(11)
View Source
by replacing the individual constraints zi=eιθi having unit magnitude by the much weaker single constraint ni=1|zi|2=n. Next, we recognize the resulting maximization problem in (11) as the maximization of a quadratic form whose solution is given by the top eigenvector of the Hermitian matrix H, which has an orthonormal basis over Cn, with real eigenvalues λ1λ2λn and corresponding eigenvectors v1,v2,,vn. In other words, the spectral relaxation of the non-convex optimization problem in (10) is given by
maximize||z||2=nzHz(12)
View Source
which can be solved via a simple eigenvector computation, by setting z=v1 , where v1 is the top eigenvector of H, satisfying Hv1=λ1v1, with ||v1||2=n, corresponding to the largest eigenvalue λ1. Before extracting the final estimated angles, we consider the following normalization of H using the diagonal matrix D, whose diagonal elements are given by Dii=Nj=1|Hij|, and define
H=D1H,(13)
View Source
which is similar to the Hermitian matrix D1/2HD1/2
H=D1/2(D1/2HD1/2)D1/2.(14)
View Source
Thus, H has n real eigenvalues λH1>λH2λHn with corresponding n orthogonal (complex valued) eigenvectors vH1,,vHn, with HvHi=λHivHi . We define the estimated rotation angles θ^1,...,θ^n using the top eigenvector vH1 via
eιθ^i=vH1(i)|vH1(i)|,i=1,2,,n.(15)
View Source
Note that the estimation of the rotation angles θ1,,θn is up to an additive phase since eiαvH1 is also an eigenvector of H for any αR. We point out that the only difference between the above approach and the angular synchronization algorithm in [8] is the normalization (13) of the matrix prior to the computation of the top eigenvector, considered in our previous work [20], and formalized in [21] via the notion of graph Connection Laplacian L=DH.

3.2 Semidefinite Programming (SDP) Relaxation

A second relaxation proposed in [8] as an alternative to the spectral relaxation, is via the following SDP formulation. In an attempt to preserve the angle offsets as best as possible, one may consider the following maximization problem

i,j=1neιθiHijeιθj=trace(HΥ),(16)
View Source where Υ is the unknown n×n rank-1 Hermitian matrix
Υij=eι(θiθj)(17)
View Source
with ones in the diagonal Υii,i=1,2,,n. Note that, with the exception of the rank-1 constraint, all the remaining constraints are convex and altogether define the following SDP relaxation for the angular synchronization problem
maximizeΥCn×nsubject totrace(HΥ)Υii=1Υ0,i=1,,n(18)
View Source
which can be solved via standard methods from the convex optimization literature [22] . We remark that, from a computational perspective, solving such SDP problems is computationally feasible only for relative small-sized problem (typically with several thousand unknowns, up to about n=10,000), though there exist distributed methods for solving such convex optimization problems, such as the popular Alternating Direction Method of Multipliers (ADMM) [23] which can handle large-scale problems. Alternatively, block coordinate descent methods have been shown to be extremely efficient at solving a broad class of semidefinite programs, including Max-Cut SDP relaxation and minimum nuclear norm matrix completion [24]. Finally, the very recent work of Boumal [25] also addresses the issue of solving large-scale SDP problems via a Riemannian optimization approach.

As pointed out in [8], this program is very similar to the well-known Goemans-Williamson SDP relaxation for the famous MAX-CUT problem of finding the maximum cut in a weighted graph, the only difference being the fact that here we optimize over the cone of complex-valued Hermitian positive semidefinite matrices, not just real symmetric matrices. Since the recovered solution is not necessarily of rank-1, the estimator is obtained from the best rank-1 approximation of the optimal solution matrix Θ via a Cholesky decomposition. We plot in Fig. 3 the recovered ranks of the SDP relaxation for the ranking problem, and point out the interesting phenomenon that, even for noisy data, under favorable noise regimes, the SDP program still returns a rank-1 solution. The tightness of this relaxation has been explained only recently in the work of Bandeira et al. [14]. The advantage of the SDP relaxation is that it explicitly imposes the unit magnitude constraint for eιθi, which we cannot otherwise enforce in the spectral relaxation.

SECTION 4

Sync-Rank: Ranking via Synchronization

We now consider the application of the angular synchronization framework [8] to the ranking problem. The underlying idea has also been considered by Yu in the context of image denoising [9], who suggested, similar to [8], to perform the denoising step in the angular embedding space as opposed to the linear space, and observed increased robustness against sparse outliers in the measurements.

Denote the true ranking of the n players by r1<r2<<rn, and assume without loss of generality that ri=i, i.e., the rank of the ith player is i. In the ideal case, the ranks can be imagined to lie on a one-dimensional line, sorted from 1 to n, with the pairwise rank comparisons given, in the noiseless case, by Cij=rirj (for cardinal measurements) or Cij=sign(rirj) (for ordinal measurements). In the angular embedding space, we consider the ranks of the players mapped to the unit circle, say fixing r1 to have a zero angle with the x-axis, and the last player rn corresponding to an angle equal to π. In other words, we imagine the n players wrapped around a fraction of the circle, interpret the available rank-offset measurements as angle-offsets in the angular space, and thus arrive at the setup of the angular synchronization problem from Section 3.

We also remark that the modulus used to wrap the players around the circle plays an important role in the recovery process. If we choose to map the players across the entire circle, this would cause ambiguity at the end points, since the very highly ranked players will be positioned very close (or perhaps even mixed) with the very poorly ranked players. To avoid the confusion, we simply choose to map3 the n players to the upper half of the unit circle [0,π].

In the ideal setting, the angles obtained via synchronization would be as shown in the left plot of Fig. 1, from where one can easily infer the ranking of the players by traversing the upper half circle in an anti-clockwise direction. However, since the solution to the angular synchronization problem is computed up to a global shift (see for example the top right and the bottom left plots in Fig. 1), an additional post-processing step is required to accurately extract the ordering of the players that best matches the given data. To this end, we simply compute the best circular permutation of the initial rankings obtained from synchronization, that minimizes the number of upsets in the given data. We illustrate this step with a noisy instance of Sync-Rank in the bottom of Fig. 1, where the left plot shows the rankings induced by the initial angles recovered from synchronization, while the right one shows the final ranking, after shifting by the best circular permutation.

Fig. 1.

Top: equidistant mapping of the rankings 1,2,,n in the upper half circle, for n=100, where the rank of the ith player is i (left), and the recovered solution at some random rotation (right). Bottom: the ranking induced by the θ angles from the ASP (left), and the ranking obtained after finding the optimal circular permutation (right).

Denote by s=[s1,s2,,sn] the ordering induced by the angles recovered from angular synchronization, when sorting the angles from smallest to largest, where si denotes the label of the player ranked on the ith position. For example, s1 denotes the player with the smallest corresponding angle θx1. To measure the accuracy of each candidate circular permutation σ, we first compute the pairwise rank offsets associated to the induced ranking, via

Pσ(s)=(σ(s)11σ(s))A,(19)
View Source where denotes the outer product of two vectors xy=xyT, denotes the Hadamard product of two matrices (entrywise product), and A is the adjacency matrix of the graph G . In other words, for an edge (u,v)E(G), it holds that (Pσ(s))uv=σ(s)uσ(s)v, i.e., the resulting rank offset after applying the cyclic shift.4 Next, we choose the circular permutation minimizing the l1 norm5 of the residual matrix
σ=arg  minσ1,,σn12sign(Pσi(s))sign(C)1(20)
View Source
which counts the total number of upsets. Note that an alternative to the above error is given by
σ=arg  minσ1,,σnPσi(s)C1(21)
View Source
which takes into account the actual magnitudes of the offsets, not just their sign. We summarize the above steps of Sync-Rank in Algorithm 1, and note that throughout the paper, we denote by SYNC (or SYNC-EIG) the version of synchronization-based ranking which relies on the spectral relaxation of the synchronization problem, and by SYNC-SDP the one obtained via the SDP relaxation. Fig. 2 is a comparison of the rankings obtained by the different methods: SVD, LS, SER, SER-GLM, RC, and SYNC-EIG, for an instance of the ranking problem given by the Erdös-Rényi measurement graph G(n=100,p=0.5) with cardinal comparisons, and outliers chosen uniformly at random with probability η={0.35,0.75}, according to the ERO(n,p,η) noise model (29). At low levels of noise, all methods yield satisfactory results, but as the noise level increases, only Sync-Rank is able to recover a somewhat accurate solution, and significantly outperforms all of the other methods in terms of the number of flips (i.e., the Kendall distance) with respect to the original ranking.

Fig. 2.

Comparison of the methods for the ERO(n=100,p=0.5,η) noise model (29) with cardinal comparisons with η=0.35 (top), η=0.75 (bottom).

SECTION Algorithm 1.

Summary of the Synchronization-Ranking (Sync-Rank) Algorithm

Require: G=(V,E) the graph of pairwise comparisons and C the n×n matrix of pairwise comparisons (rank offsets), such that whenever (i,j)E(G) we have available a (perhaps noisy) comparison between players i and j, either a cardinal comparison (Cij[(n1),(n1)]) or an ordinal comparison Cij=±1.

Map all rank offsets Cij to an angle Θij[0,2πδ) with δ[0,1), using the transformation

CijΘij:=2πδCijn1.(22)
View Source

We choose δ=12, and hence Θij:=πCijn1.

Build the n×n Hermitian matrix H with Hij=eıθij,if(i,j)E, and Hij=0 otherwise, as in (9).

Solve the angular synchronization problem via either its spectral (12) or SDP (17) relaxation, and denote the recovered solution by r^i=eıθ^i=vR1(i)|vR1(i)|,i=1,2,,n, where v1 denotes the recovered eigenvector

Extract the corresponding set of angles θ^1,,θ^n[0,2π) from r^1,,r^n.

Order the set of angles θ^1,,θ^n in increasing order, and denote the induced ordering by s=s1,,sn.

Compute the best circular permutation σ of the above ordering s that minimizes the resulting number of upsets with respect to the initial rank comparisons given by C

σ=arg  minσ1,,σn||sign(Pσi(s))sign(C)||1(23)
View Source

with P defined as in (18).

Output as a final solution the ranking induced by the circular permutation σ.

From a computational perspective, we point out that in most practical applications where the matrix of pairwise measurements is sparse, the eigenvector computation on which Sync-Rank is based can be solved in almost linear time. Every iteration of the power method is linear in the number of edges in the graph G (i.e., the number of pairwise measurements), but the number of iterations is greater than O(1), as it depends on the spectral gap.

As a final remark, we point out that existing theoretical guarantees on the angular synchronization problem [8], [14], [26] imply lower bounds on the largest amount of noise permissible in the pairwise rank offsets while still achieving exact recovery of the ground truth ranking. This is due to the fact that perfect recovery of the angles in the angular synchronization problem is not necessary for a perfect recovery of the underlying true ranking, it only suffices that the relative ordering of the angles is preserved. However, the numerical results shown in the heat map (c) of Fig. 4, corresponding to the outliers noise model on the pairwise cardinal rankings (as given by (29)), suggests that the threshold probability in the case of ranking is very similar to its analogue from angular synchronization obtained via the spectral relaxation, i.e., O(1np) obtained by Singer [8] and illustrated by the overlayed black curve. In other words, as soon as angular synchronization fails, it does so by not only estimating incorrectly the magnitude of each and every angle, but also fails in preserving the relative ordering of the angles, which is precisely what Sync-Rank relies on.

4.1 Synchronization Ranking for Ordinal Comparisons

When the pairwise comparisons are ordinal, and thus Cij=±1,ijE, all the angle offsets in the synchronization problem will have constant magnitude, which is perhaps undesirable. To this end, we report on a scoring method that associates a magnitude to each ±1 entry, and gives a more accurate description of the rank-offset between a pair of players, which ultimately constitutes the input to Sync-Rank. For an ordered pair (i,j), we define the Superiority Score of i with respect to j in a similar manner as the Smatchij score (41) used by the Serial-Rank algorithm. Let

Wij=L(i)H(j)={k|Cik=1andCjk=1},(24)
View Source where L(x) denotes the set of nodes with rank lower than x, and H(x) denotes the set of nodes with rank higher than x. The rank offset used as input for Sync-Rank is given by
Sij=WjiWij.(25)
View Source
The philosophy behind this measure is as follows. In the end, we want Sij to reflect the true rank-offset between two nodes. One can think of Wij as the number of witnesses favorable to i ( supporters of i), which are players ranked lower than i but higher than j. Similarly, Wji is the number of witnesses favorable to j ( supporters of j), which are players ranked lower than j but higher than i. The final rank offset is given by the difference of the two values Wij and Wji (though one could also perhaps consider the maximum), and Sij which can be interpreted as a proxy for rirj. We solve the resulting synchronization problem via the eigenvector method, and denote the approach by SYNC-SUP. While this approach yields rather poor results (compared to the initial synchronization method) on the synthetic data sets, its performance is comparable to that of the other methods when applied to the small sized Premier League soccer data set. However, it performs remarkably well on the NCAA basketball data set, and achieves a number of upsets that is twice as low as that of the second best method, as shown in Fig. 8 of Section 5.4. One could investigate whether there is a connection between the rather intriguing performance of SYNC-SUP and the observation that for this particular data set, the measurement graphs for each season are somewhat close to having a one dimensional structure, with teams grouped in leagues according to their strengths, and it is less likely for a highly ranked team to play a match against a very weak team. It would be interesting to investigate the performance of all algorithms on a measurement graph that is a disc graph with nodes lying on a one-dimensional line corresponding to the ground truth ranks, with an edge between two nodes if and only if they are at most r units apart.

SECTION 5

Noise Models and Experimental Results

We compare the performance of Sync-Rank with that of other methods across a variety of measurement graphs, varying parameters such as the number of nodes, edge density, level of noise and noise models. We detail below the two noise models we consider, and then proceed with the performance outcomes of extensive numerical simulations.

5.1 Measurement and Noise Models

In most real world scenarios, noise is inevitable and imposes additional significant challenges, aside from those due to sparsity of the measurement graph. To each player i (corresponding to a node of G), we associate a corresponding unique positive integer weight ri. For simplicity, one may choose to think of ri as taking values in {1,2,,n}, and assume there is an underlying ground truth ranking of the n players, with the most skillful player having rank 1, and the least skillful one having rank n. We denote by

Cardinalmeasurement:Oij=rirj(26)
View Source the ground truth cardinal rank-offset of a pair of players. Thus, for cardinal comparisons, the available measurements Cij are noisy versions of the ground truth Oij entries. On the other hand, an ordinal measurement is a pairwise comparison between two players which reveals only who the higher-ranked player is, or in the case of items, the preference relationship between the two items, i.e., Cij=1 if item j is preferred to item i, and Cij=1 otherwise
Ordinalmeasurement:Oij=sign(rirj)(27)
View Source
without revealing the intensity of the preference relationship. This setup is commonly encountered in classical social choice theory, under the name of Kemeny model, where Oij takes value 1 if player j is ranked higher than player i, and 1 otherwise. Given (perhaps noisy) versions of the pairwise cardinal or ordinal measurements Oij given by (25) or (26) , the goal is to recover an ordering (ranking) of all players that is as consistent as possible with the given data. We compare our proposed ranking methods with those summarized in Section 2 , on measurement graphs of varying edge density, under two different noise models, and for both ordinal and cardinal comparisons.

To test for robustness against incomplete measurements, we use a measurement graph G of size n given by the popular Erdös-Rényi model G(n,p), where edges between the players are present independently with probability p. We consider two different values of p={0.2,1}, the latter case corresponding to the complete graph Kn on n nodes, when comparisons between all possible pairs of players are available. To test the robustness against noise, we consider the following two noise models detailed below. We remark that, for both models, noise is added such that the resulting measurement matrix remains skew-symmetric.

5.1.1 Multiplicative Uniform Noise (MUN) Model

In the Multiplicative Uniform Noise model, which we denote by MUN(n,p,η), noise is multiplicative and uniform, meaning that, for cardinal measurements, instead of the true rank-offset measurement Oij=rirj, we actually measure

Cij=Oij(1+ϵ),whereϵ[η,η].(28)
View Source An equivalent formulation is that to each true rank-offset measurement we add random noise ϵij (depending on Oij) uniformly distributed in [ηOij,ηOij], i.e.,
Cij={rirj+ϵij0for an existing edgefor a missing edgew. p.pw. p.1p(29)
View Source
with ϵijU([η(rirj),η(rirj)]), where U denotes the discrete uniform distribution. Note that we cap the erroneous measurements at n1 in absolute value. Thus, whenever Cij>n1 we set it to n1, and whenever Cij<(n1) we set it to (n1), since the furthest away two players can be is n1 positions. The percentage noise added is 100η, (e.g., η=0.1 corresponds to 10 percent noise). Thus, if η=50%, and the ground truth rank offset Oij=rirj=10, then the available measurement Cij is a random number in [5,15].

5.1.2 Erdös-Rényi Outliers (ERO) Model

The second noise model we consider is an Erdös-Rényi Outliers model, abbreviated by ERO(n,p,η), where the available measurements are given by the following mixture

Cij=rirjU [(n1),n1]0w.p.(1η)pw.p.ηpw.p.1p.(30)
View Source Note that this noise model is similar to the one considered by Singer [8], in which angle offsets are either perfectly correct with probability 1η , and randomly distributed on the unit circle, with probability η . We expect that in most practical applications, the first noise model (MUN) is perhaps more realistic, where most rank offsets are being perturbed by noise (to a smaller or larger extent, proportionally to their magnitude), as opposed to having a combination of perfectly correct rank-offsets and randomly chosen rank-offsets.

TABLE 1 Names of the Algorithms, Their Acronyms, and Respective Sections

5.2 Numerical Comparison on Synthetic Data Sets

This section compares the spectral and SDP relaxations of Sync-Rank with the other methods summarized in Section 2, and also in Table 1. We measure the accuracy of the recovered solutions, using the popular Kendall distance, i.e., the fraction of (ordered) pairs of candidates that are ranked in different order (flips), in the two permutations, the original one and the recovered one. We compute the Kendall distance on a logarithmic scale (log2), and remark that in the error curves, for small noise levels, the error levels for certain methods are missing due to being equal to 0.

The heatmaps in Fig. 4 compute the recovery error, as given by the Kendall distance between the ground truth and the rankings obtained via the spectral relaxation of Sync-Rank, as we vary the sparsity p in the measurement graph and the noise level in the data, for both cardinal and ordinal measurements, and under both the MUN (28) and ERO (29) noise models. For the ERO model, we overlay on the heatmap the curve given by the threshold probability for recovery O (1np) by Singer [8], and note that in the cardinal case, it matches very well the empirical performance.

In Fig. 5 we compare the methods for the case of cardinal pairwise measurements, while in Fig. 6 we do so for ordinal measurements. We average all the results over 20 experiments. The SYNC and SYNC-SDP methods are far superior to the other methods in the case of cardinal measurements, and similar to the other methods for ordinal measurements. In Fig. 3 we plot the recovered rank of the SDP program (17). Note that for favorable levels of noise, the SDP solution is indeed of rank 1 even if we did not specifically enforce this constraint, a phenomenon explained only recently in the work of Bandeira et al. [14], who investigated the tightness of this SDP relaxation.

Fig. 3.

The rank of the recovered solution from the SDP (17), for ordinal ranking, as we vary the noise level, for both the MUN (n=200,p,η ) and ERO(n=200,p,η) noise models. We average the results over 20 runs. Note that if the SDP relaxation admits a admits a rank-one solution, then the relaxation is tight and the synchronization problem was solved optimally.

5.3 Numerical Comparison on the English Premier League Data Set

The first real data set we consider is from sports data, in particular, several years of match outcomes from the English Premier League Soccer. We consider the three complete seasons 2011-2014, both home ( Chome) and away (Caway) games. We pre-process the data and consider four methods to extract information from the game outcomes and build the pairwise comparison matrix C

  1. Total-Point-Difference Ctpdij=Chomeij+Cawayij: for each pair of teams, we aggregate the total score of the two matches between teams i and j

  2. Sign-Total-Point-Difference Cstpdij=sign(Ctpdij)=sign(Chomeij+Cawayij): considers the winner after aggregating the above score

  3. Net Wins Cnwij=sign(Chomeij)+sign(Cawayij): the number of times a team has beaten the other one.

  4. Sign(Net Wins) Csnwij=sign(Cnwij) : only considers the winner in terms of the number of victories.

Note that Ctpd and Cnw lead to cardinal measurements, while Cstpd and Csnw to ordinal ones. One may interpret the above variations of the pre-processing step as a different winning criterion for the game under consideration, and at the same time, as a different opportunity to test and compare the algorithms. We remind the reader that for soccer games, a win is compensated with 3 points, a tie with 1 point, and a loss with 0 points, and the final ranking of the teams is determined by the cumulative points a team gathers throughout the season. Given a particular criterion, we are interested in finding an ordering that minimizes the number of upsets. We denote by r^i the estimated rank of player i as computed by the method of choice. Recall that lower values of r^i correspond to higher ranks (better players or preferred items). We then construct the induced (possibly incomplete) matrix of induced pairwise rank-offsets C^ij=r^ir^j,if(i,j)E(G), and 0 otherwise, and remark that C^ij<0 denotes that the rank of player i is higher than the rank of player j. To measure the accuracy of a proposed reconstruction, we rely on the popular metric

Q(u)=i=1n1j=i+1n1{sign(CijC^ij)=1}(31)
View Source which counts the number of disagreeing ordered comparisons. It contributes with a +1 to the summation whenever the ordering in the provided data contradicts the ordering in the induced ranking.

We show in Table 2 the rankings obtained by the different methods we have experimented with, for the 2013-2014 Premier League Season, when the input is based on the Cnw measurements (Net Wins). The final column, denoted by GT, shows the final official ranking at the end of the season. We sort the teams alphabetically, and show their assigned rank by each of the methods. The very last row in the Table computes the Kendall correlation between the respective ranking and the official final standing GT.

TABLE 2 English Premier League Standings 2013-2014, Based on the Input Matrix Cnw (which Counts the Number of Net Wins between a Pair of Teams)

We remark that, across the different type of inputs considered, LS, SYNC and SYNC-SDP correlate best with the official ranking. In terms of the number of upsets, the synchronization based methods alternate the first place in terms of quality with SER, depending on the pre-processing step and the input used for the ranking procedure. In addition, we show in Fig. 7 the Q-scores associated to three recent seasons in Premier League: 2011-2012, 2012-2013, and 2013-2014, together with the mean across these three seasons. We show the number of upsets in Fig. 7 and note that for

  • Cnw: SYNC-SDP is the best performer, while the second place is tied by SYNC, SYNC-SUP and RC

  • Csnw: LS, SYNC and SYNC-SDP are all tied for the first place

  • Ctpd: SYNC-SUP is by far the best performer, followed by SER

  • Cstpd: LS, SYNC and SYNC-SDP are all tied for the first place, followed by SER and SER-GLM on the second place.

Overall, we can conclude that, already for a very small example of size only n=20, the synchronization-based methods show superior performance especially in the case of cardinal inputs Cnw and Ctpd.

5.4 Numerical Comparison on the NCAA College Basketball Data Set

Our second real data set contains the outcomes of NCAA College Basketball matches during the regular season, for each of the past 30 seasons over the interval 1985-2014. During the regular season, most often it is the case that a pair of teams play against each other at most once. For example, during the 2014 season, for which 351 teams competed, there were a total of 5,362 games, with each team playing on average about 30 games. We remark that in the earlier years, there were significantly fewer teams participating in the season, and therefore games played, which also explains the increasing number of upsets, given by (30), for the more recent years, as shown in Fig. 8a. For example, during the 1985 season, only 282 teams participated, playing a total of 3,737 games. In Fig. 8 a we compare the performance of all methods across the years, for the case of both cardinal and ordinal measurements, i.e., when we consider the point difference or simply the winner of the game. Similarly, in Fig. 8b we compute the average number of upsets across all years in the interval 1985-2014, for both cardinal and ordinal measurements. Note that, for the less frequent cases when two teams play against each other more than once, we simply average out the available scores. We remark that the SYNC-SUP method, i.e., eigenvector-based synchronization based on the superiority score from Section 4.1, performs remarkably well and achieves a number of upsets which is half that of the next best performing methods. The second best results are obtained (in no particular order) by LS, SYNC, SYNC-SDP and RC, which all yield very similar results, while SVD and SER are clearly the worst performers. We left out from the simulations the SER-GLM method due to its O(n3) computational running time.

SECTION 6

Rank Aggregation

In many practical instances, it is often the case that multiple voters or rating systems provide incomplete and inconsistent rankings or pairwise comparisons between the same set of players or items (e.g., at a contest, two players compete against each other, and k judges decide what the score should be). In such settings, a natural question to ask is, given a list of k (possibly incomplete) matrices of size n×n corresponding to ordinal or cardinal rank comparisons on a set of n players, how should one aggregate all the available data and produce a single global ranking, that is as consistent as possible with the observed data? The difficulty of the problem is amplified on one hand by the sparsity of the measurements, since each rating system only provides a comparison for a small set of pairs of items, and on the other hand by the amount of inconsistent information in the provided individual preferences. For example, given a set of 3 items A, B, C, the first judge may prefer A to B, the second one B to C, and the third one C to A.

Assume there are k judges, each of which makes available a noisy incomplete matrix C(i),i=1,,k on the pairwise comparisons between a (perhaps randomly chosen) set of pairs of players. In practice, it is not necessarily the case that the same set of players appears in each comparison matrix C(i); however, for simplicity, we do assume this is the case. Furthermore, the pattern of pairwise existing and missing comparisons may be, and usually is, different across the above sequence of comparison matrices. We denote by Θ(i),i=1,,k the resulting matrices of angle offsets after the mapping (21), and by H(i),i=1,,k, the corresponding matrices after the mapping (9).

Tentative approach A. One possible naive approach to this problem would be to solve in parallel, via a method of choice, each of the C(i),i=1,,k matrices, thus producing k possible rankings, one for each rating system, average the resulting rankings and consider the induced ordering by each method. In Fig. 9 we denote by SVD-PAR, LS-PAR, SER-PAR, SYNC-PAR, SER-GLM-PAR, SYNC-SDP-PAR this approach for rank aggregation which runs, individually on each matrix C(i),i=1,,k the various methods considered so far, and finally averages6 out the obtained rankings across all rankings proposed by a given method.

Tentative approach B. Another approach would be to first average out all the available matrices C(i),i=1,,k into C¯=C(1)+C(2)++C(k)k, and extract a final ranking by whatever preferred method that takes as input matrix C¯ , which we denote by Θ¯ after applying the transformation (21). In Fig. 9 we denote by SVD-AVG, LS-AVG, SER-AVG, SYNC-AVG, SER-GLM-AVG, SYNC-SDP-AVG the approach for rank aggregation which runs the various methods on C¯.

6.1 Rank Aggregation via Synchronization

A less naive approach to the above rank aggregation problem would be to consider the block diagonal matrix of size N×N, with N=nk, given by CN×N=blkdiag(C(1),C(2),,C(k)), and its counterpart, the Hermitian matrix HN×N obtained after mapping to the circle. Denote by Ai,i=1,,n the set of indices corresponding to the same player i in C. For simplicity, assume A1={1,n+1,2n+1,,(k1)n+1}, and in general, Ai={i,n+i,2n+i,,(k1)n+i} . As a first step, one could consider the SDP relaxation (17) which allows us to easily incorporate the hard constraints Υij=1 if i,jAu,u=1,,k, and add a post-processing step to ensure that the resulting top eigenvector vΥ1 is piecewise constant along the sets of indices Au that represent the same players (since after the rank-1 projection of the solution Υ, the imposed hard constraints are no longer exactly satisfied). However, the size of the resulting SDP is prohibitive nk×nk, and one may wonder if is possible to reduce its size to n×n, since there is a lot of redundant information in the formulation, with many subsets of the rows being equal, since they correspond to the same player. For computational reasons, it is also desirable to solve the rank aggregation problem via the eigenvector method. Indeed this is the case, and this is based on the observation that the objective function for the rank-aggregation problem can be written as

ijE(G)eιθi(u=1keιΘ(u)ij)eιθj=ijE(G)eιθiH¯ijeιθj,(32)
View Source where H¯ is given by the sum of the k Hermitian matrices H¯=ku=1H(u). The gist of this approach, leading to a much smaller-sized problem than the above SDP of size nk×nk, is that we first average out (in the complex plane) all measurements corresponding to the same pair of players into a single n×n matrix H¯, and then proceed with solving the ranking problem via the spectral and SDP relaxations of Sync-Rank, denoted by EIG-AGG, respectively SDP-AGG.

6.2 Numerical Results for Rank Aggregation

In this section, we provide numerical experiments comparing the various methods that solve the rank aggregation problem. Note that for ordinal measurements, we also compare to the original version of the Rank-Centrality algorithm, considered in [13] in the case of multiple rating systems and applicable only to ordinal measurements, whose main steps are summarized in (45), (46) and (47). We denote this approach by RCO in Fig. 9. Furthermore, for both cardinal and ordinal measurements, we also compare to our proposed versions of Rank-Centrality, discussed in Sections C.1 and C.2 for the case of a single rating system k=1. We adjust these two methods to the setting of multiple rating systems k>1, by simply averaging out the winning probabilities in (50) or (51) of each rating system, as in (48).

In the top of Fig. 9, we compare the methods in the setting of the Multiplicative Uniform Noise MUN(n=100,p,η), and average the results over 10 experiments. For cardinal data, we note the SDP-AGG method yields significantly more accurate results that the rest of the methods, followed by EIG-AGG, and the naive aggregation methods SDP-AVG and SDP-PAR. In the case of the complete graph, the Serial-Rank method SER-GLM comes in second best, together with SYNC-SDP-PAR. For ordinal data, the four methods EIG-AGG, SDP-AGG, SYNC-SDP-PAR and SYNC-SDPAVG yield very similar results, and are more accurate than the rest of the methods, especially for sparse graphs.

For the Erdös-Rényi Outliers model ERO(n=100,p,η ) with cardinal measurements, illustrated in Fig. 9 EIG-AGG, SDP-AGG are the best performers, followed very closely by SYNC-PAR and SYNC-SDP-PAR. For the case of the complete graph, the SER-GLM comes in next in terms of performance, while all the remaining methods show a rather poor relative performance. The relative performance of all methods is completely different for the case of ordinal measurents, under the ERO model, where SER-AVG and SER-GLM-AVG are the best performers, especially at lower levels of noise. Furthermore, the gap between these two methods and all the remaining ones increases as the measurement graphs become denser. For the case of the complete graph with ordinal comparisons with outliers, SER-AVG and SER-GLM-AVG produce results that are 2-3 orders of magnitude more accurate compared to all other methods, at lower levels of noise η0.2. As a final observation, we point out that, in the case of outliers given by the ERO model, our proposed version of Rank-Centrality, denoted by RC, introduced in Section C.1 and given by (50), performs far better than the original RCO method in [13].

SECTION 7

Ranking with Hard Constraints via Semidefinite Programming

Next, we consider a semi-supervised ranking problem, and illustrate how one can enforce certain types of constraints. Suppose that the user has readily available information on the true rank of a subset of players, and would like to obtain a global ranking solution which obeys the imposed ranking constraints. To this end, we propose a modified Sync-Rank algorithm based on the SDP relaxation of the ASP, followed by a post-processing step. Ranking with vertex constraints, i.e., information on the true rank of a small subset of players, is the ASP with constraints, similar to synchronization over Z2 with constraints considered in our previous work [27]. We refer to the subset of players whose rank is already known as anchors and denote this set by A, while the set of non-anchor players (referred to as free players) shall be denoted by F. In the ASP, anchors are nodes whose corresponding group element in SO(2) is known a-priori. Given our measurement graph G=(V,E) with node set V corresponding to a set of n group elements composed of h anchors A={a1,,ah} with ai SO(2), and l non-anchor elements F={f1,,fl} with fi SO(2), with n=h+l, and edge set E of size m corresponding to an incomplete set of m pairwise angle offsets of the type fifj or aifj, the goal is to estimate f^1,,f^l SO(2).

Since the ground truth rank of anchor players is known, we compute the ground truth rank offset for all pairs of anchor nodes, which we then map to the circle via (21), and enforce the resulting offsets as hard constraints in the SDP relaxation (17) , thus leading to the program in (33), where the maximization is taken over all matrices Υ0 with

Υij=eι(fifj)eι(fiaj)eι(aiaj)ifi,jFifiF,jAifi,jA(33)
View Source with ones on the diagonal Υii=1,i=1,,n
maximizeΥCn×nsubject toTrace(HΥ)Υii=1,i=1,,nΥij=eι(aiaj),ifi,jAΥ0.(34)
View Source
While in the noiseless case, the SDP in (33) may return a rank-1 one solution, for noisy data the rank could by larger than 1, and we consider the best rank-1 approximation. We use the top eigenvector v(Υ)1 of Υ to recover the estimated angles and the induced rankings, but only up to a global rotation, i.e., a circular permutation. With this observation in mind, we propose a post-processing step, depicted in Fig. 10, that ensures that the anchor players obey their prescribed rank. We keep the ranks of the anchor players fixed, and sequentially apply a cyclic permutation of the free players on the available rankings, searching for the optimal cyclic shift that minimizes the number of upsets (19). To measure the accuracy of each candidate ranking s (i.e., a permutation of size n), we first compute the associated pairwise rank offsets Z(s)=s11s. Next, we choose the circular permutation σ(i)F of the ranks of the set of free players F, which minimizes the l1 norm of the following residual matrix that counts the total number of upsets 12sign(Z(σAσ(i)F)sign(C)1, where σA denotes the permutation associated to the anchor players, which stays fixed in the above minimization. The intuition is that the SDP solution preserves as best as possible the relative ordering of all players, in particular of the free players, and does so up to a global shift, which we recover via the above procedure. Fig. 11 shows the outcome of numerical simulations of Sync-Rank for several instances of the constrained ranking problem, under the MUN model with cardinal constraints. As expected, the numerical error decreases as more anchor information becomes available.

SECTION 8

Future Research Directions

We highlight in this section several possible future research directions, which we expand in more detail in Appendix D. As seen in Section 7, adding pairwise rank-offsets as hard constraints improves the end results. A natural question to consider is how one may incorporate similar soft constraints, captured in a second matrix Q. As in (12), one can maximize the same quadratic form zHz, subject to the given constraints being satisfied well enough, i.e., zQzα, and also zz=n, which altogether lead to a generalized eigenvalue problem. However, note that one could further replace H and Q with their respective graph Connection Laplacians, which are both PSD matrices, thus rendering the problem solvable in nearly-linear time due to recent progress in the area of Laplacian linear systems solvers, as explored in our recent work [28] on the well-studied constrained clustering problem. Another interesting variation concerns the scenario in which one wishes to enforce ordinal hard constraints that player i is ranked higher than player j, i.e., r^i>r^j. Another question is whether an l1-based formulation of the ASP could further improve the robustness to noise or sparsity of the measurement graph, since the existing spectral and SDP relaxations for the ASP [8] approximate a least squares solution in the angular space. One other very interesting direction concerns the possibility of casting the ranking problem as a synchronization problem over the group SO(3), as opposed to SO(2) as in present work. The extra dimension would facilitate the extraction of partial rankings, which one can collect by traversing along longitudinal lines and their immediate neighborhoods, and perhaps further increase the overall robustness to noise. Yet another future direction would be to aim for tighter guarantees in terms of robustness to noise. Finally, we have also considered the problem of planted partial rankings, which arises in the setting when a small subset Λ of players have their pairwise measurements more accurate than the rest of the network. We propose in Appendix A a spectral algorithm based on the Laplacian of the residual rank-offset matrix given by Sync-Rank, which, unlike other methods, is able to preserve the planted partial ranking and recover Λ, as shown in Fig. 12.

SECTION 9

Summary and Discussion

We considered the problem of ranking with noisy incomplete information and made an explicit connection with the angular synchronization problem, for which spectral and SDP relaxations already exist in the literature with provable guarantees. This approach leads to a computationally efficient (as is the case for the spectral relaxation), non-iterative algorithm that is model independent and relies exclusively on the available data. We provided extensive numerical simulations, on both synthetic and real data sets that show that our proposed procedure compares favorably with state-of-the-art methods from the literature, across a variety of measurement graphs and noise models.

We took advantage of the spectral, and especially SDP, relaxations and proposed methods for ranking in the semi-supervised setting where a subset of the players have a prescribed rank to be enforced as a hard constraint. In addition, we considered the popular rank aggregation problem of combining ranking information from multiple rating systems that provide independent incomplete and inconsistent pairwise measurements on the same set of items, with the goal of producing a single global ranking.

ACKNOWLEDGMENTS

The author would like to thank the Simons Institute for their warm hospitality and fellowship support during his stay in Berkeley throughout Fall 2014, when most of this research was carried out. He is very grateful to Amit Singer for suggesting the possibility of applying angular synchronization to the ranking problem. He would like to thank Andrea Bertozzi for her support via AFOSR MURI grant FA9550-10-1-0569, Alex d’Aspremont and Fajwel Fogel for useful discussions on Serial-Rank and relevant data sets, and Xiuyuan Cheng for discussions on random matrix theory. Finally, he would also like to thank the anonymous reviewers for their thorough comments and suggestions.

Appendix A Recovering Planted Partial Rankings

Aside from its increased robustness to noise, we point out another appealing feature of the Sync-Rank algorithm. In many real-life scenarios, one is not necessarily interested in recovering the entire ranking, but only the ranking of a subset of the players. Often, the noise in the available measurements may not be distributed uniformly throughout the network, with part of the network containing pairwise comparisons that are a lot less noisy than those in the rest of the network. This asymmetry provides us with an opportunity and raises the question of whether one can recover partial rankings which are locally consistent with the given data. In this Appendix we demonstrate empirically that Sync-Rank is able to preserve such partial rankings. In addition, we extract such hidden locally consistent partial rankings, relying on a spectral algorithm in the spirit of existing algorithms for the planted clique and densest subgraph problems.

We remark that this approach can be extended to extracting multiple partial rankings by relying on recent work of Lee et al. on multi-way spectral partitioning and higher-order Cheeger inequalities [29]. Note that in the case of k partial rankings, one does not necessarily seek a partition of the residual graph into k clusters, but rather seeks k disjoint clusters, whose union is not necessarily the entire graph. The Cheeger Sweep final step of their algorithm allows one to extract k such sets of smallest expansion. For simplicity, we only consider the case of a single planted locally consistent partial ranking. We denote by Λ the set of players whose pairwise ranking measurements contain less noise than the rest of the available data, and we shall refer to such players as Λ-players, and to the remaining ones as ΛC-players.

We evaluate our proposed procedure on graphs generated from the following ensemble G(n,p,β,η1,η2), where the measurement graph G is an Erdös-Rényi random graph G(n,p), |Λ|=βn, and the noise model is given by the following mixture

Cij=rirjU[(n1),n1]rirjU[(n1),n1]0{i,j}Λ,{i,j}Λ{i,j}Λ,{i,j}Λw.p.(1η1)αw.p.η1αw.p.(1η2)αw.p.η2αw.p.1α,(35)
View Source where U denotes the discrete uniform distribution.

The approach we propose for recovering such a locally consistent partial ranking starts as follows. For a given ranking method of choice (such as the ones we have considered throughout the paper), we estimate the complete ranking r^1,,r^n, and consider the resulting matrix of rank offsets

C^=(r^11r^)A,(36)
View Source where denotes the outer product of two vectors xy=xyT, denotes the Hadamard product of two matrices (entrywise product), and A is the adjacency matrix of the graph G ; in other words C^ij=r^ir^j,(i,j)E(G) . Next, we consider the residual matrix of pairwise rank offsets
R=|CC^|(37)
View Source
for the case of cardinal measurements, and
R=|Csign(C^)|(38)
View Source
for the case of ordinal measurements. We illustrate in Fig. 13 the residual matrices obtained by each of the methods, and remark that, for ease of visualization, the set Λ consists of the first βn=75 nodes, corresponding to the top left corner of each of the residual matrices shown in the heatmap. Whenever an estimated rank-offset C^ij (induced by the recovered ranking) matches very well with the initial measurement Cij, we expect the residual Rij to have small magnitude, and conversely, whenever the recovered offset C^ij is far from the initial measurement Cij then we expect Rij to have large magnitude. Furthermore, we show in Fig. 14, the recovered rankings by each of the methods, where we separate the rankings of the ΛC-players (shown in the top subplot, in blue), from those of the Λ-players (shown in the bottom subplot, in red). Note that the set Λ is not available to the user, and at this point we do not have yet an estimate Λ^ for Λ; however this plot already highlights the fact that synchronization-based ranking preserves almost perfectly the relative ranking of the Λ-players, while all the other methods fail to do so. Note that the title of each of the bottom plots of Fig. 14 shows the Jaccard Similarity index between Λ and Λ^ (which we show next how to estimate), and the Kendall distance between the ground truth rankings of the Λ^-players, and the estimated rankings of the Λ^-players.

In practice, when the measurements between pairs of Λ-nodes also contain noise, the corresponding sub-matrix has no longer zero or close to zero residuals, and it is harder to identify such a set of nodes. The resulting task at hand is to identify a subset of nodes for which the average inter-edge weights (the residuals shown in Fig. 13) is as small as possible. This problem is equivalent7 to the well-known densest subgraph problem, investigated in the theoretical computer science literature, for the case when the graph G is unweighted. Densest-k-Subgraph (DkS) on an undirected unweighted graph G concerns finding a subset of nodes UV(G) of size |U|=k with the maximum induced average degree. Using a reduction from the Max-Clique problem, the Densest-k-Subgraph can be shown to be NP-hard to solve exactly. Feige and Seltzer [30] showed that the Densest-k-Subgraph problem is NP-complete even when restricted to bipartite graphs of maximum degree 3 . If the parameter k is not known a-priori, the problem becomes the well known Densest Subgraph problem, where one seeks to find a subset U (regardless of its size) in order to maximize the average degree induced by U. The Densest Subgraph problem can be solved in polynomial time using either linear programming or flow techniques.

In a seminal paper [31], Alon et al. considered the graph ensemble G(n,1/2,k), where one starts with an Erdös-Rényi G(n,p=1/2) and randomly places a clique of size k in G. They proposed an efficient spectral algorithm that relies on the second eigenvector of the adjacency matrix of G, which almost surely finds the clique of size k, as long as k>cn. We remark that in our case the residual graph is weighted and thus a different procedure is required, though one can perhaps think of various heuristics for thresholding the entries of W, setting to 1 all remaining weights (thus rendering the graph unweighted), and perhaps further adding edges to the graph between r-hop neighbors away pairs of nodes, with the hope of producing a clique, which we can then detect using the algorithm proposed in [31] .

Algorithm 2. Algorithm for Recovering the Densest (Weighted) Subgraph of Size k of the Residual Graph whose Adjacency Matrix is Denoted by W

Require: A weighted graph H=(V,E), and its symmetric adjacency matrix W, and k<n

Compute the random-walk Laplacian L=D1W, where D is a diagonal matrix with Dii=nj=1Wij

Find the second eigenvector v2 of L, corresponding to the second largest eigenvalue λ2<λ1=1

Sort the entries of v2 in decreasing order, consider the top k largest entries in v2, let Λd denote the set of corresponding vertices, and compute the residual error associated to Λd

ERRΛd=CΛd,ΛdC^Λd,Λd1.(39)
View Source

Sort the entries of v2 in increasing order, consider the top k largest entries in v2, let Λi denote the set of corresponding vertices, and compute the residual error associated to Λi

ERRΛi=CΛi,ΛiC^Λi,Λi1.(40)
View Source

Output the final estimate Λ^ for Λ as

Λ^=arg  minΛd,Λi{ERRΛd,ERRΛi}.(41)
View Source

However, in this work, we take none of the above approaches, and rely on another spectral algorithm to uncover the planted dense subgraph, and hence the partial ranking. We use the following approach to uncover the planted densest subgraph of given size k. We first compute a similarity between the adjacent nodes of the graph using the Gaussian Kernel

Wij=eRijϵ2,(i,j)E(G)(42)
View Source for the parameter choice ϵ2=2n/10, and consider the random-walk Laplacian L=D1W, whose trivial eigenvalue-eigenvector pair is given by λ1=1 and the all-ones vector v1=1. To extract a cluster of given size k (our proposed estimate for the set Λ), we compute the first non-trivial eigenvector v2 of L, and consider its top k largest entries to produce an estimate Λ^ for the set Λ. We detail these steps in Algorithm 2 .

Fig. 15 shows a comparison of the methods (results averaged across 15 experiments) in terms of the Jaccard Similarity Index between the ground truth set Λ and our proposed estimate Λ^ , defined as

J(Λ,Λ^)=ΛΛ^ΛΛ^,(43)
View Source meaning that for values equal to 1 the algorithm would perfectly recover the planted nodes in Λ. Furthermore, we plot in Fig. 16 the corresponding Kendall distance between the recovered ranking of the Λ^-nodes (from Fig. 15) and their ground truth values, for the ensemble given by G(n=250,p,β,η1,η2=1), for varying p={0.2,0.5,1} and η1={0,0.2,0.4}. Note that across all experiments, the SYNC-SDP algorithm is by far the best performer, both in terms of the Jaccard Index (thus being able to recover the set Λ) and the Kendall distance (thus being able to recover accurately the ranking8 of the estimated Λ^-nodes). Note that an alternative approach to estimating the ranking of the Λ^ -nodes (once Λ^ has already been estimated) would be to recompute it by re-running the method of choice on the original measurement matrix C, restricted to the sub-matrix corresponding to nodes Λ^. We expect this approach to yield more accurate results, especially in the case when the measurements between a pair of (ΛC,ΛC)-nodes, or a pair of (Λ,ΛC)-nodes are very noisy (i.e., η2 is large in the ensemble G(n,p,β,η1,η2) defined by (34)), as was the case for the numerical experiments detailed here, for which we have chosen η2=1, meaning such measurements are pure noise and chosen uniformly at random from [(n1),(n1)] .

Appendix B Serial-Rank

A spectral algorithm that exactly solves the noiseless seriation problem (and the related continuous ones problem) was proposed by Atkins et al. [32], based on the observation that given similarity matrices computed from serial variables, the ordering induced by the second eigenvector of the associated Laplacian matrix (i.e., the Fiedler vector) matches that of the variables. In other words, this approach (reminiscent of spectral clustering) exactly reconstructs the correct ordering if the items in question lie on a one dimensional chain. In [12], Fogel et al. adapt the above seriation procedure to the ranking problem, and propose an efficient polynomial-time algorithm with provable recovery and robustness guarantees. Under certain conditions on the pattern of noisy entries, Serial-Rank is able to perfectly recover the underlying true ranking, even when a fraction of the comparisons are either corrupted by noise or completely missing. In this noisy setting, when the underlying measurement graph is dense, in other words, a high fraction of all the pairwise comparisons are observed, the spectral solution obtained by Serial-Rank is more robust to noise that other classical scoring-based methods.

The authors of [12] propose two approaches for computing a pairwise similarity matrix from both ordinal and cardinal comparisons between the players. In the case of ordinal measurements, the similarity measure counts the number of matching comparisons. More precisely, given as input a skew symmetric matrix C of size n×n of pairwise comparisons Cij={1,0,1}, with Cij=Cji, given by the following model

Cij=101if i is ranked higher than jif i and j are tied, or never competedif j is ranked higher than i.(44)
View Source For convenience, the diagonal of C is set to Cii=1,i=1,2,,n. Finally, the pairwise similarity matrix is given by
Smatchij=k=1n(1+CikCjk2).(45)
View Source
Note that CikCjk=1 whenever i and j have the same signs, and CikCjk=1 whenever they have opposite signs, thus Smatchij counts the number of matching comparisons between i and j with a third reference item k. The lack of a comparison of item k with either i or j contributes with a 12 to the summation in (41). The intuition behind this similarity measure proposed by Fogel et al. is that players that beat the same players and are beaten by the same players should have a similar ranking in the final solution. Written in a compact form, the final similarity matrix is given by
Smatch=12(n11T+CCT).(46)
View Source
We summarize in Algorithm 3 the main steps of the Serial-Rank method of Fogel et al. for ranking via seriation.

Algorithm 3. Serial-Rank: An Algorithm for Apectral Ranking using Seriation [12]

Require: A set of pairwise comparisons Cij{1,0,1} or [–1,1]

Compute a similarity matrix as shown in (41)

Compute the associated graph Laplacian matrix

LS=DS,(47)
View Source

where D is a diagonal matrix D= diag (S1), i.e., Dii=nj=1Gi,j is the degree of node i in the measurement graph G.

Compute the Fiedler vector of S (eigenvector corresponding to the smallest nonzero eigenvalue of LS).

Output the ranking induced by sorting the Fiedler vector of S, with the global ordering (increasing or decreasing order) chosen such that the number of upsets is minimized.

In the generalized linear model setting, one assumes that the paired comparisons are generated according to a generalized linear model, where paired comparisons are independent, and item i is preferred to item j with probability

Pij=H(νiνj),(48)
View Source where νRn is a vector denoting the strength, rank, or skill level of the n players. In the context of the GLM model, Fogel et al. propose the following similarity matrix
Sglmi,j=k=1n1{mi,kmj,k>0}(1|Ci,kCj,k|2)+1{mi,kmj,k=0}2,(49)
View Source
where mi,k=1 if i and j played in a match, and 0 otherwise. We denote by SER-GLM the Serial-Rank algorithm based on the above GLM model given by (44). For the cardinal case, SER-GLM is roughly similar to SER, except in the complete graph case, when SER-GLM consistently outperforms SER under both the MUN and ERO noise models. We refer the reader to Figs. 5 and 6 for the numerical results showing how SER and SER-GLM compare to the existing and newly proposed methods.

Appendix C Rank Centrality

In recent work [13], Negahban et al. propose an iterative algorithm for the rank aggregation problem by estimating scores for the items from the stationary distribution of a certain random walk on the graph of items, where edges encode the outcome of pairwise comparisons. The authors propose this approach in the context of the rank aggregation problem, which, given as input a collection of sets of pairwise comparisons over n players (where each set is provided by an independent rating system) the goal is to provide a global ranking that is as consistent as possible with the given measurements of all k ranking systems. At each iteration of the random walk, the probability of transitioning from vertex i to vertex j is directly proportional to how often player j beat player i across all the matches the two players confronted, and is zero if the two players have never played a match before. In other words, the random walk has a higher chance of transitioning to a more skillful neighbors, and thus the frequency of visiting a particular node, which reflects the rank or the skill level of the corresponding players, is thus encoded in the stationary distribution of the associated Markov Chain. The authors of [13] propose the following approach for computing the Markov matrix, which we adjust to render it applicable to both ordinal and cardinal measurements, in the case of a single rating system. Note that in Section 6 where we discuss the rank aggregation problem in the context of multiple rating systems, we rely on the original version of the algorithm. For a pair of items i and j, let Y(l)ij be equal to 1 if player j beats player i, and 0 otherwise, during the lth match between the two players, with l=1,,k. The BTL model assumes that P(Y(l)ij)=wjwi+wj, where w represent the underlying vector of positive real weights associated to each player. The approach in [13] starts by estimating the fraction of times players j has defeated player i, which is denoted by

aij=1kl=1kYlij(50)
View Source as long as players i and j competed in at least one match, and zero otherwise. Next, consider the symmetric matrix
Aij=aijaij+aji,(51)
View Source
which converges to wjwi+wj, as k. To define a valid transition probability matrix, the authors of [13] scale all the edge weights by 1/dmax and consider the resulting random walk
Pij={1dmaxAij11dmaxkiAikifijifi=j,(52)
View Source
where dmax denotes the maximum out-degree of a node, thus making sure that each row sums to 1. The stationary distribution π is the top left eigenvector of P , and its entries denote the final numerical scores associated to each node, which, upon sorting, induce a ranking of the n players.

To render the above approach applicable9 in the case when k=1 of a single rating system (for both cardinal and ordinal measurement), but also when k>1 for the case of cardinal measurements, we propose the following alternatives to designing the winning probabilities aij given by (45), and inherently the final winning probability matrix A in (46). Once we have an estimate for A (given by (50) for ordinal data, respectively by (51) for cardinal data), we proceed with building the transition probability P as in (47) and consider its stationary distribution. Note that we also make these two new methods proposed below applicable to the setting of multiple rating systems k>1, by simply averaging out the resulting winning probabilities A(l)ij,l=1,,k, given by each rating system via (50) or (51), across all rating systems

Aij=1kl=1kA(l)ij,(53)
View Source and then consider the transition probability matrix P as in (47) and its stationary distribution.

Appendix C

C.1 Adjusted Rank-Centrality for Ordinal Measurements

To handle the case of ordinal measurements, we propose a hybrid approach that combines Serial-Rank and Rank-Centrality, and yields more accurate results than the former one, as it can be seen in Fig. 6. We proceed by computing the Smatch matrix as in the Serial-Rank algorithm (given by (41) and (42) in Appendix B) that counts the number of matching comparisons between i and j with other third reference items k . The intuition behind this similarity measure is that players that beat the same players and are beaten by the same players should have a similar ranking in the final solution. Note that Sijn for any pair of players10, and thus Sijn[0,1]

Note that, whenever Sij is very large, say Sijn, meaning the two players are very similar, then the quantity 1Sijn is small and close to zero, and thus a good proxy for the difference in the winning probabilities Aij and Aji defined in (46). In other words, if two players are very similar, it is unlikely that, had they played a lot of matches against each other, one player will defeat the other in most of the matches. On the other hand, if two players are very dissimilar, and thus Sij is close to zero and 1Sijn is close to one, then it must be that, had the two players met in many matches, one would have defeated the other in a significant fraction of the games. With these observations in mind, and in the spirit of (46) , we design the matrix A of winning probabilities such that

{Aij+Aji=1|AijAji|=1Sijn.(54)
View Source for a pair of players that met in a game. We lean the balance in favor of the player who won in the (single) direct match, and assign to him the larger winning probability. Keeping in mind that Aij should be a proxy for the fraction of times player j defeated player i (thus whenever Cij>0 it must be that Aij>Aji), the above system of equations (49) yields
Aij=112Sijn,12Sijn,0,ifCij>0ifCij<0ifCij=0.(55)
View Source
We remark that, in the case of outliers given by the ERO noise model (29) , our above proposed version of Rank-Centrality (denoted as RC), when used in the setting of multiple rating systems, performs much better than the original Rank-Centrality algorithm (denoted as RCO), as shown in the bottom plot of Fig. 9.

Appendix C

C.2 Adjusted Rank-Centrality for Cardinal Measurements

For the case of cardinal measurements, we propose a similar matrix A of winning probabilities, and incorporate the magnitude of the score into the entries of A. The intuition behind defining the winning probability is given by the following reasoning. Whenever Cij takes the largest possible (absolute) value (i.e., assume Cij=(n1), thus j defeats i by far), we define the winning probability that player j defeats player i to be largest possible, i.e., Aij=1, and in general, the larger the magnitude of Cij, the larger Aij should be. On the other hand, whenever Cij has the smallest possible (absolute) value (i.e., assume Cij=1), then the wining probability should be as small as possible, i.e., close to 12. With these two observations in mind, we define the winning probability matrix as

Aij=12+12Cijn1,1212Cijn1,0,ifCij>0ifCij<0ifCij=0.(56)
View Source

Appendix D Future Research Directions

We highlight in this section several possible future research directions that we believe are interesting and worth pursuing further.

Appendix D

D.1 Robustness to Noise of SVD-Based Ranking

An interesting research direction is the analysis of the robustness to noise of the SVD-based ranking method discussed in Section 2.3. We remark here that, for the Erdös-Rényi Outliers ERO(n,p,η) model (29), the following decomposition could render the SVD-Rank method amenable to a theoretical analysis using tools from the random matrix theory literature on rank-2 deformations of random matrices [33]. Note that the expected value of the entries of C is given by

ECij=(rirj)(1η)p,(57)
View Source in other words, EC is a rank-2 skew-symmetric matrix
EC=(1η)p(reTerT).(58)
View Source
Next, one may decompose the given data matrix C as
C=EC+R,(59)
View Source
where R=CEC is a random skew-symmetric matrix whose elements have zero mean and are given by
Rij=(rirj)[1(1η)p]q(rirj)(1η)p(rirj)(1η)pw.p.(1η)pw.p.ηpw.p.1p,(60)
View Source
whenever ij and where qUnif.[(n1),n1], which renders the given data matrix C decomposable into a low-rank (rank-2) perturbation of a random skew-symmetric matrix. The case p=1 which corresponds to the complete graph G=Kn simplifies (55), and is perhaps a first step towards a theoretical investigation.

Appendix D

D.2 Ranking with Soft (Edge) Constraints as a Generalized Eigenvalue Problem

We have seen in the previous Section 7 that adding hard constraints to the ranking problem improves the end results. A natural question to consider is how one may incorporate soft constraints into the problem formulation. Let H denote the Hermitian matrix of available pairwise cardinal rank-offsets after embedding in the angular space, and similarly let Q denote the Hermitian matrix of pairwise cardinal rank-offset constraints the user wishes to enforce, also after mapping to the angular embedding space. One possibility is to incorporate weights into either the spectral or SDP relaxation of the synchronization problem, in which case, the input to the synchronization formulation will be given by

H~=H+λQ,(61)
View Source where Q is the matrix of rank-offset (soft) constraints. The larger the parameter λ , the more weight is given to the available pairwise constraints, and thus the more likely they are to be satisfied in the final proposed solution. However, in certain instances one may wish to enforce a lower bound on how well the given constraints are satisfied, in which case we adjust the optimization problem as follows. We maximize the same quadratic form zHz as in (12), but subject to the condition that the given constraints are satisfied well enough, i.e., zQzα, for a user specified α, and that the sum of the absolute values of all complex numbers z_i is n \begin{equation} \begin{aligned} & \underset{\boldsymbol{z} = (z_1, \ldots, z_l) \in \mathbb {C}^n}{\text{max}} & & z^* H z & \\ & \text{subject to} & & z^* Q z \ge \alpha \\ & & & z^* z = n. \end{aligned} \end{equation}
View Source

We remark that this approach is extremely similar to the one considered in the constrained clustering algorithm proposed in [34], where the goal is to cluster a given weighted graph with adjacency matrix H, subject to soft constraints captured in a sparse matrix Q, of the form Q_{ij}=1 (respectively, Q_{ij}=-1) if nodes i and j should be (respectively, should not be) in the same cluster, and Q_{ij}=0 if no information is available. As shown in [34], one arrives at a generalized eigenvalue problem based on the matrices H and Q. However, instead of working with the above H and Q matrices (which in our setting (56) are Hermitian matrices) one could instead consider their corresponding graph Connection Laplacians, and solve the resulting generalized eigenvalue problem where both matrices are now symmetric diagonally dominant. We point out our recent work [28] , where we present a principled spectral approach to the above well-studied constrained clustering problem, and reduce it to a generalized eigenvalue problem involving Laplacian matrices that can be solved in nearly-linear time due to recent progress in the area of Laplacian linear systems solvers. In practice, this translates to a very fast implementation that consistently outperforms the approach in [34], both in terms of accuracy and running time. It would be interesting to explore whether our approach in [28] can be extended to the setting of constrained ranking.

Appendix D

D.3 (Signless) Ranking with Unsigned Cardinal Comparisons

Consider for example the scenario where one is given the score sheet of all soccer games in the England Football Premier League, recording the goal difference for each game, but without disclosing who won and who lost the game. In other words, one wishes to reconstruct the underlying rankings (up to a global reversal of the ordering) only using information on the magnitude of the rank offsets \begin{equation} C_{ij} = | r_i - r_j |. \end{equation}

View Source We remark that the resulting problem is nothing but an instance of the graph realization problem on the line, with noisy distances [20]. In practice, an instance of this problem arises in shotgun genome sequencing experiments, where the goal is to reorder substrings of cloned DNA strands (called reads) using assembly algorithms that exploit the overlap between the reads. We refer the reader to the recent work of Fogel et al. on convex relaxations for the seriation problem [35], including an application to the above shotgun gene sequencing task. We remark that preliminary numerical results indicate that SVD-based ranking performs remarkably well in this setting of recovering the ordering (up to a global sign) using information only on the magnitude of the rank offsets but not their sign, and defer this investigation to future work.

Appendix D

D.4 Ranking Both Items and Raters

Another possible direction concerns the detection of outlier voters, in the setting when there are multiple voting systems providing pairwise comparisons on the same set of players. For example, in Section 6 we considered the problem of rank aggregation, of combining information from multiple voters or rating systems, which provide incomplete and inconsistent rankings or pairwise comparisons between the same set of players or items. A natural question asks whether it is possible to derive a reputation for both raters and items, and identify outlier voters of low credibility. Furthermore, one can incorporate the credibility of the voters when estimating the reputation of the objects, as considered in the recent work of de Kerchove and Van Dooren [36], who introduce a class of iterative voting systems that assign a reputation to both voters and items, by applying a filter to the votes. Another potential application is to crowdsourcing [37], where crowd workers have different levels of competencies and are likely to produce noisy judgments regarding ranking of items.

Appendix D

D.5 Ranking via l_1 Angular Synchronization

Singer [8] and Yu [9] observe that embedding in the angular space is significantly more robust to outliers when compared to l_1 or l_2 based embeddings in the linear space. Since the existing spectral and SDP relaxations for the angular synchronization problem [8] approximate a least squares solution in the angular space, a natural question to ask is whether an l_1-based formulation of the angular synchronization problem could further improve the robustness to noise or sparsity of the measurement graph. A preliminary investigation of such an l_1 based approach for angular synchronization (which we did not further apply to the ranking problem) using the splitting orthogonality constraints approach of Lai and Osher [38] based on Bregman iterations, suggests that it often yields more accurate results than the existing spectral relaxation of its least-squares solution only for favorable noise regimes. At high levels of noise, the spectral relaxation performs better, and does so at a much lower computational cost. On a related note, we point out the SDP relaxation of the Least Unsquared Deviation formulation [18], for which the authors give exact recovery guarantees under the Erdös-Rényi Outliers model.

Appendix D

D.6 Ranking over the Sphere and Synchronization over SO(3)

Another research direction we find interesting concerns the possibility of casting the ranking problem as a synchronization problem over the group SO(3), as opposed to SO(2) which we considered in this paper. Consider for example a sphere centered at the origin, with points on the sphere corresponding to the players and the z-axis to their ranks, such that the closer a player is to the South Pole, the higher her or his ranking is. The extra dimension would facilitate the extraction of partial rankings, which one can collect by traversing along longitudinal lines and their immediate neighborhoods. In addition, one can also extract sets of players which are about the same rank (roughly speaking, they have about the same latitude), but cannot be positioned accurately with respect to each other (for example, a set of 10 players out of 100, are ranked in positions 81\hbox{-}90, but based on the data, their relative ordering cannot be established). In other words, if one considers a slice of the sphere (parallel to the xy-plane) all the players on or nearby it would have about the same rank. Perhaps most important, it would also be interesting to investigate the extent to which the extra dimension increases the overall robustness to noise of the algorithm. Note that, while for SO(2), finding an optimal circular permutation was required to eliminate the additional degree of freedom, for the case of SO(3) one could search for the best rotation around the origin of the sphere, such that the resulting rank offsets agree with the initial data as best as possible.

Appendix D

D.7 Robustness to Noise

Another future direction would be to aim for guarantees in terms of robustness to noise. We remark that existing theoretical guarantees from the recent literature on the group synchronization problem by various works of Singer et al. [8], [14], [26], trivially translate to lower bounds on the largest amount of noise permissible in the cardinal comparisons for an Erdös-Rényi outliers noise model, while still achieving an approximate solution that correlates well with the underlying ground truth ranking. However, note that a perfect or close-to-perfect recovery of the angles in the angular synchronization problem is not a necessary condition for a similar recovery of the underlying ground truth ranking, since it is enough that only the relative ordering of the angles is preserved. Furthermore, the main theorem of [14] provides an l_\infty error on the MLE estimator for angular synchronization, which, when compared to the minimal distance between two consecutive angles, could lead to theoretical bounds for exact recovery of the rankings.11

Another interesting possible direction is to investigate the robustness to noise of Sync-Rank in the setting of the, perhaps more realistic, multiplicative uniform noise model (28) and explain the empirical findings in Figs. 4a and 4b. Finally, the surprising performance (relative to other methods) of Sync-Rank based on rank offsets given by (24) and illustrated in Fig. 8 of Section 5.4 across 30 years of data, raises the question of how does a disc measurement graph affect the performance of the algorithms, i.e., in the (often very realistic) setting where nodes (players) lie on a one-dimensional line, and there exists an edge (i.e., match) between two nodes if and only if they are at most r units (ranks) apart.

Fig. 4.

The outcome of numerical experiments for the spectral relaxation of Sync-Rank, as we vary both the sparsity p in the measurement graph and the noise level \eta in the data, for both cardinal (left) and ordinal (right) measurements, and under both the MUN (top) and ERO (bottom) noise models, given by (28), respectively, (29) . The color of the heatmaps encodes the Kendall distance between the estimated and the ground truth ranking. We fix n=500 and average the results over 10 runs. For the ERO model, we overlay on the heatmap the curve given by the threshold probability for recovery O (\frac{1}{\sqrt{np}}) due to Singer [8], which, for cardinal measurements, is in excellent agreement with the empirical findings.

Fig. 5.

Comparison of all ranking algorithms in the case of cardinal comparisons for the MUN (n=200,p,\eta) and ERO (n=200,p,\eta) noise models.

Fig. 6.

Comparison of all methods for ranking with ordinal comparisons for the MUN (n=200,p,\eta) and ERO (n=200,p,\eta) noise models.

Fig. 7.

Number of upsets Q^{(u)} (lower is better) for the English Premier League 2011-2014 data set, where the input is given by C^{nw} (top left) C^{snw} (top right), C^{tpd} (bottom left), and C^{stpd} (bottom right).

Fig. 8.

Comparison of the methods on the NCAA 1985-2014 Basketball data set, in terms of the number of upsets (30), across the years (top) and averaged over the entire period (bottom) for both cardinal and ordinal measurements.

Fig. 9.

Comparison of results for the Rank-Aggregation problem with m=5 rating systems, for the Multiplicative Uniform Noise model MUN( n=100,p,\eta) (top) and the Erdös-Rényi Outliers model ERO ( n=100,p,\eta) (bottom). We average the results over 10 runs.

Fig. 10.

Left: Searching for the best cyclic shift of the free players, shown in red. The black nodes denote the anchor players, whose rank is known and stays fixed. Right: The ground truth ranking.

Fig. 11.

Performance of Sync-Rank with constraints, for the MUN(n=200,p,\eta ) noise model, with p=0.2 (left) and p=1 (right). SYNC-SDP-K denotes the SDP relaxation of Sync-Rank with K constraints.

Fig. 12.

Recovering planted rankings. The top (respectively, bottom) subplot corresponds to the recovered rankings of the non-\Lambda-players, respectively the \Lambda-players.

Fig. 13.

The residual matrices given by (36), for each of the six methods, for a random instance of the ensemble \mathcal {G}(n=250, \beta =0.3, \eta _1=0, \eta _2=1). The \Lambda-nodes correspond to the first \beta n = 75 positions.

Fig. 14.

The recovered rankings by each of the six methods, for a random instance of the ensemble \mathcal {G}(n=250, \beta =0.3, \eta _1=0, \eta _2=1), for which |\Lambda | = \beta n = 75. For each plot, the top subplot, respectively bottom subplot, corresponds to the recovered rankings of the (ground truth) \Lambda ^C-nodes, respectively the (ground truth) \Lambda-nodes. Note that, in practice, the set \Lambda is not known a-priori, but we use this information here only for the purpose of making the point that the synchronization-based method is able to perfectly preserve the relative ranking of the \Lambda-players. We exploit this phenomenon, and are able to recover the planted set \Lambda in a totally unsupervised manner.

Fig. 15.

Comparison of the methods in terms of the Jaccard Similarity Index (higher is better) between the recovered \hat{\Lambda} and the ground truth \Lambda, from the ensembles given by \mathcal {G}(n=250,p,\beta,\eta _1, \eta _2=1), for varying p = \lbrace 0.2, 0.5, 1 \rbrace and \eta _1 = \lbrace 0, 0.2, 0.4\rbrace. Experiments are averaged over 15 runs.

Fig. 16.

The Kendall Distance (lower is better) between the recovered ranking of the \hat{\Lambda}-nodes by each of the methods (from Fig. 15) and their ground truth values, for the ensemble given by \mathcal {G}(n=250,p,\beta,\eta _1, \eta _2=1), for varying p = \lbrace 0.2, 0.5, 1 \rbrace and \eta _1 = \lbrace 0, 0.2, 0.4\rbrace. Experiments are averaged over 15 runs.

Appendix D

D.8 The Minimum Linear Arrangement Problem

One could also investigate any potential connections between synchronization and SVD-based ranking using only magnitude information (as detailed in Section D.3) and the Minimum Linear Arrangement (MLA) problem, which is defined as follows. Given a graph G=(V,E) and positive edge weights w:E \;\mapsto \;\mathbb {R}_{+}, a linear arrangement is a permutation \pi: V \;\mapsto \;\lbrace 1,2,\ldots,n\rbrace. The cost of the arrangement is given by \begin{equation} \sum _{ij \in E} w(u,v) | \pi (u) - \pi (v) |. \end{equation}

View Source In the MLA problem, the goal is to find a linear arrangement of minimum cost a task known to be NP-complete. In the approximation theory literature, the work of Feige and Lee [39] provides an O(\sqrt{\log n} \log \log n)-approximation SDP-based algorithm for the MLA problem.

Appendix E English Premier League Standings

In Table 2 we show the rankings obtained by each algorithm for the English Premier League 2013-2014 data set, based on the input matrix C^{nw} which counts the number of net wins between a pair of teams. We also present the final official ranking at the end of the season, which are denoted by GT. The bottom row shows the correlation of the result obtained by each algorithm with the ground truth ranking.12

    References

    1.L. Page, S. Brin, R. Motwani and T. Winograd, "The PageRank citation ranking: Bringing order to the web", Proc. 7th Int. World Wide Web Conf., pp. 161-172, 1998.
    2.L. Xiong and L. Liu, "A reputation-based trust model for peer-to-peer e-commerce communities", Proc. IEEE Int. Conf. E-Commerce, pp. 275-284, 2003.
    3.P. Cremonesi, Y. Koren and R. Turrin, "Performance of recommender algorithms on top-n recommendation tasks", Proc. 4th ACM Conf. Recommender Syst., pp. 39-46, 2010.
    4.R. A. Bradley and M. E. Terry, "Rank analysis of incomplete block designs: I. The method of paired comparisons", Biometrika, vol. B-39, pp. 324-345, 1952.
    5.R. D. Luce, Individual Choice Behavior a Theoretical Analysis, Hoboken, NJ, USA:Wiley, 1959.
    6.H. Azari Soufiani, W. Chen, D. C. Parkes and L. Xia, "Generalized method-of-moments for rank aggregation", Proc. Annu. Conf. Neural Inf. Process. Syst., pp. 2706-2714, 2013.
    7.F. L. Wauthier, M. I. Jordan and N. Jojic, Proc. Int. Conf. Mach. Learn., pp. 109-117, 2013.
    8.A. Singer, "Angular synchronization by eigenvectors and semidefinite programming", Appl. Comput. Harmon. Anal., vol. 30, no. 1, pp. 20-36, 2011.
    9.S. Yu, IEEE Trans. Pattern Anal. Mach. Intell., pp. 158-173, Jan. 2012.
    10.B. Osting, J. Darbon and S. Osher, "Statistical ranking using the l1-norm on graphs", vol. 7, no. 3, pp. 907-926, 2013.
    11.X. Jiang, L.-H. Lim, Y. Yao and Y. Ye, "Statistical ranking and combinatorial Hodge theory", Math. Programm., vol. 127, no. 1, pp. 203-244, 2011.
    12.F. Fogel, A. d’Aspremont and M. Vojnovic, "Serialrank: Spectral ranking using seriation", Proc. Adv. Neural Inf. Process. Syst. 27, pp. 900-908, 2014.
    13.S. Negahban, S. Oh and D. Shah, "Iterative ranking from pair-wise comparisons", Proc. Adv. Neural Inf. Process. Syst. 25, pp. 2474-2482, 2012.
    14.A. Bandeira, N. Boumal and A. Singer, "Tightness of the maximum likelihood semidefinite relaxation for angular synchronization", arXiv preprint arXiv:1411.3272, 2014.
    15.D. Gleich, "SVD based term suggestion and ranking system", Proc. IEEE Int. Conf. Data Mining, pp. 391-394, 2004.
    16.A. N. Hirani, K. Kalyanaraman and S. Watts, "Least squares ranking on graphs", arXiv preprint arXiv:1011.1716, 2010.
    17.D. Féral and S. Péché, "The largest eigenvalue of rank one deformation of large Wigner matrices", Commun. Math. Phys., vol. 272, no. 1, pp. 185-228, 2007.
    18.L. Wang and A. Singer, "Exact and stable recovery of rotations for robust synchronization", Inf. Inference, vol. 2, no. 2, pp. 145-193, 2013.
    19.S. Zhang and Y. Huang, "Complex quadratic optimization and semidefinite programming", SIAM J. Optim., vol. 16, no. 3, pp. 871-890, 2006.
    20.M. Cucuringu, Y. Lipman and A. Singer, "Sensor network localization by eigenvector synchronization over the Euclidean group", ACM Trans. Sens. Netw., vol. 8, no. 3, pp. 19:1-19:42, Aug. 2012.
    21.A. Singer and H.-T. Wu, "Vector diffusion maps and the connection Laplacian", Commun. Pure Appl. Math., vol. 65, no. 8, pp. 1067-1144, 2012.
    22.L. Vandenberghe and S. Boyd, "Semidefinite programming", SIAM Rev., vol. 38, pp. 49-95, 1994.
    23.S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers", Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1-122, Jan. 2011.
    24.Z. Wen, D. Goldfarb and K. Scheinberg, "Block coordinate descent methods for semidefinite programming", Handbook on Semidefinite Conic and Polynomial Optimization, pp. 533-564, 2012.
    25.N. Boumal, "A Riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints", arXiv preprint arXiv:1506.00575, 2015.
    26.A. S. Bandeira, A. Singer and D. A. Spielman, "A Cheeger inequality for the Graph Connection Laplacian", SIAM J. Matrix Anal. Appl., vol. 34, no. 4, pp. 1611-1630, 2013.
    27.M. Cucuringu, A. Singer and D. Cowburn, Inf. Inference, pp. 21-67, 2012.
    28.M. Cucuringu, I. Koutis and S. Chawla, "Scalable constrained clustering: A generalized spectral method", Artif. Intell. Statist. Conf., 2016.
    29.J. R. Lee, S. Oveis Gharan and L. Trevisan, "Multi-way spectral partitioning and higher-order Cheeger inequalities", Proc. 44th Annu. ACM Symp. Theory Comput., pp. 1117-1130, 2012.
    30.U. Feige and M. Seltzer, "On the densest k-subgraph problem", 1997.
    31.N. Alon, M. Krivelevich and B. Sudakov, Proc. 9th Annu. ACM-SIAM Symp. Discrete Algorithms, pp. 594-598, 1998.
    32.J. E. Atkins, E. G. Boman and B. Hendrickson, "A spectral algorithm for seriation and the consecutive ones problem", SIAM J. Comput., vol. 28, pp. 297-310, 1998.
    33.F. Benaych-Georges and R. R. Nadakuditi, "The singular values and vectors of low rank perturbations of large rectangular random matrices", J. Multivariate Anal., vol. 111, pp. 120-135, 2012.
    34.X. Wang, B. Qian and I. Davidson, "On constrained spectral clustering and its applications", Data Mining Knowl. Discovery, vol. 28, no. 1, pp. 1-30, 2014.
    35.F. Fogel, R. Jenatton, F. Bach and A. d’Aspremont, "Convex relaxations for permutation problems", Proc. Adv. Neural Inf. Process. Syst., pp. 1016-1024, 2013.
    36.C. de Kerchove and P. Van Dooren, "Iterative filtering in reputation systems", SIAM J. Matrix Anal. Appl., vol. 31, no. 4, pp. 1812-1834, Mar. 2010.
    37.D. R. Karger, S. Oh and D. Shah, Proc. Adv. Neural Inf. Process. Syst. 24, pp. 1953-1961, 2011.
    38.R. Lai and S. Osher, "A splitting method for orthogonality constrained problems", J. Sci. Comput., vol. 58, no. 2, pp. 431-449, Feb. 2014.
    39.U. Feige and J. R. Lee, "An improved approximation ratio for the minimum linear arrangement problem", Inf. Process. Lett., vol. 101, no. 1, pp. 26-29, Jan. 2007.