Multiclass MinMax Rank Aggregation
Pan Li and Olgica Milenkovic
ECE Department, University of Illinois at Urbana-Champaign
Email: [email protected], [email protected]
arXiv:1701.08305v1 [cs.LG] 28 Jan 2017

Abstract: We introduce a new family of minmax rank aggregation problems under two distance measures, the Kendall $\tau$ and the Spearman footrule. As the problems are NP-hard, we proceed to describe a number of constant-approximation algorithms for solving them. We conclude with illustrative applications of the aggregation methods on the Mallows model and genomic data.

I. INTRODUCTION

Rankings, a special form of ordinal data, have received significant attention in the machine learning community as they arise in a number of important application domains, such as recommender systems, social voting and product placement platforms. Of particular importance are rankings of the form of linear orders (permutations) and partial rankings (weak orders), which are frequently obtained through conversion from ratings. One of the main processing tasks for rankings is rank aggregation, which often involves evaluating the median of a set of permutations or partial rankings under a suitably chosen distance function [2], [4], [7], [9], [11], [12], [16].

The median rank aggregation problem under the Kendall $\tau$ distance was introduced by Kemeny [11], and was proved to be NP-hard by Bartholdi et al. [4]. A number of approximation algorithms for the problem have been described in [2], mostly pertaining to permutations; a corresponding PTAS (polynomial time approximation scheme) was proposed in [12]. In the context of partial ranking aggregation, known solutions include the results of [1], [10]. Median aggregation under other distance functions has received less attention, one notable exception being the Spearman rank aggregation problem [7], which is known to provide a constant approximation for Kendall $\tau$ aggregation using a polynomial time algorithm based on weighted bipartite matching [9].

We propose to investigate a broad new family of rank aggregation problems in which the median is replaced by a minmax type of function and where the rankings are grouped in classes. More precisely, assume that there are $C > 1$ different classes of rankings and let $\Sigma^k = \{\sigma^k_1, \sigma^k_2, \ldots, \sigma^k_{m_k}\}$ be the set of $m_k = |\Sigma^k|$ rankings belonging to the class labeled by $k \in [C]$. Our minmax rank aggregation problem may be succinctly described as follows: Output a ranking $\pi$ that agrees in the minmax sense with the rankings belonging to the different classes. Rigorously, we seek to solve the following optimization problem:

$$\text{MinMax:}\quad \min_{\pi}\,\max_{k}\ \lambda_k\, d(\pi,\Sigma^k),$$

where $\lambda_k > 0$ represents the cost of violating the agreement with rankings in class $k$. In the above formulation, $d(\pi,\Sigma)$ stands for a distance between a ranking or partial ranking $\pi$ and a set of rankings $\Sigma$, and it may be chosen to be of the form of a median distance (which equals the sum of the distances between $\pi$ and the elements of $\Sigma^k$, normalized by $|\Sigma^k|$ in Definition II.1) or a minimum distance (which equals the smallest distance between $\pi$ and an element in $\Sigma^k$). The above described MinMax problem is motivated by a number of applications in which classes of rankings arise due to different ranking criteria or properties of the ranking entities (social platforms) or due to prior knowledge of different similarity degrees in groups of rankings (genome evolution). The minmax criterion is typically used when trying to ensure that the aggregate violates each vote (class of votes) to roughly the same extent.

We start our analysis with the MinMax problem with $C = 1$ and under the median and minimum distance, and then proceed to study the problem for the case of arbitrary values of $C$ and $m_k$, $k = 1,\ldots,C$. For both the Kendall $\tau$ and the Spearman footrule, in the median and minimum distance setting, the MinMax problems may be shown to be NP-hard by using the corresponding results of [3]. In particular, the work in [3] outlines a general framework for proving NP-hardness results for the median, single class min-max-aggregation problem under different ranking distances. Nevertheless, only a handful of approximation algorithms were proposed even for this basic min-max-aggregation form: To the best of our knowledge, the only provable algorithm for the single class MinMax under the minimum distance measure was provided in [3]. The algorithm takes the form of the well studied "pick-a-permutation" method, and tends to perform poorly in practice.

The main results of our work include families of constant approximation algorithms for the new, general family of multiclass MinMax problems, both under the median and minimum class distance, evaluated using the Kendall $\tau$ and Spearman footrule. Furthermore, we illustrate the use of the new aggregation paradigm on the problem of finding an ancestral genome arrangement for mitochondrial DNA under the tandem duplication model for genomes [6].

II. MATHEMATICAL PRELIMINARIES

Let $S$ denote a set of $n$ elements, which without loss of generality we set to $[n] \equiv \{1,2,\ldots,n\}$. A ranking is an ordering of a subset of elements $Q$ of $[n]$ according to a predefined rule. When $Q=[n]$, the resulting order is referred to as a permutation. When the rankings include ties, they are referred to as partial rankings [10].

More precisely, a permutation is a bijection $\sigma : [n]\to[n]$, and the set of permutations over $[n]$ forms the symmetric group of order $n!$, denoted by $S_n$. For any $\sigma \in S_n$ and $x\in[n]$, $\sigma(x)$ denotes the rank (position) of the element $x$ in $\sigma$. We say that $x$ is ranked higher than $y$ (ranked lower than $y$) iff $\sigma(x)<\sigma(y)$ ($\sigma(x)>\sigma(y)$). The inverse of a permutation $\sigma$ is denoted by
$\sigma^{-1} : [n]\to[n]$. Clearly, $\sigma^{-1}(t)$ represents the element ranked at position $t$ in $\sigma$. Similarly, partial rankings [10] represent a mapping over $[n]$ in which there may exist two elements $x\neq y$ such that $\sigma(x) = \sigma(y)$. It is common to use $\sigma(x)$ to denote the position of the element $x$ in the partial ranking $\sigma$, and to define it as

$$\sigma(x) \triangleq |\{y \in[n]: y \text{ is ranked higher than } x\}| + \tfrac{1}{2}\bigl(|\{y \in[n]: y \text{ is tied with } x\}|+1\bigr). \tag{1}$$

A number of distance functions between rankings were proposed in the literature [7], [10], [14]. One distance function counts the number of adjacent transpositions needed to convert a permutation into another. Adjacent transpositions generate $S_n$, i.e., any permutation $\pi \in S_n$ can be converted into another permutation $\sigma \in S_n$ through a sequence of adjacent transpositions [14]. The smallest number of adjacent transpositions needed to convert a permutation $\pi$ into another permutation $\sigma$ is termed the Kendall $\tau$ distance, denoted by $d_\tau(\pi,\sigma)$. The Kendall $\tau$ distance between two permutations $\pi$ and $\sigma$ over $[n]$ also equals the number of pairwise inversions of elements of the two permutations:

$$d_\tau(\sigma,\pi)=|\{(x,y): \pi(x)>\pi(y),\ \sigma(x)<\sigma(y)\}|. \tag{2}$$

Another positional distance measure is the Spearman footrule,

$$d_S(\sigma,\pi)= \sum_{x\in[n]} |\sigma(x)-\pi(x)|.$$

It can be shown that $d_\tau(\pi,\sigma)\le d_S(\pi,\sigma)\le 2\,d_\tau(\pi,\sigma)$ [7].

One may similarly define a generalization of the Kendall $\tau$ distance for partial rankings $\pi$ and $\sigma$ over the set $[n]$. This distance is known as the Kemeny distance, and equals

$$d_K(\pi,\sigma)=|\{(x,y): \pi(x)>\pi(y),\ \sigma(x)<\sigma(y)\}| + \tfrac{1}{2}\,|\{(x,y): \pi(x)=\pi(y),\ \sigma(x)>\sigma(y) \ \text{or}\ \pi(x)>\pi(y),\ \sigma(x)=\sigma(y)\}|. \tag{3}$$

The Spearman footrule analogue for partial rankings [10] equals the sum of the absolute differences between "positions" of elements in the partial rankings,

$$d_{prS}(\sigma,\pi)= \sum_{x\in[n]} |\sigma(x)-\pi(x)|,$$

where positions are as defined in (1). The Spearman footrule distance for partial rankings is a 2-approximation for the Kemeny distance [10].

The notion of a distance between two rankings has an important extension in terms of a distance between a ranking and a set of rankings, which we refer to as rank-set distances. We focus our attention on two types of rank-set distances, defined below. For compactness, we use $\star$ to denote an arbitrary distance on pairs of rankings, but focus our attention throughout the paper on $\star\in\{\tau,S,K,prS\}$.

Definition II.1. Suppose that $\pi$ is a ranking and that $\Sigma$ is a set of rankings. Given a distance between two rankings $d_\star(\cdot,\cdot)$, the median-$\star$ distance (med$-\star$) between $\pi$ and $\Sigma$ equals

$$d_{\mathrm{med}-\star}(\pi,\Sigma)= \frac{1}{|\Sigma|}\sum_{\sigma\in\Sigma} d_\star(\pi,\sigma).$$

Definition II.2. Suppose that $\pi$ is a ranking and that $\Sigma$ is a set of rankings. Given a distance between two rankings $d_\star(\cdot,\cdot)$, the min-$\star$ distance (min$-\star$) between $\pi$ and $\Sigma$ is defined as

$$d_{\mathrm{min}-\star}(\pi,\Sigma)=\min_{\sigma\in\Sigma} d_\star(\pi,\sigma).$$

We recall that the focal problem of this work is to find constant approximation algorithms for the MinMax rank aggregation problem, which reads as

$$\text{MinMax:}\quad \min_{\pi}\,\max_{k}\ \lambda_k\, d(\pi,\Sigma^k),$$

where $d(\pi,\Sigma^k)$ is a med$-\star$ or min$-\star$ distance, with $\star \in \{\tau,S,K,prS\}$. In our future analysis we use $\lambda^* \triangleq \max_k \lambda_k$ and $M \triangleq \{k : \lambda_k = \lambda^*\}$. Furthermore, we let $\pi^*$ denote the argument of the optimal solution of the MinMax problem and let $W =\max_k \lambda_k\, d(\pi^*,\Sigma^k)$.

III. APPROXIMATE MINMAX AGGREGATION

As previously pointed out, the MinMax problem under both the med$-\star$ and min$-\star$ distances can be shown to be NP-hard using the results of [3], which established hardness for the special case $m_k = 1$ and $d(\cdot,\cdot)$ a pseudometric. We hence focus on devising approximation algorithms for the MinMax problem.

A. Permutations

We first consider ordinal data of the form of permutations. We show that a simple algorithm, which we term Pick-Rnd-Perm, can achieve a 2-approximation in expectation for the case of the med$-\star$ problem whenever $d_\star(\cdot,\cdot)$ is a pseudometric. Then, for $\star\in\{\tau,S\}$, we describe two 2-approximation algorithms that use a combination of convex optimization and rounding procedures and offer significantly better empirical performance than random selection. Finally, we describe a 2-approximation algorithm for the min$-\star$ problems when $d_\star(\cdot,\cdot)$ is a pseudometric. The selection algorithm essentially transforms the min$-\star$ problem into a med$-\star$ problem: Thus, the algorithms developed for approximating multiclass med$-\star$ problems may be used to approximate corresponding instances of the min$-\star$ problem.

The Pick-Rnd-Perm Algorithm. Pick a permutation $\pi$ from $\cup_{k\in M} \Sigma^k$ uniformly at random.

Theorem III.1. For the $d_{\mathrm{med}-\star}(\cdot,\cdot)$ distance, where $d_\star$ is a pseudometric, the Pick-Rnd-Perm algorithm produces a 2-approximation of the med$-\star$ problem.

Proof: For a given $k$,

$$\lambda_k\, d_{\mathrm{med}-\star}(\pi,\Sigma^k)= \frac{\lambda_k}{m_k}\sum_{i=1}^{m_k} d_\star(\pi,\sigma^k_i) \le \lambda_k\bigl[d_\star(\pi,\pi^*)+d_{\mathrm{med}-\star}(\pi^*,\Sigma^k)\bigr] \le \lambda^*\, d_\star(\pi,\pi^*)+W.$$
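The first inequality in the chain above follows by applying the triangle inequality of the pseudometric $d_\star$ to each ranking of the class and averaging. A short derivation, writing $\sigma_i^k$ for the rankings in $\Sigma^k$:

```latex
\begin{align*}
d_{\mathrm{med}-\star}(\pi,\Sigma^k)
  &= \frac{1}{m_k}\sum_{i=1}^{m_k} d_\star(\pi,\sigma_i^k) \\
  &\le \frac{1}{m_k}\sum_{i=1}^{m_k}\bigl[d_\star(\pi,\pi^*) + d_\star(\pi^*,\sigma_i^k)\bigr] \\
  &= d_\star(\pi,\pi^*) + d_{\mathrm{med}-\star}(\pi^*,\Sigma^k).
\end{align*}
```

Multiplying by $\lambda_k$ and using $\lambda_k \le \lambda^*$ together with $\lambda_k\, d_{\mathrm{med}-\star}(\pi^*,\Sigma^k) \le W$ gives the second inequality.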
By calculating the expectation, we obtain

$$E\bigl[\max_k \lambda_k\, d_{\mathrm{med}-\star}(\pi,\Sigma^k)\bigr]\le \lambda^*\, E[d_\star(\pi,\pi^*)]+W \le 2W.$$

The last step holds because $\pi$ is drawn uniformly from $\cup_{k\in M}\Sigma^k$, so that $\lambda^* E[d_\star(\pi,\pi^*)]$ is a weighted average of the terms $\lambda_k\, d_{\mathrm{med}-\star}(\pi^*,\Sigma^k)$, $k\in M$, each of which is at most $W$.

Clearly, random selection may be improved by picking the optimal permutation from $\cup_{k\in M} \Sigma^k$ instead. We term this approach Pick-Opt-Perm. Although the Pick-Rnd (Opt)-Perm algorithms are exceptionally simple and offer a 2-approximation to the optimal solution, they have a number of drawbacks, including the fact that the aggregate is a given ranking from the clusters, which violates fairness rules of aggregates, and that their empirical performance is typically very poor. To mitigate these problems, we propose more sophisticated aggregation algorithms for both the med$-\tau$ and med$-S$ problems.

Case I: $d = d_{\mathrm{med}-\tau}$. For $C = 1$, a well known method termed random pivoting, proposed by Ailon et al. [1], [2], offers a 2-approximation in expectation for both the permutation and partial rank aggregation problem. In random pivoting, at each step, one element in the ranking is chosen uniformly at random and the remaining elements are partitioned based on the pairwise comparison with the pivot element. However, for the case of the MinMax problem with $C >1$, random pivoting may be inadequate: The difficulty lies in the fact that rankings in different classes may lead to widely disparate pairwise pivot comparisons. Another problem in this context is that while one may achieve a constant approximation in expectation for each class individually, the largest cost among classes may not be bounded due to the exchange of the expectation and maximization operators. Therefore, instead of pivoting, one must resort to a different approach to the problem. Our approach is to deterministically round the fractional solution of a specific convex optimization problem. The deterministic rounding procedure is motivated by ideas in [16].

Let $w^k_{xy} \triangleq \frac{\lambda_k}{m_k} \sum_{i=1}^{m_k} 1\{\sigma^k_i(x) < \sigma^k_i(y)\}$, where $1$ stands for the indicator function, and let $w^k_{xx} = 0$ for all $x,k$. For a given ranking $\pi$, also define the variables $u_{xy} \triangleq 1\{\pi(x)<\pi(y)\}$. The MinMax problem may be stated as

$$\begin{aligned}
\min_{u,q}\quad & q \\
\text{s.t.}\quad & \sum_{x,y\in[n]} w^k_{xy}\,u_{yx} \le q \quad \text{for all } k \in[C], \\
& u_{xy} \in\{0,1\}, \\
& u_{xy}+u_{yx} =1 \quad \text{for all } x,y \in[n],\ x\neq y, \\
& u_{xy}+u_{yz}+u_{zx} \ge 1 \quad \text{for all distinct } x,y,z \in[n].
\end{aligned} \tag{4}$$

Note that if the rankings are permutations, then $w^k_{xy}+w^k_{yx} = \lambda_k$, which is a value that only depends on $k$.

The above integer program may be relaxed to a linear program by allowing $u_{xy}$ to take fractional values. Upon solving the linear program, one needs to round the values of $u_{xy}$. The next rounding procedure guarantees a 2-approximation.

Let $h_{xy} =1\{u_{xy}>1/2\}$ if $x>y$, and $h_{xy} =1-h_{yx}$ if $x<y$. Let $v$ be a pivoting element for the rounding procedure and use $P_v(u)$ to denote the set of pairs of elements (excluding $v$) whose positions are determined by pivoting on $v$. Define

$$P_v(u)=\{(x,y): x,\, y \in V_v,\ h_{vx}h_{yv} =1\},$$

$$A^k_v(u)= \sum_{x\in V_v}\bigl(h_{xv}w^k_{vx}+h_{vx}w^k_{xv}\bigr)+ \sum_{(x,y)\in P_v} w^k_{xy},$$

$$B^k_v(u)= \sum_{x\in V_v}\bigl(u_{xv}w^k_{vx}+u_{vx}w^k_{xv}\bigr)+ \sum_{(x,y)\in P_v} \bigl(u_{xy}w^k_{yx}+u_{yx}w^k_{xy}\bigr).$$

The rounding procedure makes iterative calls to the following routine.

mmKT-Conv $(V,u)$
1: Choose the pivot $v\in V$ according to $v=\arg\min_a \max_k \frac{A^k_a(u)}{B^k_a(u)}$.
2: Set $V_L=\emptyset$, $V_R=\emptyset$.
3: For all $x\in V_v$: If $h_{xv}=1$, $V_L\leftarrow V_L\cup\{x\}$. Otherwise, $V_R\leftarrow V_R\cup\{x\}$.
4: Return [mmKT-Conv$(V_L,u)$, $v$, mmKT-Conv$(V_R,u)$].

Theorem III.2. The iterative application of the mmKT-Conv algorithm outputs a permutation with at most twice the cost of the optimal solution of the linear program (4).

At each iteration of rounding, $A^k_v(u)$ denotes the cost of rounding incurred by the class $k$ of rankings, while $B^k_v(u)$ denotes the associated cost of the linear program for class $k$. Hence, the goal is to prove that for the given choice of the pivot $v$, we have $A^k_v(u) \le 2B^k_v(u)$ for all $k \in [C]$. Suppose that $k'$ is the index of the class that maximizes $\frac{A^k_v(u)}{B^k_v(u)}$ at the first step of mmKT-Conv. Then, it suffices to show that $A^{k'}_v(u)\le 2B^{k'}_v(u)$. This result is a corollary of the following lemma.

Lemma III.3. $\sum_{v\in V} A^k_v(u)\le 2 \sum_{v\in V} B^k_v(u)$, $\forall k \in[C]$.

Proof: To prove the claimed result, it suffices to prove that for any two distinct elements $x,y$, one has

$$h_{xy}w_{yx}+h_{yx}w_{xy} \le 2\,(u_{xy}w_{yx}+u_{yx}w_{xy}), \tag{5}$$

and for any triple of distinct elements $x,y,z$, one has

$$\sum h_{xz}h_{zy}w_{yx} \le 2\sum h_{xz}h_{zy}\,(u_{xy}w_{yx}+u_{yx}w_{xy}), \tag{6}$$

where the summation is circular over all permutations of $x,y,z$. Both summations are taken over all possible permutations of the two (three) elements in the argument.

The inequality (5) is easy to prove: Suppose that $h_{xy} =1$. Then the sum on the left hand side equals $w_{yx} \le 2u_{xy}w_{yx}$, which is bounded by the right hand side expression. To prove the inequality (6), consider the six variables associated with $x,y,z$, namely $h_{xy},h_{yx},h_{xz},h_{zx},h_{yz},h_{zy}$. These variables may be partitioned into two classes, $\{h_{xy},h_{zx},h_{yz}\}$ and $\{h_{yx},h_{xz},h_{zy}\}$. There are at least three variables that are 0's. Without loss of generality, suppose that the class $\{h_{xy},h_{zx},h_{yz}\}$ contains at least two 0's.

Case 1: Assume that $h_{xy},h_{zx},h_{yz} = 0$. Then, the difference of the left and right hand side of the inequality under consideration equals

$$(1-2u_{xy})w_{yx}+(1-2u_{yz})w_{zy} +(1-2u_{zx})w_{xz} - 2u_{xz}w_{zx} -2u_{yx}w_{xy} -2u_{zy}w_{yz}.$$
The claimed result then follows from observing that $(1-2u_{xy})w_{yx} \le u_{yx}(w_{yz}+w_{zx})$.

Case 2: Assume that $h_{xy} = 1$, $h_{zx},h_{yz} = 0$. The left hand side equals $w_{yx} \le 2u_{xy}w_{yx}$, which is clearly bounded from above by the right hand side expression as $h_{xy} =1$.

Case II: $d = d_{\mathrm{med}-S}$. When $C = 1$, the MinMax aggregation problem may be solved in polynomial time via weighted bipartite matching [9]. However, when $C > 1$, the problem is hard even if $m_k =1$ for all $k$ [3].

Step 1: If we remove the integral constraint on the position of elements in $\pi$, the optimization problem of interest is convex and may be solved efficiently:

$$u^* =\arg\min_{u\in \mathbb{R}^n}\,\max_{k}\ \frac{\lambda_k}{m_k} \sum_{g=1}^{m_k}\|u-\sigma^k_g\|_1, \tag{7}$$

where $\|u-\sigma^k_g\|_1 = \sum_{h\in[n]} |u(h)-\sigma^k_g(h)|$.

Step 2 (mmSP-Conv): We assign positions to elements according to the fractional solution $u^*$ as follows. If $u^*(x) < u^*(y)$, we let $\pi(x)<\pi(y)$ for any two distinct elements $x, y$, with ties broken randomly.

Theorem III.4. mmSP-Conv rounding increases the cost of the convex optimization problem (7) by at most a factor of two.

Proof: First, we claim that the output of mmSP-Conv, denoted by $\pi_S$, is in $\Pi' \triangleq \{\pi' \in S_n : \|u^*-\pi'\|_1 =\min_{\pi}\|u^*-\pi\|_1\}$. This follows since for any ranking $\pi$, if two elements $x,y\in[n]$ satisfy $\pi(x)>\pi(y)$ and $u^*(x)<u^*(y)$, we may transpose $x$ and $y$ in $\pi$ to obtain a smaller $\|u^*-\pi\|_1$. Second, for an arbitrary permutation $\sigma$, we have

$$\|\pi_S -\sigma\|_1 \le \|\pi_S -u^*\|_1+\|\sigma-u^*\|_1 \le 2\|\sigma-u^*\|_1.$$

The claim follows by setting $\sigma =\sigma^k_i$, $i\in[m_k]$, $k \in[C]$.

Note that the integrality gap of the problems (4) and (7) is 2, as one may consider two equally weighted classes, each of which contains one single ranking, $(1,2,3,4,\ldots)$ and $(2,1,3,4,\ldots)$, respectively. Hence, the best approximation constant via the use of $u$ cannot be less than 2, which implies that the proposed rounding is optimal. One may expect to achieve a smaller approximation constant by outputting the better of the two results produced by Pick-Rnd-Perm and mmKT(SP)-Conv. This approach will be discussed in the full version of the paper.

We introduce next the min-Pick-Perm algorithm for solving the $d_{\mathrm{min}-\star}$ problem.

min-Pick-Perm $(\Sigma^1,\Sigma^2,\ldots,\Sigma^C)$, $(\lambda_1,\lambda_2,\ldots,\lambda_C)$
1: For each $k\in [C]$ and each ranking $\sigma^k_i\in\Sigma^k$,
2: Compute $\mathrm{Score}_{ki} =\max_{j\in [C]/\{k\}} \lambda_j \min_{\sigma^j_s\in\Sigma^j} d_\star(\sigma^k_i,\sigma^j_s)$.
3: Let $(i^*,k^*)=\arg\min_{(i,k)} \mathrm{Score}_{ki}$. Output $\pi=\sigma^{k^*}_{i^*}$.

Theorem III.5. If $d_\star$ is a pseudometric, then min-Pick-Perm is a 2-approximation algorithm for the min$-\star$ problems.

Proof: By the definition of the min$-\star$ problem, each class contains at least one permutation, which without loss of generality we denote by $\sigma^k_1 \in \Sigma^k$, $k \in [C]$, that satisfies $\lambda_k\, d_\star(\pi^*,\sigma^k_1)\le W$. As $d_\star$ is a pseudometric, we have

$$\frac{\lambda_k\lambda_j}{\lambda_k+\lambda_j}\, d_\star(\sigma^k_1,\sigma^j_1)\le W.$$

Next, choose an arbitrary $\tilde{k} \in M$ and let $k' = \arg\max_{j\in[C]/\{\tilde{k}\}}\lambda_j\, d_\star(\sigma^{\tilde{k}}_1,\sigma^j_1)$. Then,

$$\min_{k\in[C]}\ \max_{j\in[C]/\{k\}} \lambda_j\, d_\star(\sigma^k_1,\sigma^j_1)\le \lambda_{k'}\, d_\star(\sigma^{\tilde{k}}_1,\sigma^{k'}_1) \le \frac{2\lambda_{\tilde{k}}\lambda_{k'}}{\lambda_{\tilde{k}}+\lambda_{k'}}\, d_\star(\sigma^{\tilde{k}}_1,\sigma^{k'}_1)\le 2\max_{k,j} \frac{\lambda_k\lambda_j}{\lambda_k+\lambda_j}\, d_\star(\sigma^k_1,\sigma^j_1).$$

Moreover, the output $\pi$ of min-Pick-Perm satisfies

$$\max_{j\in[C]} \lambda_j\, d_{\mathrm{min}-\star}(\pi,\Sigma^j)= \min_{k\in[C]}\ \min_{\sigma^k_i\in\Sigma^k}\ \max_{j\in[C]/\{k\}}\ \lambda_j \min_{\sigma^j_g\in\Sigma^j} d_\star(\sigma^k_i,\sigma^j_g) \le \min_{k\in[C]}\ \max_{j\in[C]/\{k\}} \lambda_j\, d_\star(\sigma^k_1,\sigma^j_1).$$

The result follows by combining the above inequalities.

Remark III.1. Let $(i^*,k^*)$ be the optimal indices generated by min-Pick-Perm. Define $\tilde{\Sigma}^{k^*} =\{\sigma^{k^*}_{i^*}\}$ and let

$$\tilde{\Sigma}^j =\{\sigma \in\Sigma^j : d_\star(\sigma^{k^*}_{i^*},\sigma)=d_{\mathrm{min}-\star}(\sigma^{k^*}_{i^*},\Sigma^j)\}$$

for $j \in[C]/\{k^*\}$. A $c$-approximate solution for the med$-\star$ problem with input $\{\tilde{\Sigma}^k\}_{k\in [C]}$, denoted by $\pi'$, satisfies

$$\max_{j\in[C]}\lambda_j\, d_{\mathrm{min}-\star}(\pi',\Sigma^j)\le \max_{j\in[C]}\lambda_j\, d_{\mathrm{med}-\star}(\pi',\tilde{\Sigma}^j) \le c\,\min_{\pi}\max_{j\in[C]}\lambda_j\, d_{\mathrm{med}-\star}(\pi,\tilde{\Sigma}^j)\le c\max_{j\in[C]}\lambda_j\, d_{\mathrm{med}-\star}(\sigma^{k^*}_{i^*},\tilde{\Sigma}^j) \le 2cW.$$

Hence, $\pi'$ is a $2c$-approximation for the original min$-\star$ problem. Therefore, convex optimization and rounding can be used on the med$-\star$ problem. We refer to these adapted algorithms as min-mmKT-Conv and min-mmSP-Conv.

B. Partial rankings

All the algorithms proposed for permutation aggregation generalize to partial ranking aggregation. One may easily show that as long as the distance $d_\star$ defined for partial rankings is a pseudometric (e.g., $\star \in \{K,prS\}$), the 2-approximation guarantees for all previous methods hold. To get a fractional solution in the program of mmKT-Conv, we have to change the constraint (4) to

$$\frac{1}{2}T_k+ \sum_{x,y\in[n]} w^k_{xy}\,u_{yx} \le q \quad \text{for all } k \in[C],$$

$$T_k = \frac{1}{m_k} \sum_{i=1}^{m_k}\ \sum_{1\le x<y\le n} 1\bigl(\sigma^k_i(x)=\sigma^k_i(y)\bigr),$$

which does not depend on the type of output ranking. Also, note that $w^k_{xy}$ for partial rankings does not satisfy the equality $w^k_{xy}+w^k_{yx} =\lambda_k$, although the triangle inequality $w_{xy}+w_{yz} \ge w_{xz}$ still holds. As the proof of Theorem III.2 only requires the latter inequality, the same rounding procedure offers a 2-approximation. Also, in the optimization problem (7) one has to use the definition (1) of $\sigma(x)$ for partial rankings.
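The distances of Section II and the selection-based algorithms of Section III-A can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that a permutation is stored as a tuple `s` in which `s[x]` is the rank of element `x`, all function names are hypothetical, and ties in the sorting step of mmSP-Conv are broken by element index instead of randomly.

```python
from itertools import combinations

def kendall_tau(s, p):
    """Number of element pairs ordered differently by s and p, as in (2)."""
    return sum(1 for x, y in combinations(range(len(s)), 2)
               if (s[x] - s[y]) * (p[x] - p[y]) < 0)

def footrule(s, p):
    """Spearman footrule: sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(s, p))

def d_med(dist, pi, cls):
    """Median rank-set distance of Definition II.1 (average over the class)."""
    return sum(dist(pi, s) for s in cls) / len(cls)

def d_min(dist, pi, cls):
    """Minimum rank-set distance of Definition II.2."""
    return min(dist(pi, s) for s in cls)

def minmax_cost(rank_set_dist, dist, pi, classes, lambdas):
    """MinMax objective: max over classes of lambda_k * d(pi, Sigma^k)."""
    return max(l * rank_set_dist(dist, pi, cls)
               for l, cls in zip(lambdas, classes))

def pick_opt_perm(dist, classes, lambdas):
    """Pick-Opt-Perm: best input ranking among the classes with lambda_k = lambda*."""
    lam_star = max(lambdas)
    candidates = [s for l, cls in zip(lambdas, classes) if l == lam_star
                  for s in cls]
    return min(candidates,
               key=lambda s: minmax_cost(d_med, dist, s, classes, lambdas))

def min_pick_perm(dist, classes, lambdas):
    """min-Pick-Perm: input ranking minimizing Score_ki over all classes."""
    best, best_score = None, float("inf")
    for k, cls in enumerate(classes):
        for s in cls:
            score = max((lambdas[j] * d_min(dist, s, classes[j])
                         for j in range(len(classes)) if j != k), default=0.0)
            if score < best_score:
                best, best_score = s, score
    return best

def round_by_sorting(u):
    """Sorting step of mmSP-Conv: smaller u[x] gets a smaller rank.
    Assumption: ties are broken by element index, not randomly."""
    order = sorted(range(len(u)), key=lambda x: (u[x], x))
    ranks = [0] * len(u)
    for position, x in enumerate(order, start=1):
        ranks[x] = position
    return tuple(ranks)

classes = [[(1, 2, 3), (1, 3, 2)], [(3, 2, 1)]]
lambdas = [2, 1]
aggregate = pick_opt_perm(kendall_tau, classes, lambdas)
print(aggregate, minmax_cost(d_med, kendall_tau, aggregate, classes, lambdas))
```

Both selection routines only ever return one of the input rankings, which is the drawback that motivates the convex-optimization methods; `round_by_sorting` covers only the rounding step and expects a fractional solution of (7) computed elsewhere.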
IV. SIMULATIONS

We compare the performance of three families of algorithms: Convex optimization procedures with rounding (mmKT-Conv, mmSP-Conv, min-mmKT-Conv, min-mmSP-Conv), permutation selection (Pick-Rnd-Perm, Pick-Opt-Perm, min-Pick-Perm) and algorithms used for traditional min-median rank aggregation (FASLP-Pivot [2] and SP-Matching [9]). The comparison shows that algorithms based on convex optimization yield significantly better results than naive selection methods, and that traditional aggregation algorithms are poor candidates for solving MinMax problems.

First, we evaluate the proposed algorithms on synthetic data. The synthetic data is generated based on what we call a two-level Mallows model: First, we generate the permutations $\{\sigma^1,\ldots,\sigma^C\}$ independently based on the Mallows distribution $P(\sigma^k) \propto \phi_1^{d_\tau(\sigma^k,e)}$ [13]. Then, for each class $k \in [C]$, we generate $m_k$ permutations $\sigma^k_1,\ldots,\sigma^k_{m_k}$ independently according to the Mallows distribution $P(\sigma^k_i)\propto\phi_2^{d_\tau(\sigma^k_i,\sigma^k)}$. We set the number of classes to $C = 3$, fix $\phi_2 = 0.7$ and let each class contain $m_k = 10$ permutations. To control the distance between different classes, we choose $\phi_1$ from $\{0.5,0.7,0.9,1.0\}$. The objective function values for 100 independent samples, obtained by different algorithms, are shown in Table I.

TABLE I
COMPARISON OF RANK AGGREGATION METHODS: OBJECTIVE VALUE (STANDARD DEVIATION)

A. d_med−τ
φ1                0.5          0.7          0.9          1.0
mmKT-Conv         14.5 (1.1)   16.3 (1.4)   17.8 (1.3)   17.9 (1.5)
Pick-Rnd-Perm     17.8 (1.4)   19.9 (2.1)   21.5 (1.8)   21.6 (2.1)
Pick-Opt-Perm     15.9 (1.8)   18.1 (1.8)   20.0 (1.7)   20.0 (1.6)
FASLP-Pivot       15.3 (1.4)   17.7 (2.1)   19.4 (2.2)   19.7 (2.3)

B. d_med−S
φ1                0.5          0.7          0.9          1.0
mmSP-Conv         23.3 (1.7)   26.0 (2.1)   28.1 (2.3)   28.4 (2.2)
Pick-Rnd-Perm     27.0 (2.4)   29.9 (2.7)   32.4 (2.7)   32.1 (2.6)
Pick-Opt-Perm     24.5 (1.9)   27.5 (2.4)   29.9 (2.3)   29.9 (2.1)
SP-Matching       26.3 (3.0)   30.5 (3.6)   35.3 (3.5)   35.9 (3.4)

C. d_min−τ
φ1                0.5          0.7          0.9          1.0
min-mmKT-Conv     6.9 (1.9)    8.6 (2.3)    9.8 (2.6)    10.0 (2.4)
min-Pick-Perm     8.4 (1.7)    10.5 (1.9)   11.8 (2.2)   12.0 (1.8)
FASLP-Pivot       9.3 (1.9)    11.1 (2.2)   12.9 (2.7)   13.1 (2.2)

D. d_min−S
φ1                0.5          0.7          0.9          1.0
min-mmSP-Conv     11.9 (2.6)   14.1 (2.8)   16.1 (3.6)   16.3 (3.1)
min-Pick-Perm     13.9 (2.4)   16.7 (2.7)   18.9 (3.0)   18.9 (2.5)
SP-Matching       17.1 (3.5)   22.4 (4.3)   26.2 (4.2)   27.2 (4.0)

Our next test example comes from evolutionary biology, and is concerned with Mitochondrial DNA (mtDNA) genome aggregation. The aggregate in this case corresponds to an ancestral genome. The most commonly used rearrangement distance between two nuclear genomes is based on reversals [15], but mitochondrial DNA rearrangement studies have also involved the Kendall $\tau$ distance [6]. In the latter case, the authors only considered the median problem $C = 1$, although the min-max problem is equally relevant [8], [3]. In our experiment, we used the mtDNA dataset from [5]. The dataset contains 11 metazoan genomes with 36 gene-blocks in some arrangement. We removed the "signs" of gene orders and let each genome represent one class, so that $C = 11$ and $m_k = 1$ for all $k$; we fixed $\lambda_k = 1$. Table II shows the results. Due to page limitations, we relegate the significantly more space consuming empirical study of weighted multiclass mtDNA aggregation to the extended version of the paper.

TABLE II
MITOCHONDRIAL DNA (MTDNA) AGGREGATION

Method          d_med−τ   Aggregated Sequences
mmKT-Conv       210       1107217123091123192021 1335315142526616322834 4242718362931833225
Pick-Opt-Perm   267       12721736203291011351230 21919182833781626143413 24153225422236315
FASLP-Pivot     269       12177231232030216910 1115192825271832833241334 144352926163631225

REFERENCES

[1] Nir Ailon. Aggregation of partial rankings, p-ratings and top-m lists. Algorithmica, 57(2):284–300, 2010.
[2] Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008.
[3] Christian Bachmaier, Franz J Brandenburg, Andreas Gleißner, and Andreas Hofmeier. On the hardness of maximum rank aggregation problems. Journal of Discrete Algorithms, 31:2–13, 2015.
[4] John Bartholdi III, Craig A Tovey, and Michael A Trick. Voting schemes for which it can be difficult to tell who won the election. Social Choice and Welfare, 6(2):157–165, 1989.
[5] Guillaume Bourque and Pavel A Pevzner. Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Research, 12(1):26–36, 2002.
[6] Kamalika Chaudhuri, Kevin Chen, Radu Mihaescu, and Satish Rao. On the tandem duplication-random loss model of genome rearrangement. In Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, pages 564–570. Society for Industrial and Applied Mathematics, 2006.
[7] Persi Diaconis and Ronald L Graham. Spearman's footrule as a measure of disarray. Journal of the Royal Statistical Society. Series B (Methodological), pages 262–268, 1977.
[8] Liviu P Dinu and Radu Ionescu. An efficient rank based approach for closest string and closest substring. PLoS One, 7(6):e37576, 2012.
[9] Cynthia Dwork, Ravi Kumar, Moni Naor, and Dandapani Sivakumar. Rank aggregation methods for the web. In Proceedings of the 10th international conference on World Wide Web, pages 613–622. ACM, 2001.
[10] Ronald Fagin, Ravi Kumar, Mohammad Mahdian, D Sivakumar, and Erik Vee. Comparing and aggregating rankings with ties. In Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 47–58. ACM, 2004.
[11] John G Kemeny. Mathematics without numbers. Daedalus, 88(4):577–591, 1959.
[12] Claire Kenyon-Mathieu and Warren Schudy. How to rank with few errors. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 95–103. ACM, 2007.
[13] Colin L Mallows. Non-null ranking models. I. Biometrika, 44(1/2):114–130, 1957.
[14] Richard P Stanley. Enumerative combinatorics. Number 49. Cambridge University Press, 2011.
[15] Glenn Tesler. Efficient algorithms for multichromosomal genome rearrangements. Journal of Computer and System Sciences, 65(3):587–609, 2002.
[16] Anke Van Zuylen and David P Williamson. Deterministic algorithms for rank aggregation and other ranking and clustering problems. In International Workshop on Approximation and Online Algorithms, pages 260–273. Springer, 2007.