Optimal Anytime Constrained Simulated Annealing for Constrained Global Optimization*
Benjamin W. Wah and Yi Xin Chen
Department of Electrical and Computer Engineering
and the Coordinated Science Laboratory
University of Illinois, Urbana-Champaign
1308 West Main Street
Urbana, IL 61801, USA
{wah, chen}@manip.crhc.uiuc.edu
URL: http://www.manip.crhc.uiuc.edu
Abstract. In this paper we propose an optimal anytime version of constrained simulated annealing (CSA) for solving constrained nonlinear programming problems (NLPs). One of the goals of the algorithm is to generate feasible solutions of certain prescribed quality using an average time of the same order of magnitude as that spent by the original CSA with an optimal cooling schedule in generating a solution of similar quality. Here, an optimal cooling schedule is one that leads to the shortest average total number of probes when the original CSA with the optimal schedule is run multiple times until it finds a solution. Our second goal is to design an anytime version of CSA that generates gradually improving feasible solutions as more time is spent, eventually finding a constrained global minimum (CGM). In our study, we have observed a monotonically non-decreasing function relating the success probability of obtaining a solution and the average completion time of CSA, and an exponential function relating the objective target that CSA is looking for and the average completion time. Based on these observations, we have designed CSA_AT-ID, the anytime CSA with iterative deepening that schedules multiple runs of CSA using a set of increasing cooling schedules and a set of improving objective targets. We then prove the optimality of our schedules and demonstrate the results experimentally on four continuous constrained NLPs. CSA_AT-ID can be generalized to solving discrete, continuous, and mixed-integer NLPs, since CSA is applicable to problems in these three classes. Our approach can also be generalized to other stochastic search algorithms, such as genetic algorithms, and be used to determine the optimal time for each run of such algorithms.
1 Introduction
A large variety of engineering applications can be formulated as constrained nonlinear programming problems (NLPs). Examples include production planning,
* Proc. Sixth International Conference on Principles and Practice of Constraint Programming, Springer-Verlag, Sept. 2000.
computer integrated manufacturing, chemical control processing, and structure optimization. Some applications that are inherently constrained or have multiple objectives may be formulated as unconstrained mathematical programs due to a lack of good solution methods. Examples include applications in neural-network learning, computer-aided design for VLSI, and digital signal processing. High-quality solutions to these applications are important because they may lead to lower implementation and maintenance costs.
By first transforming multi-objective NLPs into single-objective NLPs, all constrained NLPs can be considered as single-objective NLPs. Without loss of generality, we consider only minimization problems in this paper. A general discrete constrained NLP is formulated as follows:

    minimize    f(x)
    subject to  g(x) ≤ 0                                            (1)
                h(x) = 0,

where x = (x_1, x_2, ..., x_n) is a vector of discrete variables, f(x) is a lower-bounded objective function, h(x) = [h_1(x), ..., h_m(x)]^T is a set of m equality constraints, and all the discrete variables in x are finite.
Both f(x) and h(x) can be either linear or nonlinear, continuous or discrete (i.e., discontinuous), and analytic in closed forms or procedural. In particular, we are interested in application problems whose f(x), g(x), and h(x) are non-differentiable. Our general formulation includes both equality and inequality constraints, although it is shown later that inequality constraints can be transformed into equality constraints. The search space (sometimes called solution space) X is the finite set of all possible combinations of discrete variables in x that may or may not satisfy the constraints. Such a space is usually limited by some bounds on the range of variables.
To characterize the solutions sought in discrete space, we define for discrete problems N(x), the neighborhood [1] of point x in discrete space X, as a finite user-defined set of points {x' ∈ X} such that x' is reachable from x in one step, that x' ∈ N(x) ⟺ x ∈ N(x'), and that it is possible to reach every other x'' starting from any x in one or more steps through neighboring points. Note that neighboring points may be feasible or infeasible.
Point x ∈ X is called a discrete constrained local minimum (CLM) if it satisfies two conditions: a) x is a feasible point, implying that x satisfies all the constraints g(x) ≤ 0 and h(x) = 0, and b) f(x) ≤ f(x') for all feasible x' ∈ N(x). A special case in which x is a CLM is when x is feasible and all its neighboring points are infeasible.
Point x ∈ X is called a constrained global minimum (CGM) iff a) x is a feasible point, and b) for every feasible point x' ∈ X, f(x') ≥ f(x). According to our definitions, a CGM must also be a CLM.
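To make these definitions concrete, the following Python sketch (ours, not from the paper) tests the CLM condition for a candidate point; f, g, h, and neighbors are hypothetical callables standing in for f(x), g(x), h(x), and N(x).

    def is_feasible(x, g, h, tol=1e-8):
        """x is feasible iff g(x) <= 0 and h(x) = 0 (componentwise, within tol)."""
        return all(gi <= tol for gi in g(x)) and all(abs(hi) <= tol for hi in h(x))

    def is_discrete_clm(x, f, g, h, neighbors):
        """CLM test: x is feasible and no feasible neighbor has a smaller objective.
        If all neighbors are infeasible, x is a CLM by the special case above."""
        if not is_feasible(x, g, h):
            return False
        return all(f(x) <= f(y) for y in neighbors(x) if is_feasible(y, g, h))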
In the next section we formulate the problem that we study in this paper. This is followed by a summary of the constrained simulated annealing algorithm (CSA) in Section 3 and a statistical model of the CSA procedure in Section 4. Finally, we present our proposed anytime CSA with iterative deepening in Section 5 and our experimental results in Section 6.
2 Formulation of the Problem
Constrained simulated annealing (CSA) [14] (see Section 3) has been proposed as a powerful global minimization algorithm that can guarantee asymptotic convergence to a CGM with probability one when applied to solve (1).
One of the difficulties in using CSA, like conventional simulated annealing (SA) [8], is to determine an annealing schedule, or the way that temperatures are decreased in order to allow a solution of prescribed quality to be found quickly. In general, the asymptotic convergence of CSA to a CGM with probability one was proved with respect to a cooling schedule in which temperatures are decreased in a logarithmic fashion [14], based on the original necessary and sufficient condition of Hajek developed for SA [6]. It requires an infinitely long cooling schedule in order to approach a CGM with probability one.
In practice, asymptotic convergence can never be exploited since any algorithm must terminate in finite time. There are two ways to complete CSA in finite time. The first approach uses an infinitely long logarithmically decreasing cooling schedule but terminates CSA in finite time. This is not desirable because CSA will most likely not have converged to any feasible solution when terminated at high temperatures.
The second approach is to design a cooling schedule that can complete in prescribed finite time. In this paper we use the following geometric cooling schedule with cooling rate α:

    T_{j+1} = α × T_j,    j = 0, ..., N_α − 1,                       (2)

where α < 1, j measures the number of probes in CSA (assuming one probe is made at each temperature and all probes are independent), and N_α is the total number of probes in the schedule. A probe here is a neighboring point examined by CSA, independent of whether CSA accepts it or not. We use the number of probes expended to measure overhead because it is closely related to execution time. Given T_0 > T_{N_α} > 0 and α, we can determine N_α, the length of a cooling schedule, as:

    N_α = log_α (T_{N_α} / T_0).                                     (3)
Note that the actual number of probes in a successful run may be less than N_α, as a run is terminated as soon as a desirable solution is found. However, it should be very close to N_α, as solutions are generally found when temperatures are low.
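As a small sanity check of (2) and (3), the sketch below computes the schedule length and the temperature sequence for given T_0, T_{N_α}, and α; the numeric values are illustrative, not the paper's settings.

    import math

    def schedule_length(T0, TN, alpha):
        """N_alpha = log_alpha(T_N_alpha / T0), as in (3)."""
        return int(math.ceil(math.log(TN / T0, alpha)))

    def temperatures(T0, alpha, N):
        """Geometric schedule T_j = alpha**j * T0 for j = 0, ..., N-1, as in (2)."""
        return [T0 * alpha ** j for j in range(N)]

    # Illustrative values only (not taken from the paper):
    N_alpha = schedule_length(T0=100.0, TN=0.01, alpha=0.9)   # 88 probes
    temps = temperatures(100.0, 0.9, N_alpha)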
The effect of using a finite α is that CSA will converge to a CGM with probability less than one. When CSA uses a finite cooling schedule N_α, we are interested in its reachability probability P_R(N_α), or the probability that it will find a CGM in any of its previous probes when it stops. Let p_j be the probability that CSA finds a CGM in its j-th probe; then P_R(N_α) when it stops is:

    P_R(N_α) = 1 − ∏_{j=1}^{N_α} (1 − p_j).                          (4)
Table 1. An example illustrating trade-offs between the expected total number of probes in multiple runs of CSA to find a CGM, the cooling rate used in each run, and the probability of success in each run. The optimal cooling rate at α = 0.574 leads to the minimum average total number of probes to find a CGM. Note that the probability of success is not the highest in one run using the optimal cooling rate. (The problem solved is defined in (6). Each cooling schedule is run 200 times using f' = 200.)

    α              cooling rate in one run     0.139  0.281  0.429  0.574  0.701  0.862  0.961   0.990
    N_α            avg. cooling schedule       99.8   148.0  207.5  296.0  434.5  798.0  2414.0  6963.5
    T_α            avg. CPU time per run       0.026  0.036  0.050  0.074  0.11   0.18   0.54    1.58
    P_R(N_α)       succ. prob. of one run      1%     10%    25%    40%    55%    70%    85%     95%
    1/P_R(N_α)     avg. runs to find sol'n     100    10     4      2.5    1.82   1.43   1.18    1.05
    N_α/P_R(N_α)   avg. probes to find sol'n   9980   1480   830    740    790    1140   2840    7330
    T_α/P_R(N_α)   avg. time to find sol'n     2.6    0.36   0.20   0.19   0.20   0.25   0.64    1.7
Reachability can be maintained by keeping the best solution found at any time and by reporting the best solution when CSA stops.
Although the exact value of P_R(N_α) is hard to estimate and control, we can always improve the chance of hitting a CGM by running CSA multiple times, each using a finite cooling schedule. Given P_R(N_α) for each run of CSA and that all runs are independent, the expected number of runs to find a solution is 1/P_R(N_α) and the expected total number of probes is:

    Expected total number of probes to find a CGM = ∑_{j=1}^{∞} P_R(N_α) (1 − P_R(N_α))^{j−1} N_α j = N_α / P_R(N_α).     (5)
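Equation (5) reduces to the simple ratio N_α / P_R(N_α); the snippet below (ours, not from the paper) evaluates it and cross-checks one column of Table 1.

    def expected_runs(p_r):
        """Expected number of independent runs until the first success, 1 / P_R(N_alpha)."""
        return 1.0 / p_r

    def expected_total_probes(n_alpha, p_r):
        """Expected total number of probes over multiple runs, N_alpha / P_R(N_alpha), as in (5)."""
        return n_alpha / p_r

    # Cross-checking the alpha = 0.574 column of Table 1: N_alpha = 296, P_R = 40%
    # give 2.5 expected runs and 740 expected probes, as reported.
    assert abs(expected_runs(0.40) - 2.5) < 1e-12
    assert abs(expected_total_probes(296.0, 0.40) - 740.0) < 1e-9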
Table 1 illustrates trade-offs between N_α and P_R(N_α) in solving a constrained NLP with a 10-dimensional Rastrigin function as its objective:

    minimize    f(x) = F(10n + ∑_{i=1}^{n} (x_i^2 − 10 cos(2π x_i)), 200)        (6)
    subject to  |(x_i − 4.2)(x_i + 3.2)| ≤ 0.1    for n = 10,

where F is the transformation function defined later in (11). A run of CSA is successful if it finds a feasible point with objective value less than or equal to 200 in this run, and the probability to hit a CGM is calculated by the percentage of successful runs over 200 independent runs.
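For reference, a direct Python transcription of the objective and constraints in (6); we assume the constraint is imposed componentwise on each x_i, as the formula suggests, and we leave out the relaxation F of (11).

    import math

    def rastrigin(x):
        """Raw n-dimensional Rastrigin objective from (6): 10n + sum_i (x_i^2 - 10 cos(2 pi x_i))."""
        n = len(x)
        return 10 * n + sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi) for xi in x)

    def constraint_values(x):
        """Constraints of (6) written as g_i(x) <= 0: |(x_i - 4.2)(x_i + 3.2)| - 0.1."""
        return [abs((xi - 4.2) * (xi + 3.2)) - 0.1 for xi in x]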
Table 1 shows that P_R(N_α) increases towards one when α is increased. A long cooling schedule is generally undesirable because the expected number of probes in (5) is large, even though the success probability in one run of CSA approaches one. On the other hand, if the schedule is too short, then the success probability in one run of CSA is low, again leading to a large expected number of probes in (5). An optimal schedule is one in which CSA is run multiple times and the expected total number of probes in (5) is the smallest.

Definition 1. An optimal cooling schedule is one that leads to the smallest average total number of probes of multiple runs of CSA in order to find a solution of prescribed quality.
Table 1 shows that N_α/P_R(N_α) is a convex function with a minimum at α = 0.574. That is, the average total number of probes of multiple runs of CSA to find a CGM first decreases and then increases, leading to an optimal cooling rate of 0.574 and an average of 2.5 runs of CSA to find a CGM.
This paper aims at determining an optimal cooling schedule that allows a solution of prescribed quality to be found in the shortest average amount of time. In order to find the optimal cooling schedule, users generally have to experiment by trial and error until a suitable schedule is found. Such tuning is obviously not practical in solving large complex problems. In that case, one is interested in running a single version of the algorithm that can adjust its cooling schedule dynamically in order to find a schedule close to the optimal one. Moreover, one is interested in obtaining improved solutions as more time is spent on the algorithm. Such an algorithm is an anytime algorithm because it always reports the best solution found if the search were stopped at any time.
The goals of this paper are twofold. First, we would like to design cooling schedules for CSA in such a way that the average time spent in generating a solution of certain quality is of the same order of magnitude as that of multiple runs of the original CSA with an optimal cooling schedule. In other words, the new CSA is optimal in terms of average completion time up to an order of magnitude with respect to that of the original CSA with the best cooling schedule. Second, we would like to design a set of objective targets that allows an anytime CSA to generate improved solutions as more time is spent, eventually finding a CGM.
The approach we take in this paper is to first study statistically the performance of CSA. Based on the statistics collected, we propose an exponential model relating the value of objective targets sought by CSA and the average execution time, and a monotonically non-decreasing model relating the success probability of obtaining a solution and the average execution time. These models lead to the design of CSA_AT-ID, the anytime CSA with iterative deepening, that schedules multiple runs of CSA using a set of increasing cooling schedules that exploit the convexity of (5) and a set of improving objective targets.
Let T_opt(f_i) be the average time taken by the original CSA with an optimal cooling schedule to find a CLM of value f_i or better, and T_AT-ID(f_i) be the average time taken by CSA_AT-ID to find a CLM of similar quality. Based on the principle of iterative deepening [9], we prove the optimality of CSA_AT-ID by showing:

    T_AT-ID(f_i) = O(T_opt(f_i)),    where i = 0, 1, 2, ...                     (7)

Further, CSA_AT-ID returns solutions of values f_0 > ··· > f* that are gradually improving with time.
There were many past studies on annealing schedules in SA. Schedules studied include logarithmic annealing schedules [6] that are necessary and sufficient for asymptotic convergence, schedules inversely proportional to annealing steps in FSA [13] that are slow when the annealing step is large, simulated quenching scheduling in ASA [7] that is not efficient when the number of variables is large, proportional (or geometric) cooling schedules [8] using a cooling rate between
0.8 and 0.99 or a rate computed from the initial and final temperatures [11], constant annealing [3], arithmetic annealing [12], polynomial-time cooling [2], adaptive temperature scheduling based on the acceptance ratio of bad moves [16], and non-equilibrium SA (NESA) [4] that operates at a non-equilibrium condition and that reduces temperatures as soon as improved solutions are found.
All the past studies aimed at designing annealing schedules that allow one run of SA to succeed in getting a desirable solution. There were no prior studies that examine trade-offs between multiple runs of SA using different schedules and the improved probability of getting a solution. Our approach in this paper is based on multiple runs of CSA, whose execution times increase in a geometric fashion and whose last run finds a solution to the application problem. Based on iterative deepening [9], the total time of all the runs will be dominated by the last run and will only be a constant factor of the time taken in the last run.
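The constant-factor claim behind iterative deepening can be checked in a few lines; a geometric ratio of 2 is used here purely for illustration.

    def total_vs_last(t_first, ratio, num_runs):
        """Total time of runs lasting t, t*r, t*r^2, ..., and the duration of the last run."""
        times = [t_first * ratio ** i for i in range(num_runs)]
        return sum(times), times[-1]

    # With a doubling ratio, the total never exceeds twice the last run:
    total, last = total_vs_last(t_first=1.0, ratio=2.0, num_runs=10)
    assert total < 2 * last        # 1023 < 1024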
3 Constrained Simulated Annealing
In this section, we summarize our Lagrange-multiplier theory for solving discrete constrained NLPs and the adaptation of SA to look for discrete saddle points.
Consider a discrete equality-constrained NLP:

    minimize_x  f(x)                                                 (8)
    subject to  h(x) = 0,

where x = (x_1, ..., x_n) is a vector of discrete variables, and f(x) and h(x) are analytic in closed forms (but not necessarily differentiable) or procedural. An inequality constraint like g_j(x) ≤ 0 can be transformed into an equivalent equality constraint max(g_j(x), 0) = 0. Hence, without loss of generality, our theory only considers application problems with equality constraints.
A generalized discrete Lagrangian function of (8) is defined as follows:

    L_d(x, λ) = f(x) + λ^T H(h(x)),                                  (9)

where H is a continuous transformation function satisfying H(y) = 0 iff y = 0. We define a discrete saddle point (x*, λ*) with the following property:

    L_d(x*, λ) ≤ L_d(x*, λ*) ≤ L_d(x, λ*)                            (10)

for all x ∈ N(x*) and all λ ∈ R. Essentially, a saddle point is one in which L_d(x*, λ) is at a local maximum in the λ subspace and at a local minimum in the x subspace. The concept of saddle points is very important in discrete problems because, starting from them, we can derive the first-order necessary and sufficient condition for CLM that leads to global minimization procedures. This is stated formally in the following theorem [15]:
Theorem 1. First-order necessary and sufficient condition for CLM. A point in the variable space of (8) is a CLM if and only if it satisfies the saddle-point condition (10).
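A minimal sketch (ours, not from the paper) of the Lagrangian in (9) together with a sampled check of the saddle condition (10); f, h, neighbors, and the multiplier samples are hypothetical inputs, and sampling λ can only falsify, not prove, the condition.

    def lagrangian(x, lam, f, h, H=abs):
        """Discrete Lagrangian L_d(x, lambda) = f(x) + lambda^T H(h(x)) from (9).
        H may be any continuous transformation with H(y) = 0 iff y = 0; abs is one choice."""
        return f(x) + sum(l * H(hi) for l, hi in zip(lam, h(x)))

    def saddle_sanity_check(x, lam, f, h, neighbors, lam_samples):
        """Rough numerical check of (10): L_d(x, .) should not increase over the sampled
        multipliers, and L_d(., lam) should not decrease over N(x)."""
        L = lagrangian(x, lam, f, h)
        ok_lambda = all(lagrangian(x, l2, f, h) <= L for l2 in lam_samples)
        ok_x = all(L <= lagrangian(y, lam, f, h) for y in neighbors(x))
        return ok_lambda and ok_x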
procedure CSA
    set initial x = (x, λ) by randomly generating x and by setting λ ← 0;
    initialize temperature T_0 to be large enough and cooling rate 0 < α < 1;
    set N_T (number of probes per temperature);
    while stopping condition is not satisfied do
        for n ← 1 to N_T do
            generate x' from N(x) using G(x, x');
            accept x' with probability A_T(x, x');
        end for
        reduce temperature by T ← α × T;
    end while
end procedure

Fig. 1. CSA: constrained simulated annealing [15].
Figure 1 describes CSA [14], which looks for saddle points with the minimum objective value. By carrying out probabilistic ascents in the λ subspace with a probability of acceptance governed by a temperature, it looks for local maxima in that subspace. Likewise, by carrying out probabilistic descents in the x subspace, it looks for local minima in that subspace. It can be shown that the point where the algorithm stops is a saddle point in the Lagrangian space.
CSA differs from traditional SA, which only has probabilistic descents in the x space and whose stopping point is a local minimum of the objective function of an unconstrained optimization. By extending the search to saddle points in a Lagrangian space, CSA allows constrained optimization problems to be solved in a similar way as SA solves unconstrained optimization problems.
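The following Python sketch mirrors the structure of Figure 1 under the simplifying assumptions stated in its docstring; it illustrates the idea and is not the implementation evaluated in the paper. Here f and h are the objective and equality constraints of (8), neighbors(x) returns N(x), and m is the number of equality constraints.

    import math
    import random

    def csa(f, h, neighbors, x0, m, T0=10.0, alpha=0.9, NT=20, Tmin=1e-3):
        """Simplified sketch of CSA (Figure 1).  The generation distribution G and
        acceptance function A_T are replaced by a uniform choice over N(x) and a
        Metropolis rule on L_d; lambda moves are unit-step probabilistic ascents.
        All parameter values are illustrative."""
        def Ld(x, lam):                       # L_d(x, lambda) with H = abs, as in (9)
            return f(x) + sum(l * abs(hi) for l, hi in zip(lam, h(x)))

        x, lam, T = x0, [0.0] * m, T0
        best = x0
        while T > Tmin:                       # stopping condition
            for _ in range(NT):               # N_T probes per temperature
                if random.random() < 0.5:     # probabilistic descent in the x subspace
                    xp = random.choice(neighbors(x))
                    delta = Ld(xp, lam) - Ld(x, lam)
                    if delta <= 0 or random.random() < math.exp(-delta / T):
                        x = xp
                else:                         # probabilistic ascent in the lambda subspace
                    lamp = list(lam)
                    lamp[random.randrange(m)] += random.choice([-1.0, 1.0])
                    delta = Ld(x, lamp) - Ld(x, lam)
                    if delta >= 0 or random.random() < math.exp(delta / T):
                        lam = lamp
                if all(abs(hi) < 1e-8 for hi in h(x)) and f(x) < f(best):
                    best = x                  # keep the best feasible point seen
            T *= alpha                        # geometric cooling, T <- alpha * T
        return best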
Using distribution G(x, x') to generate trial point x' in neighborhood N(x), a Metropolis acceptance probability A_T(x, x'), and a logarithmic cooling schedule, CSA has been proven to have asymptotic convergence with probability one to a CGM. This is stated in the following theorem without proof [14].
Theorem 2. Asymptotic convergence of CSA. The Markov chain modeling CSA converges to a CGM with probability one.
Although Theorems 1 and 2 were derived for discrete constrained NLPs, they are applicable to continuous and mixed-integer constrained NLPs if all continuous variables are first discretized. Discretization is acceptable in practice because numerical evaluations of continuous variables using digital computers can be considered as discrete approximations of the original variables up to a computer's precision. Intuitively, if discretization is fine enough, the solutions found are fairly good approximations to the original solutions. Due to space limitations, we do not discuss the accuracy of solutions found in discretized problems [17]. In the rest of this paper, we apply CSA to solve constrained NLPs, assuming that continuous variables in continuous and mixed-integer NLPs are first discretized.
4 Performance Modeling of CSA
The performance of a CSA procedure in solving a given application problem from a random starting point can be measured by the probability that it will find a solution of a prescribed quality when it stops and by the average time it takes to find the solution. There are many parameters that affect how CSA performs, such as neighborhood size, generation probability, probability of accepting a generated point, initial temperature, cooling schedule, and relaxation of the objective function. In this section, we focus on the relationship among objective targets, cooling schedules, and probabilities of finding a desirable solution.
4.1 Relaxation of objective target
One way to improve the chance of finding a solution by CSA is to look for a CLM instead of a CGM. An approach to achieve this is to stop CSA whenever it finds a CLM of a prescribed quality. This approach is not desirable in general because CSA may only find a CLM when its temperatures are low, leading to little difference in time between finding a CLM and a CGM. Further, it is necessary to prove the asymptotic convergence of the relaxed CSA procedure.
A second approach, which we adopt in this paper, is to modify the constrained NLP in such a way that a CLM of value smaller than f' in the original NLP is considered a CGM in the relaxed NLP. Since the CSA procedure is unchanged, its asymptotic convergence behavior remains the same. The relaxed NLP is obtained by transforming the objective target of the original NLP:

    F(f(x), f') = f'      if f(x) ≤ f',                              (11)
                  f(x)    if f(x) > f'.
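A one-line transcription of (11), as we read it:

    def relaxed_objective(fx, f_target):
        """Objective-target relaxation F(f(x), f') from (11): every point whose raw
        objective is at or below the target f' is treated as having value f'."""
        return f_target if fx <= f_target else fx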
Assuming that f* is the value of the CGM in the original NLP, it follows that the value of the CGM of the relaxed NLP is f* if f' ≤ f* and is f' if f' > f*. Moreover, since the relaxed problem is a valid NLP solvable by CSA, CSA will converge asymptotically to a CGM of the relaxed NLP with probability one.
As a relaxed objective function leads to a possibly larger pool of solution points, we expect CSA to have a higher chance of hitting one of these points during its search. This property will be exploited in CSA_AT-ID in Section 5.2.
4.2 Exponential model relating f' and N_α for fixed P_R(N_α)
In order to develop CSA_AT-ID, which dynamically controls its objective targets, we need to know the relationship between f', the degree of objective relaxation, and N_α, the number of probes in one run of CSA, for a fixed P_R(N_α). In this section we find this relationship by studying the statistical behavior of CSA in evaluating four continuous NLPs.
Figure 2 shows a 3-D graph relating these parameters in solving (6), in which P_R(N_α) was obtained by running CSA 200 times for each combination of N_α and f'.
[Figure 2: a 3-D plot with axes f', log2(N_α), and P_R(N_α); the dotted curve is labeled "Trace of one run of anytime-CSA".]
Fig. 2. A 3-D graph showing an exponentially decreasing relationship between f' and N_α and a monotonically non-decreasing relationship between P_R(N_α) and N_α when CSA is applied to solve (6). The dotted line shows the trace taken in a run of CSA_AT-ID.
Table 2. The averages and standard deviations of the coefficient of determination R^2 of linear fits of f' and log2(N_α) for fixed P_R(N_α).

    Benchmark          Mean(R^2)    Std. Dev.(R^2)
    G1 [10]            0.9389       0.0384
    G2 [10]            0.9532       0.0091
    Rastrigin          0.9474       0.0397
    Problem 5.2 [5]    0.9461       0.0342
Figure 2 shows an exponentially decreasing relationship between f' and N_α at fixed P_R(N_α) and a monotonically non-decreasing relationship between P_R(N_α) and N_α at fixed f'. These observations lead to the following exponential model:

    N_α = k e^{−a f'}    for fixed P_R(N_α) and positive real constants a and k.    (12)
To verify our proposed model statistically, we performed experiments on several benchmarks of different complexities: G1, G2 [10], Rastrigin (6), and Floudas and Pardalos' Problem 5.2 [5]. For each problem, we collected statistics on f' and N_α at various P_R(N_α), regressed a linear function on f' and log2(N_α) to find a best fit, and calculated the coefficient of determination R^2 of the fit. Table 2 summarizes the average and standard deviation of R^2 of the linear fit for each test problem, where an R^2 very close to 1 indicates a good fit. Since R^2 has averages very close to one and small standard deviations, f' is verified to be exponentially related to N_α at fixed P_R(N_α).
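The regression behind Table 2 can be reproduced along the following lines; this is a generic least-squares sketch, not the authors' analysis code, and the input arrays are hypothetical measurements of (f', N_α) at a fixed P_R(N_α).

    import numpy as np

    def fit_exponential_model(f_targets, n_alpha):
        """Least-squares fit of log2(N_alpha) = c0 + c1 * f' and its coefficient of
        determination R^2, mirroring the verification of model (12)."""
        x = np.asarray(f_targets, dtype=float)
        y = np.log2(np.asarray(n_alpha, dtype=float))
        c1, c0 = np.polyfit(x, y, 1)             # slope and intercept
        residuals = y - (c0 + c1 * x)
        r2 = 1.0 - residuals.var() / y.var()
        return c0, c1, r2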
4.3 Sufficient conditions for the existence of N_αopt

In order for N_αopt to exist at fixed f', N_α/P_R(N_α) in (5) must have an absolute minimum in (0, ∞). Such a minimum exists if P_R(N_α) satisfies the following sufficient conditions: a) P_R(0) = 0 and lim_{N_α→∞} P_R(N_α) = 1, and b) P_R''(0) > 0. We do not show the proof of these conditions due to space limitations.
[Figure 3: two plots against N_α; panel a) P_R(N_α) satisfies the two sufficient conditions, panel b) absolute minimum in N_α/P_R(N_α).]

Fig. 3. An example showing the existence of an absolute minimum in N_α/P_R(N_α) when CSA was applied to solve (6) with f' = 180. (N_αopt ≈ 2000.)
We collected statistics on P_R(N_α) and N_α at various f' for each of the four test problems studied in Section 4.2. The results indicate that P_R(N_α) satisfies the two sufficient conditions, implying that N_α/P_R(N_α) has an absolute minimum in (0, ∞). In other words, each of these problems has an optimal cooling schedule N_αopt that minimizes N_α/P_R(N_α) at fixed f'. Figure 3 illustrates the existence of such an optimal schedule in applying CSA to solve (6) with f' = 180. The experimental results also show that P_R(N_α) is monotonically nondecreasing.
Note that there is an exponential relationship between P_R(N_α) and N_α in part of the range of P_R(N_α) (say between 0.2 and 0.8) in the problems tested. We do not exploit this relationship because it is not required by the iterative deepening strategy studied in the next section. Further, the relationship is not satisfied when P_R(N_α) approaches 0 or 1.
It is interesting to point out that the second sufficient condition is not satisfied when searching with random probing. In this case, P_R(N_α) = 1 − (1 − 1/S)^{N_α}, and P_R''(0) = −ln^2(1 − 1/S) < 0, where S is the number of states in the search space. Hence, N_α/P_R(N_α) at fixed f' does not have an absolute minimum in N_α over (0, ∞).
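A quick numerical illustration of this contrast (our own check, using an arbitrary state-space size):

    def probes_per_success_random(N, S):
        """N / P_R(N) for blind random probing, with P_R(N) = 1 - (1 - 1/S)^N."""
        return N / (1.0 - (1.0 - 1.0 / S) ** N)

    # For a hypothetical space of S = 10^6 states the ratio only grows with N, so
    # there is no finite optimal schedule length, unlike CSA in Figure 3.
    vals = [probes_per_success_random(N, 1e6) for N in (10, 100, 1000, 10000)]
    assert all(a < b for a, b in zip(vals, vals[1:]))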
5 Anytime CSA with Iterative Deepening
We propose in this section CSA_AT-ID with two components. In the first component, discussed in Section 5.1, we design a set of cooling schedules for multiple runs of the original CSA so that (7) is satisfied; that is, the average total number of probes to find a CLM of value f' or better is of the same order of magnitude as T_opt(f'). In the second component, presented in Section 5.2, we design a schedule to decrease the objective target f' in CSA_AT-ID that allows it to find f* using an average total number of probes of the same order of magnitude as T_opt(f*).
CSA_AT-ID in Figure 4 first finds low-quality feasible solutions in relatively small amounts of time. It then tightens its requirement gradually, tries to find a solution at each quality level, and outputs the best solution when it stops.
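The control flow just described can be sketched as follows; since Figure 4 is not reproduced in this section, every name and constant in this snippet is an assumption of ours, and the actual CSA_AT-ID differs in how it sets its targets and schedules.

    def csa_at_id(run_csa, f_targets, N0, ratio=2, runs_per_schedule=3, N_max=10**6):
        """Schematic anytime driver: for each objective target (from loose to tight),
        try geometrically longer cooling schedules until one run succeeds, keeping
        the best solution found so far.  run_csa(target, N) is a hypothetical wrapper
        that runs CSA once and returns a solution or None."""
        best = None
        for target in f_targets:                  # improving targets f'_0 > f'_1 > ...
            N = N0
            while N <= N_max:                     # iterative deepening over schedule lengths
                hit = None
                for _ in range(runs_per_schedule):
                    hit = run_csa(target, N)
                    if hit is not None:
                        break
                if hit is not None:
                    best = hit                    # solution at the current quality level
                    break
                N *= ratio                        # geometrically longer cooling schedule
        return best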