A Reference Vector based Many-Objective Evolutionary Algorithm with Feasibility-aware Adaptation

  • 2019-04-12 16:12:46
  • Mingde Zhao, Hongwei Ge, Kai Zhang, Yaqing Hou
  • 1

Abstract

The infeasible parts of the objective space in difficult many-objectiveoptimization problems cause trouble for evolutionary algorithms. This paperproposes a reference vector based algorithm which uses two interacting enginesto adapt the reference vectors and to evolve the population towards the truePareto Front (PF) s.t. the reference vectors are always evenly distributedwithin the current PF to provide appropriate guidance for selection. Thecurrent PF is tracked by maintaining an archive of undominated individuals, andadaptation of reference vectors is conducted with the help of another archivethat contains layers of reference vectors corresponding to different density.Experimental results show the expected characteristics and competitiveperformance of the proposed algorithm TEEA.

 

Quick Read (beta)

A Reference Vector based Many-Objective Evolutionary Algorithm with Feasibility-aware Adaptation

Mingde Zhao Hongwei Ge [email protected] Kai Zhang Yaqing Hou College of Computer Science and Technology, Dalian University of Technology, China Montréal Institute of Learning Algorithms (Mila), Canada School of Computer Science, McGill University, Canada Department of Computer Science and Engineering, Washington University in St. Louis, USA School of Computer Science and Engineering, Nanyang Technological University, Singapore
Abstract

The infeasible parts of the objective space in difficult many-objective optimization problems cause trouble for evolutionary algorithms. This paper proposes a reference vector based algorithm which uses two interacting engines to adapt the reference vectors and to evolve the population towards the true Pareto Front (PF) s.t. the reference vectors are always evenly distributed within the current PF to provide appropriate guidance for selection. The current PF is tracked by maintaining an archive of undominated individuals, and adaptation of reference vectors is conducted with the help of another archive that contains layers of reference vectors corresponding to different density. Experimental results show the expected characteristics and competitive performance of the proposed algorithm TEEA.

keywords:
many-objective optimization, reference vector, feasible objective space
journal: Applied Soft Computing

1 Introduction

Many-objective Optimization Problems (MaOPs) are one of the biggest open-problems in applied soft computing. The complexities of the real-world problems give rise to the class of heuristic algorithms with population features, which are often recognized as Many-Objective Evolutionary Algorithms (MaOEAs) Louafi et al. (2017); Li et al. (2017); Ferreira et al. (2017); Du et al. (2018).

The difficulty of MaOPs increase dramatically with the increment of the dimensionality of the objective space, i.e. the number of objectives Purshouse and Fleming (2007); Wang et al. (2017). The deterioration in performance inspires new heuristics. Recent literature reveals the trend of combining different strategies to achieve synergistic performance, since sticking to one direction alone leads to the overfitting on problems of certain types and less robustness.

In this paper, we propose a reference vector based algorithm, yet hybridizing the ideas from Pareto dominance relations. The proposed algorithm TEEA focuses on the scenario of the objective space being haunted by infeasible parts that change the distribution of the PF while it is evolving, which is particularly meaningful for difficult problems where proximity of population to the true PF should not be expected. TEEA is with a selection engine and an adaptation engine, capable of adapting to the various characteristics of the Feasible Objective Space (FOS) in-time, which can be seen as a generalization of the adaptive reference vector based algorithms aiming to adapt the reference vectors for the true PF only.

The rest of this paper is organized as follows: Section 2 gives the preliminaries of researches on the MaOPs including the basics of reference vector based MaOPs, the focus and goal of this paper, the related works and then provides the ideas that yield the contribution of this work. Section 3 gives the details of the proposed algorithm TEEA. Section 4 presents the experimental studies, which include the investigations of the characteristics and the comparative results with other state-of-the-art algorithms on a standard benchmark suite.

2 Preliminaries

Reference vector based approaches are one of the most popular families of MaOEAs for their many attractive properties on problems with modest number of objectives (e.g. M8): applicability onto the complicated scenarios Chugh et al. (2018), human preference articulation Cheng et al. (2016) and high efficiency, etc.. In this section, we will first focus on giving the basic knowledge about the problem that we want to address in this paper, then discuss the related works, as well as our ideas of motivation and inspiration.

2.1 Basics & Challenge of Infeasibility

The goal for MaOEAs is to find a solution set whose objective value vectors in the objective space constitute a “good” representative set of the true PF. To be a good representative set, the finite individuals11 1 we overload the term “individual” to call one solution in the solution space and the corresponding objective vector in the objective space. in the set should well present the spatial properties of the infinite true PF hypersurface. With such recognition, researchers summarize the goal of MaOEAs into obtaining proximity and diversity simultaneously, where proximity ensures that the individuals are close to the true PF and diversity ensures that the limited individuals can well represent the distribution of the true PF.

Without the loss of generality, assuming that the goal of each objective is minimization and the objectives are strictly positive, here we introduce some basic knowledge about reference vector based MaOEAs, which is to be used in the later sections.

Definition 2.1 (Feasibility)

Given a many-objective function 𝐟:SR+M, where SRD is the solution space defined for the problem (domain of f), D is the dimensionality of search space and M is the dimensionality of the objective space, the feasible objective space is defined as

𝒪{𝒐|𝒐=𝒇(𝒙),𝒙𝒮}+M

Also, an objective vector 𝐨 is said to be infeasible if 𝐨O.

Definition 2.2 (Pareto-dominate)

Given an MaOP and two individuals 𝐚,𝐛R+M, 𝐚 is said to Pareto-dominate (dominate) 𝐛 if and only if 𝐚 is less than or equal to 𝐛 element-wise and with at least one element strictly less than that of 𝐛, i.e.

𝒂<𝒃 𝒊𝒇𝒇 i{1,,M},aibi and j{1,,M},aj<bj
Definition 2.3 (Pareto Front)

Given the many-objective function 𝐟:SO to be minimized and the set P+ of individuals ever found in the optimization process, the Pareto Front (PF, current PF) PF(P+) is a subset of P+ that dominates all other elements in P+, i.e.

𝒐PF(P+) and 𝒑P+-PF(P+),𝒐<𝒑

People also overload the term “current PF” for the imaginary hypersurface that the current PF locates on.

Definition 2.4 (True Pareto Front)

Given the many-objective function 𝐟:SO to be minimized, the true Pareto Front (true PF) truePF(𝐟) is a subset of O that dominates all other elements in O, i.e.

truePF(𝒇)𝒪,𝒐PFtruePF(𝒇) and 𝒐𝒪-truePF(𝒇),𝒐PF<𝒐
Definition 2.5 (Infeasibility of FOS, Full FOS & Partial FOS)

Given MaOP 𝐟:SO and truePF(𝐟), the trivial infeasible part Itrivial of the objective space is defined as the set of points in R+M that could dominate at least one point in the true PF, i.e.

Itrivial𝒐|𝒐PFtruePF(𝒇),𝒐<𝒐PF

Additionally, if RM=Itrivial+O, the problem 𝐟 is said to have a full feasible objective space (full FOS); Else, the problem is said to have non-trivial infeasible parts in the objective space or simply a partial feasible objective space (partial FOS).

Problems with full FOS are also recognized as the simplex-like problems Cheng et al. (2016); Tian et al. (2017). A full FOS occupies the space of +M except for the trivial infeasible part where points dominate the true PF. A partial FOS has more infeasible parts where points are dominated by the true PF.

Definition 2.6 (Reference Vectors & Reference Points)

Given the many-objective function 𝐟:SO, a set Z of reference vectors is a finite subset of R+M.

For any reference vector 𝐳, the unit L1-norm vector 𝐳^𝐳/𝐳1 is said to be the corresponding reference point.

Reference vectors and reference points have injective correspondence. We call the scaled reference vectors reference points for they locate on the subspace of the unit hyperplane with L1-norm 1.

Definition 2.7 (Activation of Reference Vector, Association & Adjacent Space)

Given a set of reference vectors ZRM and an individual 𝐨inO, a reference vector 𝐳Z is said to be activated by 𝐨 if

(𝒐,𝒛)=argmin𝒛Z(𝒐,𝒛)

where (𝐨,𝐳)arccos(𝐨T𝐳𝐨2𝐳2) is the angle between 𝐨 and 𝐳. Also, 𝐨 is said to be attached or associated to 𝐳 if 𝐳 is activated by 𝐨. The set of all the possible points that can activate 𝐳 given Z is recognized as the adjacent space of 𝐳 given Z.

If a reference vector is activated, there must be individuals nearby the direction the reference vector points towards. An MaOP with full FOS is expected to activate all reference vectors if they are roughly evenly distributed towards every direction in +M.

Definition 2.8 (Infeasibility of PF, Full PF & Partial PF)

Given MaOP 𝐟:SO and truePF(𝐟), the MaOP is said to have a fully-covering PF (or full PF) if

𝒐+M,𝒐PFtruePF(𝒇) and c,𝒐=c𝒐PF

Else, the problem is said to have a partially-covering PF (or partial PF) .

Theorem 2.1

MaOPs with partial true PFs must have partial FOS.

Proof 2.1

Partial PFs cannot dominate all points in a FOS with only the trivial infeasible part. Thus the FOS must have a non-trivial infeasible part.

An MaOP with full FOS must have full PF, but the converse is not true.

Reference vectors are often used as the guidelines for evolutionary selection. The individuals associated to a reference vector will be selected using some criterion that penalizes the deviation from the reference vector’s direction for diversity and simultaneously awards the individuals with shorter projection lengths on the reference vector’s direction for proximity, given the reference vectors are correctly set. However, for problems with partial FOS, fixed reference vectors are no longer good guidelines, as indicated in Fig. 1.

(a) Ideal guidelines for selection
(b) Reference vectors should have approximated the guidelines
Figure 1: Reality for reference vectors being the guidelines for selection. The FOS is shaded in light blue and the true PF is outlined in dark blue, with the reference vectors drawn as grey quivers. In (a), the best possible guidelines that segment the FOS evenly is given using prior knowledge about the problem; In (b), reference vectors are uniformly generated using the traditional methods, which have huge space left for adaptation.

Reference vector based algorithms with a set of fixed reference vectors are designed for full FOS problems. Previous works show that fixed reference vectors, if not set with prior knowledge, are misleading for the selection of individuals toward diversity for problems with partial true PFs Zhao et al. (2018); Ge et al. (2018). But what do we do if the distribution of the current PF is changed due to the encountered non-trivial infeasible parts of the objective space? Can we adapt the reference vector in-time s.t. they approximate the ideal guidelines at the intersection points of the guidelines with the current PF?

2.2 Related Works & Motivations

There are a lot of existing works for adjusting the reference vectors with the goal of handling irregular PFs.

Some works focus on dealing with the curvature of true PF, MOEA/D-AWA deletes the overcrowded reference vectors and inserting new reference vectors in the sparser spaces Qi et al. (2014); RVEA in Cheng et al. (2016) tilts the unit hyperplane to cater to the distribution of the individuals to address the problems caused by the different magnitudes of objectives; Wu et al. (2018) proposes to use Gaussian process to fit the hypersurface of the current PF for the distribution of reference vectors.

There are also works focusing on the infeasibility of the true PF, i.e. partial true PFs Ge et al. (2018): Approaches like A-NSGA-III Jain and Deb (2014) and RVEA* Cheng et al. (2016) employ the strategy of inserting extra reference lines into crowded areas with the risk of the disturbance of uniformity of the distribution of reference vectors. To deal with the disturbance of uniformity, Zhao and Ge et al. proposed a series of reference vector algorithms with interactive components capable of learning the complex effective areas (scaled projection of the true PF onto the unit hyperplane with L1-norm 1) if the problems are easy enough to converge early.

These works all share one underlying assumption: the MaOP is easy enough for selection engines to obtain population at the vicinity of the true PF. But for the problems with infeasible parts in the objective space and high difficulty, e.g. problems with large dimensionality of the solution space and partial FOS, such assumption will not hold. The inadaptability of the existing algorithms on these problem motivate us to address such problem. In 2013, Wang et al. contributed a prototype algorithm PICEA-w Wang et al. (2013) which co-evolves the set of reference vectors alongside the population for a similar purpose, though not tested and still have the problems for the disturbance of uniformity and so on. Can we extend the adaptation of reference vectors to a in-time manner s.t. they might appropriately serve as the right guidelines of selection for the evolving population? The answer is yes - we found a viable solution in the interacting component framework.

2.3 Inspiration

Here we propose a general idea to the in-time adaptability to the population in the FOS: if we are able to track the current PF of the evolution reasonably precisely, we can use the tracked PF to somehow adapt the reference vectors, in which case we can approximate the ideal selection guidelines, indicated in Fig. 1, using the reference vectors!

The problems left for implementing such idea are mainly two:

  1. 1.

    How to track the current PF in an effective and efficient way?

  2. 2.

    How to adjust the references using the tracked PF?

Our solution to the first problem, i.e. tracking the current PF, is not a parametric model since such model may accidently do interpolation and extrapolation within the infeasible spaces. Thus, we use something simpler: since individuals found during the evolution process never violate the feasibility of the FOS, we just buffer them in an archive! Such idea has been exploited for similar purposes in Praditwong and Yao (2006); Wang et al. (2015).

The second problem is way more tricker. Our solution, the main novelty of this paper, is a delicate mechanism constituted of two subroutines that propagate the distribution of the current PFs onto layers of reference vectors corresponding to different densities. Such mechanism can effectively adapt to shrinking or expanding cross-sections of the current PF and the FOS without with marginal suffering of the disturbance of uniformity.

In this paper, we provide an algorithm TEEA that contains two assisting archives and two interacting engines, aiming to provide versatility for feasibility change of the FOS, as demonstrated in Fig. 2.

3 TEEA


Figure 2: Interaction between the two engines with the participation of the two archives

3.1 Individual Archive

The current PF is constituted by the objective vectors of the individuals that have never been dominated by the individuals ever found in the optimization process. The information of the current PF is necessary to identify and adapt to the changes in the FOS. However, since typical selection methods for population only leaves N individuals, parts of the information about the current PF will be potentially lost. The individuals that have not been dominated should be archived and maintained as another population (apart from the “population” we use to generate offsprings) to keep such information, but not in an ever-increasing manner: we should archive a reasonable amount of individuals that have not been dominated which could well reflect the spatial distribution of the current PF. This calls a proper elimination method for the most undesirable archived individuals. Thus we seek to employ cascade clustering Zhao et al. (2018); Ge et al. (2018), a population selection strategy that keeps the distribution of the population as complete as possible, for the update of such archive, which we named the Individual Archive (IA). Whenever new individuals are found, we use cascade clustering to find and archive the cluster centers (the best individual for the adjacent spaces of reference vectors w.r.t. the criterion of cascade clustering) among the population constituted of newly evolved individuals and the archived individuals. Different to employing cascade clustering for population selection, since IA maintenance only seeks to track the distribution of PF, we only have to buffer the cluster centers that reflect the distribution of PF and discard the rest. This means that IA keeps at least |Z| individuals, where |Z| is the number of participating reference vectors.

Assuming the number of reference vectors is a moderate function of N, i.e. |Z|=𝕆(N), the maintenance of IA is running at the complexity level 𝒪(MN2), which depends on the complexity of cascade clustering.

3.2 RA: Reference Archive

The distribution of the reference vectors that participate the selection should change in-time to cater with the shape of the current PF. We propose another archive for the reference vectors, which we named the Reference Archive (RA). Since the shape of current PF changes, the local densities of reference vectors should also change. Thus we have designed RA to be hierarchical: each layer inside RA only contains reference vectors generated using the same density; layers with different densities can be combined to achieve higher density. The details of updating the RA will be introduced with the adaptation engine of the reference vectors.

\KwInAR (the current RA), i \KwOutZ,𝒂,𝒃 (tuple of reference vectors Z, association vector 𝒂, status vector 𝒃) //get the current density of reference points in RA
d=get_density(AR.li-1);\[email protected]
//generate a set of reference vectors with higher density
Z=uniform_points(d);\[email protected]
//eliminate redundant reference vectors
Z= \ForliAR Z=ZAR.li.Z; Z=Z-Z;//set difference operation
//boolean vector representing if the corresponding reference vector is enabled, all initialized as disabled
𝒃=𝟎|Z|×1;
//index vector associating the reference vectors in the new layer with the ones archived in RA. “associateto(P, Q)” associates 𝒑P with the nearest 𝒒Q
𝒂 = associateto(Z, Z);
\algorithmcfname 1 construct i-th layer of RA: new_layer(AR, i)

3.3 Selection Engine

This paper provides no novelty in neither the generation of offsprings nor selection of population. We employ the selection method of cascade clustering proposed and ameliorated in Zhao et al. (2018); Ge et al. (2018). Briefly speaking, cascade clustering selects a population that is evenly spread w.r.t. the given reference vectors in a cascade style to achieve proximity and diversity. It has shown state-of-the-art effectiveness and efficiency when compared to state-of-the-art reference vector based selection methods. Also, it has flexible interfaces for the adaptations of reference vectors, which suits the need of this paper. We used this to maintain the IA, now we also use this to select the population. Using cascade clustering both to maintain IA and to select population, we can make sure that the reference vectors that could be activated will always keep at least one associated individual, without the fear of losing them in the population selection. Surprisingly, we can also prove that using cascade clustering to do two tasks sequentially is equivalent to doing them two at the same time. The combined selection and update of the IA is demonstrated in Fig. 3.


Figure 3: Combining two cascade clustering processes into one: the equivalent process

The overall runtime complexity for the process is O(MN2), assuming |Z|=𝒪(N). We provide the pseudo code of the selection based on cascade clustering in Alg. 2, without diving into its details since we made no amelioration. For more details, refer to Ge et al. (2018).

\KwInZ (set of reference vectors), P (potential population), N (population size for the next generation) \KwOutP (population for the next generation), I (indices of active reference vectors), Q (cluster centers) //frontier individual identification
[F,NF] frontier_identification(P)\[email protected]
//attach frontiers to reference vectors, return the clusters
Cattach(F, Z, ’point2vector’)\[email protected]
Icluster2indices(C)\[email protected]
\Foreach cluster ciC \Foreach frontier 𝒇𝒋 in ci PDM(𝒇𝒋)mean(𝒇𝒋)+sin(𝒛𝒊,𝒇𝒋); ci.F sort(ci.F, PDM(ci.F), ascend)\[email protected]
Pick out ci.fj with the smallest PDM as ci.center; //attach non-frontiers to clusters
Cattach(NF, C, ’point2center’)\[email protected]
\Foreach cluster ciC ci.NF sort(ci.NF, d(ci.NF,ci.center), ascend)\[email protected]
create selection queue ci.Sci.F,ci.NF\[email protected]
//round-robin picking
\While|P|<N PPPop(ci.S)\[email protected]
imod(i,|C|)+1; //specify the cluster centers
Q; \Foreach cluster ciC QQ{ci.center};
\algorithmcfname 2 Cascade Clustering: P, I, Q = CC(P, Z, N)

Cascade clustering additionally returns the indices of the active reference vectors, which correspond to the reference vectors attached with nondominated individuals, telling the adaptation where the reference vector should be i.e. the distribution of the current PF.

3.4 Adaptation Engine

Reference vectors serve as guidelines for the selection of individuals towards proximity and diversity in the objective space. For problems with full FOS, simple reference vectors generated could serve as the ideal guidelines. However, for objective spaces with unfeasible parts, the ideal guidelines become much more complicated and unadjusted reference vectors are far from ideal guidelines, as demonstrated in Fig. 1 (a). For these objective spaces, sticking to the fixed reference vectors hurts diversity and potentially proximity as well. It is needed to adjust the reference vectors in-time s.t. they are always good approximations to vectors towards the intersection points of the guidelines and the current PF, as illustrated in Fig. 4.

One possible solution to such approximation for guidelines of arbitrarily but continuously shrinking and expanding FOS is evenly distributing sufficient number of reference vectors that intersect with the current PF22 2 The conjecture is metaphoric: the current PF is assumed to be a hypersurface but actually just a set of points in the objective space; Intersection points of reference vectors with the current PF means the orthogonal projection of the objective vectors of the individuals onto the associated reference vectors.. We can easily come up with a brute force method: buffer sets of evenly distributed reference vectors with different generation densities. After each update of the IA, starting from the set of reference vectors corresponding to the lowest density33 3 Any generation density corresponding to generating less than N reference vectors should be excluded from the possible selections of generation density., see how many reference vectors intersect with the current PF and continue to do so until the number of active reference vectors is around N.

The problem with this brute force method is that it is too costly: each adaptation costs approximately 𝒪(MN3). We seek to contribute a smooth and gradual approximation to such brute force method with significantly less computational cost for adaptations. The idea is simple: for the moment of adjustments, if the number of intersection points of reference vectors and the current PF is below N, perhaps due to the shrinkage or insufficient covering of the of current PF, we initialize the “shrink” subrountine: generate a set of reference points with a higher density and save this set in the top layer of RA but only enable those within the adjacent spaces of all currently active reference points (disabled points will not participate in the selection). Then we get all the enabled reference vectors stored in every layer of RA to as the participating set of reference vectors. The set addition of evenly distributed reference points yields approximately evenly distributed reference vectors that will potentially intersect the current PF; If the number of intersection points is greater than N, which is perhaps due to the expansion of the distribution of current PF, we initialize the “expand” subrountine: enable the reference points in the lower layers that are within the adjacent spaces of the active reference points in the densest layer and then delete the densest layer44 4 In implementation, simply disable it.

Precalculating the layers of reference vectors and the angles between them reduces the complexity of this process to 𝒪(MN2), which is exactly what we aim for: significantly lower than the brute force method of 𝒪(MN3).

(a) fixed reference vectors
(b) before RA adjustment
(c) after RA adjustment
Figure 4: The demonstration for the shrinking of the current PF due to the characteristics of the FOS. The FOS is colored in light blue and the PF is outlined in dark blue. The reference vectors located at the edge of the FOS has inactive parts. Thus the distribution is not always appropriate.

In the RA, the reference vectors will be uniformly increased and uniformly reduced, thus the disturbance of the uniformity of the reference vectors is relatively low. Visualization of simulated initializations of adaptation is presented in Fig. 5, with the pseudocode given in Alg. 3.

\KwInAR (RA), AI (IA), Zactive (active reference vectors), N (global population size), θ (tolerance ratio) \KwOutAR (updated RA), Z (participating reference vectors) //subroutine ‘‘SHRINK’’
\If|Zactive|<(1-θ)N //generate a new layer, retrive the corresponding Z,`𝒂,`𝒃
Z,𝒂,𝒃 = new_layer(AR);//algorithm 1
AR=AR{Z,𝒂,𝒃};
//enable the newly generated reference vectors if they are associated with currently active ones in RA
\For𝒛iZ \IfAR[ai]Zactive //the reference vector which 𝒛i is associated to is in Zactive
bi=1;//enable
//subroutine ‘‘EXPAND’’
\If|Zactive|>(1+θ)N //extract all layers of RA, implementation only requires indices
Z|AR|,𝒂|AR|,𝒃|AR| = depack(AR.l|AR|);
Z|AR|-1,𝒂|AR|-1,𝒃|AR|-1 = depack(AR.l|AR|-1);
;
//back-propagate the distribution of the current PF towards the lower layers
\Fori1,,|AR| 𝒂|AR|-i+1 = associateto(Z|AR|-i+1, Z|AR|);//back-associate the points in the lower layer to the last layer, s.t. we can use these associations to enable points in the lower layers
\For𝒛|AR|-i+1iZ|AR|-i+1 \IfAR[ai]ZactiveZ|AR| b|AR|-i+1i=1;//enable if the point in the lower layer is back-associated to the active points in the last layer AR=AR-AR.l|AR|; //symbolic, disabling is wiser //accumulate enabled reference vectors as the set of participating reference vectors
Z=;
\ForliAR Zi,𝒂,𝒃 = depack(l);
\For𝒛jZi \Ifbj=1 Z=Z{𝒛j};
\algorithmcfname 3 Adaptation
(a) Initialization of the current PF (population) and the reference vectors
(b) To shrink: insufficient active reference vectors
(c) Shrinked: enabled points on the second layer associating to previously active points. Active reference vectors still insufficient
(d) Second shrink: appropriate number of active reference vectors
(e) To expand: too many active reference vectors
(f) Expanded: appropriate number of active reference vectors
Figure 5: Demonstration of the subroutines “shrink” and “expand” on an imaginary problem with non-trivial infeasible parts in the FOS: (a). We assume that the population is initialized far away from the true PF, constituting a hyperplane that is roughly parallel to the unit hyperplane. Also, we assume that the currentPF evolves gradually towards the true PF, representing a scenario in which optimization difficulties on the objectives are isotropic; (b). The number of active reference vectors are not enough, thus the subroutine “shrink” is activated: the reference vectors in second layer of RA that are associated with the currently activated reference vectors are enabled; Note that the reason why the reference points in the layers are non-overlapping is that the reference vectors that can be found in previous layers are excluded; (c). After the first shrink, there are more active reference vectors, but still not enough. Need more shrinks; (d). After the second shrink, the number of active reference vectors are appropriate, which means it is roughly equivalent to N=5 in the demonstration; (e). The distribution of current PF expands. The number of active reference vectors is too high and thus the subroutine “expand” will be triggered: the distribution of the active reference vectors (the distribution of the central projection of the current PF onto the unit hyperplane) will be back-propagated from the densest layer to the previous layers via the computed back-associations; (f). After the expansion, there are again appropriate number of reference vectors intersecting the current PF roughly evenly. Notice that the forward- and back-propagation both enables points located on the edge of the central projection s.t. the change of the current PF can be detected using the number of activated reference vectors.

3.5 Combining All Together

We now give the proposed algorithm TEEA which combines the 4 proposed components. At the beginning, to initialize, N individual solutions are randomly generated and AR is initialized with N reference points uniformly generated on the unit hyperplane as the base layer. Then, the main cycle loops:

  1. 1.

    Evolve the old population to get the N offsprings by employing a certain heuristic;

  2. 2.

    Pick N-sized population and new individual archive out of the union of the last population, the newly evolved offsprings and the individuals in the individual archive using cascade clustering. Additionally, provide the activity of reference vectors.

  3. 3.

    Check the stability of the activity of the reference vectors, report whether it is appropriate to adjust the distribution of the reference vectors in the RA according to IA. If not continue;

  4. 4.

    Initialize adaptation, which may trigger one of the two subroutines. The set of participating reference vectors will be provided to the selection engine for selection.

When the termination criteria are satisfied, we pick final population using the individuals buffered in the IA using cascade clustering. Note that in order to make TEEA more stable, i.e. reduce the number of unnecessary adaptation, we introduce a mechanism of checking the stability of activity: if the activity of each reference vector does not change for w generations, the mechanism reports stable else unstable; Else, the adaptation will not be triggered.

4 Experimental Studies

This section gives the analyses for the effectiveness of TEEA on benchmark problems, including the validation of the proposed components, comparison with the state-of-the-art algorithms, etc..

4.1 Settings

The settings of the experiments are identical to those for the CEC’2018 MaOP competition Cheng et al. (2017), which uses the MaF benchmark suite, where 15 scalable many-objective benchmark functions. Among them, 8 cases are with infeasible parts in the objective spaces (MaF1, MaF2, MaF4, MaF6 - MaF9, MaF15), 7 cases are with full FOS (MaF3, MaF5, MaF10 - MaF14). Specifically, MaF8 and MaF9 are with narrow FOS, MaF6 is with extremely narrow (degenerate) FOS, MaF3 and MaF13 are hard to converge, MaF7 is with disconnected FOS and PF, MaF14 and MaF15 are with large-scale solution space. Each benchmark function is scaled to three separate test cases, in which M=5, M=10 or M=15. There is an injective mapping between M and D for each function, and the number of maximum evaluations for each algorithm is set to be D(M)×104. All algorithms are restrict to using the population size of N=240, as well as the classical simulated binary crossover and polynomial mutation optimizer.

Results given in this section are averaged over 20 independent runs on identical platforms with MATLAB R2019a. The source code55 5 The MATLAB source code will be published in Mingde Zhao’s github: https://github.com/PwnerHarry/ of TEEA is implemented with PlatEMO 2.0 Tian et al. (2017).

We evaluate the quality of the obtained populations using the Inverse Generational Distance based on L2 norm (L2-IGD, often abbreviated as IGD), a problem dependent evaluation criterion of both proximity and diversity Czyzżak and Jaszkiewicz (1998). IGD calculates the average minimum distance from the sample points on the true PF to the points of the population. The smaller the IGD, the better the proximity and diversity.

4.2 Hyperparameters

Here we want to determine a fixed set of hyperparameters for all the following experiments, instead of overfitting them for each problem, for fairness of comparison. Also, we want to check if the proposed algorithm TEEA is sensitive to these hyperparameters. There are two hyperparameters in TEEA, the window size w to suggest adaptation moments and tolerance ratio θ for the sensitivity of adaptation w.r.t. the reference vector activities.

Note that when the number of objectives is large, θ becomes sensitive for the difficulty in generating uniform reference vectors. When the number of evaluations given is low, w becomes sensitive in a problem-specific way for it controls the trade-off between adaptation accuracy (the accuracy of the moments of the adaptation being initialized) and number of adaptations before finish. We select the test case DTLZ7 Deb et al. (2005) specifically since it has fractal FOS and intermediate number of evaluations, which is likely an expected scenario for the experiments to come.

For the sake of fair comparison, the set of problems we use for hyperparameter analysis does not have intersection with the experiments later. The results are given in Tab. 1. Observing from the results, we can find an approximate interval for the hyperparameters to achieve similarly good performance. We choose w=20 and θ=0.2 since the change of performance around this point is relatively modest and also for its good performance.

Table 1: Hyperparameter Pairs on DTLZ7 (M=5)
w\θ 0.05 0.1 0.15 0.2 0.25
10 6.63e-1 4.85e-1 3.44e-1 3.00e-1 3.25e-1
15 6.52e-1 4.65e-1 3.02e-1 2.93e-1 3.19e-1
20 4.75e-1 4.23e-1 2.80e-1 2.85e-1 3.13e-1
25 3.18e-1 2.99e-1 2.78e-1 2.78e-1 3.02e-1
50 5.66e-1 4.85e-1 4.21e-1 4.33e-1 4.65e-1
Color indicators are added to for assisting the reading of the results. The greener the better performance, the redder the worse.
Averaged from 20 runs on DTLZ7 with M=5, D=24 and FEs=2.4e5

4.3 Comparative Tests on MaF Benchmark Suite

We try to analyze the characteristics of TEEA by comparing it with state-of-the-art algorithms including CVEA3 Yuan et al. (2018), BCE-IBEA Li et al. (2016), fastCAR Zhao et al. (2018), CLIA Ge et al. (2018), AR-MOEA Tian et al. (2017), A-NSGA-III Jain and Deb (2014) and RVEA* Cheng et al. (2016). Among them, fastCAR, CLIA, A-NSGA-III and RVEA* are reference vector based with adaptation methods. The characteristics of the compared algorithms are presented in Tab. 2.

Table 2: Details for Compared Algorithms
Algorithms Comments
TEEA The algorithm proposed in this paper.
CVEA3 Cost value based MaOEA 3, best-performing (1st) participants of CEC’2018 MaOP competition Yuan et al. (2018). The evolution operator has been set to default for fair comparison.
BCE-IBEA Bi-criterion variant of IBEA Zitzler and Künzli (2004), one of the most best-performing (3rd) participants of CEC’2018 MaOP competition Li et al. (2016).
fastCAR Reference vector based MaOEA with periodic adaptation based on margin learning, one of the most best-performing (4th) participants of CEC’2018 MaOP competition Zhao et al. (2018).
CLIA Improvement upon fastCAR. Reference vector based MaOEA with incremental learning of the PF via component interactions Ge et al. (2018).
AR-MOEA An indicator based MaOEA based on adaptive reference points Tian et al. (2017).
A-NSGA-III NSGA-III with reference vector adaptation Jain and Deb (2014).
RVEA* RVEA with reference vector adaptation Cheng et al. (2016).

The averaged IGD results of the algorithms are presented in Tab. 3, with additional statistical information including Friedman tests and paired t-tests, etc.. The Friedman tests suggest that with high confidence we can say that TEEA the rankings of the algorithms are effective, therefore telling that TEEA achieves the overall performance. The paired t-tests suggest that TEEA beats several compared state-of-the-art algorithms and achieves similar performance with others. The Friedman tests, t-tests as well as straight preservations from the result table all suggest that TEEA achieves undeniably competitive performance on such set of complex benchmarks problems.

Table 3: Averaged IGD Results on MaF Suite
Problem M TEEA CVEA3 BCE-IBEA fastCAR CLIA AR-MOEA A-NSGA-III RVEA*
Mean Std Mean Std Mean Std Mean Std Mean Std Mean Std Mean Std Mean Std
F1 5 1.08e-1 7.25e-4 1.13e-1 1.26e-3 1.08e-1 2.70e-4 1.12e-1 1.20e-2 1.10e-1 7.55e-4 1.41e-1 1.89e-3 1.58e-1 1.45e-2 1.46e-1 8.94e-3
10 2.33e-1 7.45e-3 2.52e-1 4.42e-3 2.37e-1 8.84e-3 2.96e-1 4.67e-7 2.39e-1 3.09e-3 2.52e-1 1.29e-3 2.73e-1 1.11e-2 3.47e-1 2.86e-2
15 2.67e-1 8.85e-3 3.28e-1 4.51e-3 2.95e-1 3.09e-3 3.24e-1 8.07e-12 2.63e-1 2.81e-3 2.88e-1 1.25e-2 3.15e-1 6.29e-3 4.32e-1 3.34e-2
F2 5 9.65e-2 2.85e-3 9.35e-2 1.93e-3 8.72e-2 1.16e-3 1.08e-1 1.86e-1 9.60e-2 1.67e-3 1.11e-1 1.03e-3 1.06e-1 1.80e-3 1.09e-1 1.95e-3
10 1.53e-1 2.99e-3 1.76e-1 4.34e-3 1.73e-1 5.04e-3 2.65e-1 2.08e-1 1.60e-1 1.62e-3 2.06e-1 8.73e-3 2.32e-1 2.45e-2 2.71e-1 1.19e-2
15 1.64e-1 4.35e-3 1.98e-1 2.26e-2 2.86e-1 8.86e-3 2.57e-1 1.18e-1 1.66e-1 2.52e-3 2.26e-1 1.02e-2 2.20e-1 1.28e-2 2.84e-1 2.12e-2
F3 5 6.82e-2 1.90e-3 5.93e-2 7.58e-4 1.64e-1 4.48e-2 6.43e-2 9.99e-1 6.36e-2 1.12e-3 9.71e-2 1.74e-3 8.30e-2 2.45e-2 7.68e-2 2.65e-3
10 8.36e-2 2.20e-3 1.69e-1 1.19e-2 6.50e-1 3.02e-1 8.31e-2 1.00e0 7.45e-2 2.60e-3 4.08e0 1.18e1 5.61e6 2.01e7 7.73e0 8.66e0
15 9.11e-2 1.27e-3 1.88e-1 3.38e-2 5.45e-1 1.27e-1 8.97e-2 1.00e0 8.56e-2 6.07e-4 5.48e1 1.51e2 2.35e2 6.16e2 2.66e-1 6.95e-1
F4 5 1.77e0 3.25e-2 1.70e0 2.24e-2 3.07e0 1.03e0 2.01e0 9.79e-2 1.90e0 6.47e-2 2.45e0 9.25e-2 2.35e0 1.49e-1 2.09e0 9.62e-2
10 7.50e1 2.25e0 5.10e1 2.12e0 9.34e1 2.53e0 7.81e1 3.13e-6 1.00e2 2.36e0 9.68e1 6.42e0 9.74e1 6.05e0 7.97e1 7.62e0
15 2.80e3 1.23e2 1.49e3 1.58e1 2.74e3 1.02e3 3.24e3 1.10e-11 3.62e3 1.62e1 4.01e3 5.26e2 3.74e3 3.39e2 2.77e3 2.46e2
F5 5 1.96e0 9.55e-3 2.28e0 1.10e0 1.75e0 3.35e-2 1.97e0 8.12e-1 1.93e0 8.36e-3 2.39e0 8.01e-1 1.97e0 4.90e-3 2.26e0 8.41e-1
10 8.82e1 1.19e0 6.04e1 1.99e1 4.88e1 1.70e0 8.82e1 9.68e-1 7.80e1 1.03e0 9.87e1 4.11e0 8.08e1 4.10e0 7.03e1 1.39e1
15 2.35e3 3.07e2 1.38e3 1.15e3 1.37e3 5.75e1 2.43e3 9.90e-1 2.29e3 2.05e2 3.31e3 4.44e2 2.32e3 6.54e2 2.54e3 7.97e2
F6 5 1.82e-3 1.43e-4 2.20e-3 6.87e-5 2.10e-3 7.29e-6 6.51e-3 1.29e-1 2.40e-3 3.32e-4 4.22e-3 5.63e-5 1.82e-2 1.14e-2 2.37e-2 4.98e-3
10 2.75e-3 6.25e-4 1.99e-3 4.35e-5 4.57e-1 3.57e-1 2.49e-2 9.91e-2 3.40e-2 1.28e-4 1.08e-1 1.40e-1 3.02e0 4.22e0 4.39e-2 4.22e-2
15 1.03e-2 9.25e-3 2.43e-2 8.09e-2 7.46e-1 7.25e-3 6.56e-2 9.24e-2 4.24e-2 1.50e-2 2.38e-1 1.25e-1 1.55e1 1.37e1 1.68e-1 1.91e-1
F7 5 2.78e-1 3.25e-3 2.89e-1 1.51e-1 2.27e-1 4.63e-3 2.91e-1 2.57e-1 2.70e-1 6.11e-3 3.29e-1 7.55e-3 2.81e-1 1.68e-2 2.11e-1 3.92e-3
10 8.35e-1 1.03e-1 8.49e-1 3.44e-3 8.28e-1 5.58e-2 1.60e0 1.61e-1 8.53e-1 3.69e-2 1.57e0 9.55e-2 1.18e0 9.98e-2 9.09e-1 1.36e-1
15 1.80e0 3.40e-1 1.64e0 4.62e-2 2.16e0 1.77e-1 1.48e1 1.91e-2 2.16e0 1.67e-1 4.06e0 6.97e-1 3.17e0 4.49e-1 1.77e0 4.43e-1
F8 5 9.17e-2 4.32e-3 8.11e-2 7.52e-3 7.35e-2 6.34e-4 9.86e-2 1.22e-1 7.88e-2 2.65e-3 1.29e-1 4.45e-3 1.50e-1 1.74e-2 2.67e-1 5.42e-2
10 1.35e-1 4.99e-3 1.25e-1 8.02e-3 1.09e-1 5.52e-4 2.98e-1 6.80e-3 1.46e-1 4.02e-3 1.37e-1 4.17e-3 3.28e-1 6.06e-2 9.74e-1 1.68e-1
15 1.93e-1 1.87e-2 1.60e-1 1.06e-2 1.32e-1 7.36e-4 6.05e-1 5.66e-5 1.90e-1 9.46e-3 1.73e-1 6.24e-3 3.61e-1 6.02e-2 1.45e0 3.20e-1
F9 5 9.17e-2 6.35e-3 9.62e-2 1.53e-2 2.76e-1 9.57e-2 9.34e-2 3.16e-1 7.87e-2 5.90e-3 1.27e-1 8.24e-3 2.43e-1 1.13e-1 1.87e-1 2.98e-2
10 1.85e-1 1.27e-2 1.98e-1 7.86e-2 2.49e0 3.42e-2 2.32e-1 1.35e-2 3.38e-1 6.58e-2 1.77e-1 8.20e-3 5.98e-1 2.14e-1 9.31e-1 2.16e-1
15 2.25e-1 6.25e-2 1.76e-1 1.21e-1 3.34e0 5.50e0 6.40e-1 2.67e-4 3.48e-1 1.45e-1 1.51e-1 5.40e-3 2.65e0 4.63e0 1.14e0 2.02e-1
F10 5 3.87e-1 1.45e-2 4.43e-1 1.82e-2 3.67e-1 1.87e-3 5.02e-1 9.44e-1 3.73e-1 8.63e-3 4.71e-1 1.09e-2 4.59e-1 3.35e-2 6.30e-1 7.66e-2
10 1.02e0 2.98e-2 1.40e0 5.19e-2 9.66e-1 1.94e-2 1.02e0 9.98e-1 1.05e0 3.73e-2 1.20e0 5.94e-2 1.07e0 4.19e-2 1.35e0 9.24e-2
15 1.38e0 3.91e-2 2.12e0 1.05e-1 1.50e0 4.59e-2 1.63e0 9.98e-1 1.52e0 6.54e-2 1.96e0 8.53e-2 1.76e0 5.59e-1 2.00e0 6.39e-2
F11 5 3.89e-1 1.93e-3 4.46e0 9.22e-3 4.78e-1 9.35e-3 6.58e-1 9.95e-1 6.35e-1 1.26e-2 8.25e-1 1.72e-2 1.04e0 5.00e-1 2.42e0 8.17e-1
10 1.15e0 2.26e-2 1.47e0 4.80e-2 1.18e0 2.21e-2 4.76e0 9.96e-1 3.00e0 1.28e0 2.56e0 4.39e-1 5.70e0 6.26e-1 1.06e1 2.20e0
15 1.43e0 3.71e-2 2.17e0 4.47e-2 1.65e0 4.79e-2 6.21e0 9.95e-1 8.52e0 2.79e0 4.68e-1 6.22e-1 1.29e1 1.60e0 1.50e1 2.36e0
F12 5 9.36e-1 3.43e-3 9.63e-1 9.78e-3 9.37e-1 8.21e-3 9.35e-1 7.76e-1 9.33e-1 2.51e-3 1.12e0 7.95e-3 9.40e-1 8.33e-3 9.68e-1 9.86e-3
10 4.60e0 1.81e-2 4.17e0 2.26e-2 4.15e0 8.14e-3 4.60e0 8.98e-1 4.35e0 2.06e-2 4.69e0 1.44e-2 4.55e0 2.34e-1 4.49e0 4.47e-2
15 7.73e0 8.12e-2 7.19e0 6.65e-2 7.24e0 1.15e-1 7.80e0 8.26e-1 7.56e0 1.58e-1 7.62e0 1.66e-1 8.48e0 3.55e-1 8.40e0 1.47e-1
F13 5 8.83e-2 1.25e-2 7.79e-2 1.27e-2 1.25e-1 4.12e-2 1.30e-1 2.61e-1 1.20e-1 1.19e-2 1.30e-1 6.74e-3 1.63e-1 1.72e-2 4.51e-1 9.00e-2
10 1.12e-1 5.25e-2 9.98e-2 6.79e-3 1.22e-1 7.88e-3 3.23e-1 1.32e-1 2.17e-1 1.15e-2 1.22e-1 7.21e-3 2.30e-1 2.00e-2 4.52e-1 9.87e-2
15 1.43e-1 6.35e-2 1.25e-1 1.18e-2 1.34e-1 7.74e-3 4.43e-1 5.24e-2 2.49e-1 3.90e-2 1.50e-1 9.25e-3 2.51e-1 3.13e-2 6.38e-1 7.86e-2
F14 5 3.42e-1 2.47e-2 2.26e-1 4.65e-2 5.30e-1 7.32e-2 3.52e-1 6.01e-1 3.50e-1 2.28e-2 3.85e-1 4.24e-2 7.14e-1 2.36e-1 5.26e-1 8.67e-2
10 5.49e-1 7.73e-2 8.49e-1 1.69e-1 2.45e1 3.53e1 6.95e-1 5.24e-1 5.54e-1 5.09e-2 6.76e-1 8.35e-2 2.34e0 1.31e0 7.22e-1 5.71e-2
15 6.22e-1 1.43e-1 1.12e0 1.45e-1 1.80e1 1.60e1 7.28e-1 4.39e-1 6.19e-1 1.33e-1 6.04e-1 6.46e-2 1.32e0 2.61e-1 8.92e-1 1.30e-1
F15 5 3.06e-1 4.33e-2 2.33e-1 5.41e-2 9.52e-1 4.41e-2 4.21e-1 4.28e-2 3.68e-1 4.02e-2 6.13e-1 2.57e-2 9.94e-1 9.40e-2 6.17e-1 4.06e-2
10 8.99e-1 9.81e-2 1.03e0 2.05e-1 9.19e0 1.35e1 9.97e-1 4.43e-7 9.84e-1 4.72e-2 8.52e-1 5.04e-2 1.53e0 2.67e-1 9.81e-1 6.80e-2
15 1.02e0 1.35e-1 1.93e0 4.95e-1 1.42e0 9.17e-2 1.14e0 5.14e-11 1.04e0 4.51e-2 1.35e0 7.91e-2 3.89e0 2.03e0 1.27e0 5.40e-2
Friedman 5 2.57 3.53 3.83 4.60 2.53 6.30 6.30 6.33
10 2.50 3.37 3.77 5.17 3.93 4.87 6.40 6.00
15 2.80 3.40 4.30 5.20 3.30 4.60 6.40 6.00
partial 2.31 2.87 4.28 5.39 3.57 5.02 6.52 6.04
full 2.97 4.39 3.72 4.19 2.97 5.61 5.97 6.17
overall 2.62 3.43 3.97 4.99 3.26 5.26 6.37 6.11
t-test 5 5/3/7 8/2/5 3/12/0 7/2/6 15/0/0 13/2/0 13/1/1
10 7/3/5 8/3/4 9/5/1 11/1/3 10/5/0 11/3/1 11/2/2
15 8/1/6 10/2/3 10/4/1 8/5/2 8/3/4 14/1/0 11/3/1
partial 8/11/8 16/6/5 20/7/0 18/5/4 20/4/3 24/3/0 23/2/2
full 11/1/6 11/0/7 4/14/0 0/18/0 13/3/2 12/5/1 14/2/2
overall 19/9/17 26/7/12 22/23/0 23/10/12 34/6/5 40/3/2 38/3/4
For Friedman test, α=0.05, pα; For t-test, α=0.05. The results of the t-tests are provided as “l/u/g”, where l represents the mean value of the proposed algorithm, for a certain test case, is significantly less than the compared algorithm, g represents a greater result, and u represents that it is uncertain to say whether the mean value of the proposed algorithm TEEA is greater or less than the compared algorithm.

To observe the performance of TEEA more intuitively, we provide the comparison graph between the obtained PF with median IGD among the independent runs and the ground truth PF on 4 test cases with partial FOS. We can see from these figures that, apart from the noise caused by evolution, TEEA obtains intuitively satisfying performance.

(a) MaF1, M=5
(b) MaF2, M=5
(c) MaF6, M=5
(d) MaF13, M=5
Figure 6: Visualization of the obtained PF with median IGD: for comparison, the true PF is also plotted in the parallel coordinates with lighter color (as the background).

To specify the performance on each type of algorithms, we offer two dichotomies for the test cases, first of which is based on the number of objectives and second is based on the feasibility of the FOS. We conduct Friedman tests and t-tests on the partitioned sets of the test cases and present the statistics in Tab. 3. Also, to intuitively compare the performance, we convert the results of the Friedman test into a radar diagram presented in Fig. 7. From the categorical observations we find that

  1. 1.

    TEEA has competitive performance within all partitions of test cases, ranking at least the 2nd among the state-of-the-art algorithms;

  2. 2.

    TEEA has leading performance in test cases with partial FOS. This shows its effectiveness.

  3. 3.

    TEEA and CLIA achives highest performance on the test cases with full FOS, this shows the effectiveness of the selection operator cascade clustering;

  4. 4.

    TEEA has arguably similar performance (from t-test) when compared to CVEA3, which strongly shows the competitiveness of its performance.

Figure 7: Radar visualization of the performance on different types of test cases: the wider the distribution of the included areas, the better the performance. For each dimension of the radar diagram, the values are |A|-rF, where |A|=8 is the number of compared algorithms and rF is the mean rank given by the Friedman tests.

4.4 Comparative Tests on Difficult Problem with Infeasible Parts in Objective Spaces

To validate the effectiveness of the main contribution of this work, we extend the dimensionality of the solution space of the MaF15 problem into D{100,200,300,400,500} for M{5,10}, constituting 10 extremely difficult test cases. These test cases simulate the scenarios in which the population hardly reaches vicinity of the true PF and the objective space is with infeasible parts. We compare TEEA with the reference vector based opponents with adaptation, including fastCAR Zhao et al. (2018), CLIA Ge et al. (2018), AR-MOEA Tian et al. (2017), A-NSGA-III Jain and Deb (2014) and RVEA* Cheng et al. (2016). The results are given in Tab. 4.

Table 4: Results on A Difficult Problem with Infeasible Parts in the Objective Space
MaF15 TEEA fastCAR CLIA A-NSGA-III RVEA*
M D Mean Std Mean Std Mean Std Mean Std Mean Std
5 100 3.06e-1 3.25e-3 4.21e-1 4.28e-2 3.68e-1 4.02e-2 9.94e-1 9.40e-2 6.17e-1 4.06e-2
200 3.08e-1 2.05e-2 4.65e-1 7.68e-2 3.69e-1 4.03e-2 9.22e-1 8.51e-3 6.54e-1 3.05e-2
300 3.08e-1 1.85e-2 4.77e-1 8.25e-2 3.69e-1 5.79e-2 8.75e-1 5.44e-2 6.60e-1 2.48e-2
400 3.20e-1 1.43e-2 4.78e-1 6.35e-2 3.72e-1 2.25e-2 8.88e-1 6.45e-2 6.71e-1 2.94e-2
500 3.28e-1 7.65e-2 4.90e-1 4.52e-2 3.76e-1 5.85e-2 8.93e-1 5.33e-2 6.40e-1 3.45e-2
10 100 8.85e-1 3.25e-2 9.98e-1 3.85e-7 9.67e-1 3.25e-2 1.43e0 2.13e-1 9.32e-1 7.25e-2
200 9.02e-1 4.67e-2 9.97e-1 4.43e-7 9.84e-1 4.72e-2 1.53e0 2.67e-1 9.81e-1 6.80e-2
300 9.06e-1 4.32e-2 1.03e0 1.33e-2 9.95e-1 6.85e-2 1.65e0 3.57e-1 9.49e-1 1.21e-1
400 9.10e-1 5.25e-2 1.25e0 2.25e-1 1.13e0 8.25e-2 1.43e0 3.65e-1 1.03e0 1.33e-1
500 9.18e-1 8.33e-2 1.44e0 3.45e-1 1.25e0 1.37e-1 1.55e0 4.45e-1 1.24e0 2.37e-1
For each test case, FE=D×104. Results averaged from 20 indepedent runs on each test case.

It can be observed that TEEA validates its own effectiveness by achieving the best performance with an overwhelming margin.

4.5 Stability Comparison using Confidence Interval Integration

Stability is of great importance for optimization algorithms. We analyze the stability of TEEA by comparing it with those of the state-of-the-art algorithms. The evaluation criterion for stability is the scaled approximated size of area of the logarithmically-scaled confidence intervals. Let τ=𝒎,𝒃l,𝒃u be the sampled statistical information (at time t1,,T) of the IGD of a certain algorithm on a certain problem, where 𝒎 is the mean IGD of independent runs at the sample times as well as 𝒃l and 𝒃u be the lowerbound and upperbound of the confidence interval calculated from the IGD values of the independent runs at the sample times, the stability criterion V(τ) is computed as:

V(τ)=i=1Tlogbu(i)-logbl(i)

Note that the criterion requires the sample of points are the same, which we set to 101 points per test case. The smaller the criterion, the better the stability (less variance). The results of such criterion on all test cases is provided in Tab. 5.

Table 5: Confidence Interval Integration Results for Stability
Problem M TEEA CVEA3 BCE-IBEA fastCAR CLIA AR-MOEA A-NSGA-III RVEA*
F1 5 7.66e+1 8.48e0 3.54e+1 6.61e0 7.65e+1 2.36e+1 1.48e+1 9.06e+1
10 8.51e+1 5.76e+1 8.02e+1 6.17e+1 3.98e+1 9.97e0 4.25e0 3.73e+1
15 8.58e+1 5.37e+1 4.56e+1 8.65e+1 3.92e+1 9.31e0 7.05e+1 4.62e+1
F2 5 4.20e+1 2.61e+1 5.85e0 2.41e+1 2.10e+1 1.06e0 2.37e+1 2.64e+1
10 4.45e+1 5.72e+1 5.83e+1 2.75e+1 1.87e+1 7.71e+1 3.38e+1 6.65e0
15 5.47e+1 3.01e+1 7.72e+1 6.79e0 3.28e+1 6.95e-1 1.05e+1 3.20e+1
F3 5 2.44e+1 7.18e+1 5.41e+1 9.99e+1 4.58e0 2.05e+1 3.40e+1 4.19e+1
10 3.28e+1 6.34e0 4.30e+1 9.98e+1 3.88e+1 6.84e0 2.62e+1 8.45e+1
15 8.71e0 5.34e+1 9.12e0 9.47e+1 7.35e+1 1.76e+1 7.07e+1 7.91e+1
F4 5 4.90e+1 2.84e+1 1.26e0 9.31e+1 7.24e+1 9.01e+1 1.40e+1 9.14e+1
10 3.08e+1 2.74e+1 3.66e+1 4.63e+1 3.10e+1 6.85e+1 6.69e+1 7.26e+1
15 7.42e0 1.83e+1 7.31e-1 4.01e0 1.99e+1 6.37e+1 4.90e+1 3.82e+1
F5 5 8.52e+1 4.02e0 5.19e+1 8.92e+1 7.60e+1 9.12e+1 6.62e+1 8.93e+1
10 7.61e0 3.00e+1 2.32e+1 9.05e+1 1.10e0 5.29e+1 5.80e+1 1.40e+1
15 3.97e+1 5.21e0 6.87e+1 9.41e+1 3.04e+1 5.91e+1 7.63e+1 7.87e+1
F6 5 1.42e+1 7.22e+1 7.19e+1 1.04e+1 5.13e+1 7.12e+1 4.92e0 5.76e+1
10 7.82e+1 6.13e+1 5.35e+1 1.00e2 9.62e0 1.47e+1 6.20e+1 5.79e+1
15 7.83e+1 8.18e+1 8.30e+1 8.90e+1 1.73e+1 9.30e0 1.23e+1 2.41e+1
F7 5 4.63e+1 1.54e+1 6.54e+1 4.04e+1 7.49e+1 7.75e+1 2.16e+1 5.15e+1
10 1.23e0 4.44e+1 6.19e+1 2.04e+1 5.43e+1 7.95e+1 9.23e+1 1.31e+1
15 5.34e+1 6.39e+1 2.42e+1 2.31e+1 1.93e+1 7.98e+1 5.43e+1 5.65e+1
F8 5 5.29e+1 7.62e+1 7.12e+1 7.84e0 4.07e+1 5.94e+1 2.41e+1 7.23e+1
10 6.51e+1 8.74e+1 7.43e+1 7.57e+1 5.57e+1 5.67e+1 6.84e+1 2.13e+1
15 2.28e+1 2.14e0 7.96e+1 6.27e+1 9.02e+1 7.18e+1 7.58e+1 4.45e+1
F9 5 7.72e+1 1.75e+1 9.07e+1 4.41e+1 7.15e+1 7.80e+1 4.29e0 4.57e+1
10 9.62e0 7.49e+1 5.32e+1 1.25e+1 7.56e+1 8.10e+1 3.33e+1 3.16e+1
15 6.77e+1 7.34e0 7.32e+1 4.22e+1 1.56e+1 6.95e+1 5.69e+1 2.66e+1
F10 5 1.48e+1 2.10e+1 2.23e+1 7.93e+1 7.66e+1 3.40e0 4.94e+1 7.28e+1
10 4.39e+1 6.24e+1 2.53e+1 9.00e+1 5.39e+1 7.50e+1 6.06e+1 8.35e+1
15 5.32e+1 1.80e0 6.00e+1 8.44e+1 7.52e+1 7.70e+1 7.33e+1 7.90e+1
F11 5 2.58e+1 7.88e+1 8.80e+1 8.19e+1 1.01e+1 2.04e+1 5.74e+1 8.38e+1
10 3.32e+1 6.84e+1 2.81e+1 9.06e+1 9.57e0 5.79e+1 7.81e+1 3.01e+1
15 5.18e+1 5.81e+1 5.87e+1 9.04e+1 3.83e+1 7.89e+1 1.73e+1 3.30e+1
F12 5 5.18e+1 9.13e+1 8.10e+1 8.30e+1 3.31e+1 7.71e+1 7.84e+1 9.43e+1
10 2.49e+1 3.15e+1 8.68e+1 8.62e+1 3.04e+1 1.50e+1 3.29e+1 5.70e+1
15 9.06e+1 7.20e+1 5.79e0 7.80e+1 1.69e+1 2.10e+1 4.74e+1 1.58e+1
F13 5 9.41e0 9.46e0 5.91e+1 3.51e+1 6.57e0 7.79e+1 2.20e+1 9.25e+1
10 6.34e+1 7.22e+1 8.76e+1 1.06e+1 5.89e0 6.97e+1 2.79e+1 8.51e+1
15 7.31e+1 6.98e0 7.69e+1 7.05e+1 4.81e+1 8.30e+1 4.21e+1 7.60e+1
F14 5 3.79e+1 6.72e+1 8.25e+1 6.34e+1 3.04e+1 5.58e+1 3.60e+1 8.34e+1
10 8.89e+1 2.15e+1 4.91e+1 6.71e+1 6.94e+1 8.53e+1 9.88e0 6.74e+1
15 1.51e+1 1.47e+1 2.00e+1 6.38e+1 1.08e+1 7.57e+1 3.37e+1 1.09e+1
F15 5 5.78e+1 7.15e+1 5.93e+1 6.37e+1 5.52e+1 3.36e+1 8.84e+1 6.11e+1
10 9.75e+1 2.78e+1 1.09e+1 5.57e+1 6.19e+1 6.54e+1 4.29e+1 7.47e+1
15 1.14e+1 5.97e+1 8.79e+1 6.98e+1 5.66e+1 8.63e+1 2.55e+1 5.92e+1
Friedman 4.07 4.20 4.93 5.38 3.40 4.84 4.02 5.16
For Friedman test, α=0.05, p=4.20×10-3 and χ2=20.74.

The Friedman test shows that TEEA has modest stability among all compared algorithms, showing that the achieved competitive performance in the experiments is in a degree reliable.

4.6 Component Validation

4.6.1 Effectiveness of Individual Archive

We demonstrate the effectiveness of IA by comparing the performance TEEA has achieved to the version of TEEA without IA (using the population to adjust references instead of IA). The results are given in Tab. 6.

Table 6: Comparison for TEEA with IA and TEEA without IA
Problem M TEEA TEEA w/o IA
Mean Std Mean Std
F1 5 1.08e-1 7.25e-4 1.16e-1 7.84e-4
10 2.33e-1 7.45e-3 2.50e-1 7.94e-3
15 2.67e-1 8.85e-3 2.85e-1 8.96e-3
F2 5 9.65e-2 2.85e-3 1.00e-1 2.89e-3
10 1.53e-1 2.99e-3 1.50e-1 3.01e-3
15 1.64e-1 4.35e-3 1.67e-1 4.53e-3
F3 5 6.82e-2 1.90e-3 6.76e-2 1.94e-3
10 8.36e-2 2.20e-3 8.68e-2 2.35e-3
15 9.11e-2 1.27e-3 8.93e-2 1.38e-3
F4 5 1.77e0 3.25e-2 1.78e0 3.49e-2
10 7.50e1 2.25e0 7.60e1 2.39e0
15 2.80e3 1.23e2 2.95e3 1.28e2
F5 5 1.96e0 9.55e-3 1.98e0 9.93e-3
10 8.82e1 1.19e0 8.82e1 1.21e0
15 2.35e3 3.07e2 2.51e3 3.21e2
F6 5 1.82e-3 1.43e-4 1.95e-3 1.53e-4
10 2.75e-3 6.25e-4 2.92e-3 6.35e-4
15 1.03e-2 9.25e-3 1.08e-2 1.01e-2
F7 5 2.78e-1 3.25e-3 2.85e-1 3.50e-3
10 8.35e-1 1.03e-1 8.16e-1 1.06e-1
15 1.80e0 3.40e-1 1.81e0 3.67e-1
F8 5 9.17e-2 4.32e-3 9.15e-2 4.34e-3
10 1.35e-1 4.99e-3 1.39e-1 5.23e-3
15 1.93e-1 1.87e-2 2.07e-1 1.87e-2
F9 5 9.17e-2 6.35e-3 9.71e-2 6.87e-3
10 1.85e-1 1.27e-2 1.81e-1 1.37e-2
15 2.25e-1 6.25e-2 2.40e-1 6.56e-2
F10 5 3.87e-1 1.45e-2 3.96e-1 1.48e-2
10 1.02e0 2.98e-2 1.07e0 3.04e-2
15 1.38e0 3.91e-2 1.35e0 4.06e-2
F11 5 3.89e-1 1.93e-3 3.96e-1 1.96e-3
10 1.15e0 2.26e-2 1.16e0 2.29e-2
15 1.43e0 3.71e-2 1.52e0 3.75e-2
F12 5 9.36e-1 3.43e-3 9.69e-1 3.42e-3
10 4.60e0 1.81e-2 4.70e0 1.81e-2
15 7.73e0 8.12e-2 7.77e0 8.34e-2
F13 5 8.83e-2 1.25e-2 9.39e-2 1.27e-2
10 1.12e-1 5.25e-2 1.12e-1 5.33e-2
15 1.43e-1 6.35e-2 1.43e-1 6.39e-2
F14 5 3.42e-1 2.47e-2 3.55e-1 2.54e-2
10 5.49e-1 7.73e-2 5.79e-1 7.96e-2
15 6.22e-1 1.43e-1 6.52e-1 1.54e-1
F15 5 3.06e-1 4.33e-2 3.26e-1 4.48e-2
10 8.99e-1 9.81e-2 8.98e-1 9.82e-2
15 1.02e0 1.35e-1 1.04e0 1.46e-1
t-test 16/22/7

It can be observed that generally TEEA with IA performs better than TEEA without IA. IA can bring changes for performance on different types of problems: reason for performing better on problems with full FOS, IA prevents the adaptation engine from mistakenly initiating due to the stochastic perturbations of the evolutionary process; On problems with partial FOS, apart from the first reason, the additional is that IA keeps all the individuals that could activate the reference vectors, whereas in the population, due to the limit of the size of population, useful cluster centers are often lost.

4.6.2 Effectiveness of Adaptation

To validate the effectiveness of RA, we feed the adaptation engine with different types of simulated tracked PFs to check if the adaptation engine can effectively adapt the reference vectors to the distribution of the current PF. To visually verify the effectiveness, we have crafted 4 different simulated test cases, where respectively, the current PF is segmented into parts by the imaginary infeasible parts of the corresponding objective spaces. The results are given in Fig. 8.

(a) N=24,#active=23
(b) N=24,#active=22
(c) N=24,#active=19
(d) N=24,#active=24
Figure 8: Results of adaptation on the simulated test cases: we simulate irregular and fractal PFs in 2D objective spaces to test the effectiveness of the adaptation. The dots on near the current PF shows the intersection points of the adapted reference vectors and the current PF. The 4 plotted simulation results are persuasive: adaptation can deal with the complicated fractal PFs well, with maximum inaccuracy of 5/24=20.83% in test case 3; But there is no need to worry, since such inaccuracy will decrease with the increment of N. The current N=24 is too small, which is only chosen for visualization purposes.

Moreover, to test the Markov properties (the adaptation is expected to be Markovian since past adaptations should not influence the future), we shuffle the 4 simulated test cases to 24 possible permutations and output the similarity of the adaptation results (percentage of the identically enabled points in the RA) on these sequences in Tab. 7.

Table 7: Intra-sequence Similarity for Markov Property of Adaptation
Similarity (%)
a b c d
100% 100% 100% 100%
Execution of adaptation on the permuted sequences is one by one, and continue upon the previous state of RA.

The test demonstrate perfect Markovian behavior for the adaptation mechanism, but these simulated test cases cannot represent general test cases. However, the results demonstrate some level of reliable stability of the adaptation engine.

5 Conclusions

Aiming to address the problems caused by infeasible parts of MaOPs, this paper proposed a reference vector based MaOEA with interacting components. The algorithm TEEA tracks the current PF and uses such information to adapt the reference vectors which guides the evolution. The novelty in this paper mostly lies in the designed interaction scheme among the components and the method of adapting reference vectors. The effectiveness of the work has been validated thoroughly in the comparative experimental studies. It can be concluded that the proposed algorithm TEEA is effective in dealing with the MaOPs and particularly good at solving the problems with partially feasible objective spaces.

Acknowledgments

This work is supported by Mila (L’Institut québécois d’intelligence artificielle), the Scientific Computing Laboratory and Reasoning and Learning Laboratory, McGill University, the National Natural Science Foundation of China (61572104, 61103146, 61425002, 61751203), the Fundamental Research Funds for the Central Universities (DUT17JC04), and the Project of the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172017K03).

References

References

  • Louafi et al. (2017) H. Louafi, S. Coulombe, M. Cheriet, Multi-objective optimization in dynamic content adaptation of slide documents, IEEE Transactions on Services Computing 10 (2017) 231–243.
  • Li et al. (2017) L. Li, L. Jiao, J. Zhao, R. Shang, M. Gong, Quantum-behaved discrete multi-objective particle swarm optimization for complex network clustering, Pattern Recognition 63 (2017) 1 – 14.
  • Ferreira et al. (2017) A. C. Ferreira, M. L. Nunes, J. C. Teixeira, L. A. Martins, S. F. Teixeira, S. A. Nebra, Design of a solar dish stirling cogeneration system: Application of a multi-objective optimization approach, Applied Thermal Engineering 123 (2017) 646 – 657.
  • Du et al. (2018) W. Du, Y. Tang, S. Y. S. Leung, L. Tong, A. V. Vasilakos, F. Qian, Robust Order Scheduling in the Discrete Manufacturing Industry: A Multiobjective Optimization Approach, IEEE Transactions on Industrial Informatics 14 (2018) 253–264.
  • Purshouse and Fleming (2007) R. C. Purshouse, P. J. Fleming, On the evolutionary optimization of many conflicting objectives, IEEE Transactions on Evolutionary Computation 11 (2007) 770–784.
  • Wang et al. (2017) H. Wang, Y. Jin, X. Yao, Diversity assessment in many-objective optimization, IEEE Transactions on Cybernetics 47 (2017) 1510–1522.
  • Chugh et al. (2018) T. Chugh, Y. Jin, K. Miettinen, J. Hakanen, K. Sindhya, A surrogate-assisted reference vector guided evolutionary algorithm for computationally expensive many-objective optimization, IEEE Transactions on Evolutionary Computation 22 (2018) 129–142.
  • Cheng et al. (2016) R. Cheng, Y. Jin, M. Olhofer, B. Sendhoff, A reference vector guided evolutionary algorithm for many-objective optimization, IEEE Transactions on Evolutionary Computation 20 (2016) 773–791.
  • Tian et al. (2017) Y. Tian, R. Cheng, X. Zhang, F. Cheng, Y. Jin, An indicator based multi-objective evolutionary algorithm with reference point adaptation for better versatility, IEEE Transactions on Evolutionary Computation PP (2017) 1–1.
  • Zhao et al. (2018) M. Zhao, H. Ge, H. Han, L. Sun, A many-objective evolutionary algorithm with fast clustering and reference point redistribution, in: IEEE Congress on Evolutionary Computation (CEC), 2018, pp. 1–6.
  • Ge et al. (2018) H. Ge, M. Zhao, L. Sun, Z. Wang, G. Tan, Q. Zhang, C. L. P. Chen, A many-objective evolutionary algorithm with two interacting processes: Cascade clustering and reference point incremental learning, IEEE Transactions on Evolutionary Computation (2018).
  • Qi et al. (2014) Y. Qi, X. Ma, F. Liu, L. Jiao, J. Sun, J. Wu, MOEA/D with adaptive weight adjustment 22 (2014) 231–264.
  • Wu et al. (2018) M. Wu, K. Li, S. Kwong, Q. Zhang, J. Zhang, Learning to decompose: A paradigm for decomposition-based multiobjective optimization, IEEE Transactions on Evolutionary Computation (2018) 1–1.
  • Jain and Deb (2014) H. Jain, K. Deb, An evolutionary many-objective optimization algorithm using reference-point based nondominated sorting approach, part II: Handling constraints and extending to an adaptive approach, IEEE Transactions on Evolutionary Computation 18 (2014) 602–622.
  • Wang et al. (2013) R. Wang, R. C. Purshouse, P. J. Fleming, Preference-inspired coevolutionary algorithms for many-objective optimization, IEEE Transactions on Evolutionary Computation 17 (2013) 474–494.
  • Praditwong and Yao (2006) K. Praditwong, X. Yao, A new multi-objective evolutionary optimisation algorithm: The two-archive algorithm, in: International Conference on Computational Intelligence and Security, volume 1, 2006, pp. 286–291.
  • Wang et al. (2015) H. Wang, L. Jiao, X. Yao, Two_Arch2: An improved Two-Archive algorithm for many-objective optimization 19 (2015) 524–541.
  • Cheng et al. (2017) R. Cheng, M. Li, Y. Tian, X. Xiang, X. Zhang, S. Yang, Y. Jin, X. Yao, Benchmark Functions for CEC’2018 Competition on Many-Objective Optimization, Technical Report, CERCIA, School of Computer Science, University of Birmingham Edgbaston, Birmingham B15 2TT, U.K., School of Computer Science and Technology, Anhui University Hefei 230039, China, School of Computer Science and Informatics, De Montfort University Leicester, LE1 9BH, U.K., Department of Computer Science, University of Surrey Guildford, Surrey, GU2 7XH, U.K., 2017.
  • Tian et al. (2017) Y. Tian, R. Cheng, X. Zhang, Y. Jin, PlatEMO: A MATLAB platform for evolutionary multi-objective optimization [educational forum], IEEE Computational Intelligence Magazine 12 (2017) 73–87.
  • Czyzżak and Jaszkiewicz (1998) P. Czyzżak, A. Jaszkiewicz, Pareto simulated annealing: A metaheuristic technique for multiple-objective combinatorial optimization, Journal of Multi-Criteria Decision Analysis 7 (1998) 34–47.
  • Deb et al. (2005) K. Deb, L. Thiele, M. Laumanns, E. Zitzler, Scalable test problems for evolutionary multiobjective optimization, in: Evolutionary Multiobjective Optimization: Theoretical Advances and Applications, 2005, pp. 105–145.
  • Yuan et al. (2018) J. Yuan, H. Liu, F. Gu, A cost value based evolutionary many-objective optimization algorithm with neighbor selection strategy, in: IEEE Congress on Evolutionary Computation (CEC), 2018, pp. 1–8.
  • Li et al. (2016) M. Li, S. Yang, X. Liu, Pareto or non-pareto: Bi-criterion evolution in multiobjective optimization, IEEE Transactions on Evolutionary Computation 20 (2016) 645–665.
  • Zitzler and Künzli (2004) E. Zitzler, S. Künzli, Indicator-based selection in multiobjective search, in: Parallel Problem Solving from Nature - PPSN VIII: 8th International Conference, Birmingham, UK, 2004, pp. 832–842.