Image from: http://www.kirkk.com/modularity/wp-content/uploads/2009/12/EncapsulatingDesign1.jpg
Example of Hill Climbing Application: Software Module Clustering (Problem Formulation)
Image from: http://www.turingfinance.com/wp-content/uploads/2015/05/Annealing.jpg
Copyright By PowCoder代写 加微信 powcoder
Hill Climbing Applications
Simple algorithm — not difficult to implement.
Could be attempted first to see if the retrieved solutions are good enough, before a more complex algorithm is investigated.
Hill-Climbing is applicable to any optimisation problem, but its success depends on the shape of the objective function for the problem instance in hands.
Applications
Hill-climbing has been successfully applied to software module clustering.
Software Module Clustering:
Software is composed of several units, which can be organised into modules.
As software evolves, modularisation tends to degrade.
Well modularised software is
easier to develop and maintain.
Image from: http://www.kirkk.com/modularity/wp-content/uploads/2009/12/EncapsulatingDesign1.jpg
Problem: find an allocation of units into modules that maximises the quality of modularisation.
Applying Hill-Climbing
(and Simulated Annealing)
We need to specify:
Optimisation problem formulation:
Design variable and search space Constraints
Objective function
Strategy to deal with constraints, e.g.:
Algorithm-specific operators:
Representation. Initialisation procedure. Neighbourhood operator.
Representation, initialisation and neighbourhood operators that ensure only Modification in the objective function.
feasible solutions to be generated.
Formulation Optimisation Problems
Design variables represent a candidate solution.
Design variables define the search space of candidate Objective function defines our goal.
Can be used to evaluate the quality of solutions. Function to be optimised (maximised or minimised).
[Optional] Solutions must satisfy certain constraints.
solutions.
Design Variable
Design variable: allocation of units into modules.
Consider that we have N units, identified by natural numbers in {1,2,…,N}.
This means that we have at most N modules.
• Our design variable is a list L of N modules, where each module Li, i ∈ {1,2,…,N}, is a set containing a minimum of 0 and a maximum of N units.
L1 = {} L2 = {1}
Module 1 Module 2
L3 = {2,4,5}
unit 2 unit 5
Design Variable Design variable: allocation of units into modules.
Consider that we have N units, identified by natural numbers in {1,2,…,N}.
This means that we have at most N modules.
• Our design variable is a list L of N modules, where each module Li, i ∈ {1,2,…,N}, is a set containing a minimum of 0 and a maximum of N units.
L3 = {2,4,5}
unit 2 unit 5
Search space: all possible allocations.
Constraints and Objective
Constraints: N/A
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
L1 = {} L2 = {1}
Module 1 Module 2
L3 = {2,4,5}
unit 2 unit 5
A unit can make use of (depend on) another unit — this information can be retrieved from the current source code being refactored.
Constraints and Objective
Constraints: N/A
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
L1 = {} L2 = {1}
Module 1 Module 2
L3 = {2,4,5}
unit 2 unit 5
Lots of connections inside a module (high cohesion) and few connections between modules (low coupling).
Quality of a Module Li Constraints: N/A
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
L3 = {2,4,5}
1/2 is used to split the
penalty of inter-edges across the two modules connected by that edge
unit 2 unit 5
Quality(Li) = (maximise)
#IntraEdgesi #IntraEdgesi + 1/2 * #InterEdgesi
Quality of a Module Li Constraints: N/A
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
L3 = {2,4,5}
The #intra_edges_i in the denominator is a normalisation factor.
unit 2 unit 5
Quality(Li) = (maximise)
#IntraEdgesi #IntraEdgesi + 1/2 * #InterEdgesi
Intra Edges
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
Constraints: N/A
L3 = {2,4,5}
unit 2 unit 5
size(Li) size(Li) #IntraEdgesi = ∑ ∑ DLij,Lij′
1, if unit a depends on unit b 0, otherwise (incl. diagonal)
Da,b = j=1 j′=1
Inter Edges
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
Constraints: N/A
size(Li) #InterEdgesi =
L3 = {2,4,5}
unit 2 unit 5
(DLij,Li′j′ + DLi′j′,Lij) j=1 i′∈{1,2,⋯,N}|i′≠i j′=1
Quality of a Module Li Constraints: N/A
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
L3 = {2,4,5}
This is the quality of a single module.
unit 2 unit 5
Quality(Li) = (maximise)
#IntraEdgesi #IntraEdgesi + 1/2 * #InterEdgesi
Quality of a Solution L Constraints: N/A
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
L1 = {} L2 = {1}
Module 1 Module 2
L3 = {2,4,5}
unit 2 unit 5
Quality(L) = sum of the qualities of the non-empty modules (maximise)
Quality of a Solution L Constraints: N/A
Objective function: quality of modularisation (to be maximised). How to compute quality?
What does good quality mean?
i ∈ {1,2,…,N} | Li ≠ {}
L3 = {2,4,5}
unit 2 unit 5
Quality(Li)
Quality(L) = (maximise)
Problem Formulation
Hill-Climbing (assuming maximisation)
1. current_solution = generate initial solution randomly
2. Repeat:
2.1 generate neighbour solutions (differ from current solution by a single element)
2.2 best_neighbour = get highest quality neighbour of current_solution
2.3 If quality(best_neighbour) <= quality(current_solution)
2.3.1 Return current_solution
2.4 current_solution = best_neighbour
Until a maximum number of iterations
Design variable —>
what is a candidate solution for us?
Problem Formulation
Hill-Climbing (assuming maximisation)
1. current_solution = generate initial solution randomly
2. Repeat:
2.1 generate neighbour solutions (differ from current solution by a single element)
2.2 best_neighbour = get highest quality neighbour of current_solution
2.3 If quality(best_neighbour) <= quality(current_solution)
2.3.1 Return current_solution
2.4 current_solution = best_neighbour
Until a maximum number of iterations
Design variable —>
what is a candidate solution for us?
Objective —> what is quality for us?
Are there any constraints that need to be satisfied?
Simulated Annealing would also require a problem formulation to be able to solve a problem.
Summary Software Module Clustering problem formulation.
Representation, initialisation and neighbourhood operators.
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com