Visual Clustering Analysis of Social Network (40%)
Social networks are ubiquitous. A fundamental problem related to these networks is the discovery of clusters or communities. Intuitively, a cluster is a collection of individuals with dense friendship patterns internally and sparse friendships externally.
The discovery of close-knit clusters in these networks is of fundamental and practical interest. There are many reasons to seek tightly-knit communities in networks, for instance, target marketing schemes can be designed based on clusters, and it has been claimed that terrorist cells can be identified.
This assignment requests a group of two students to analyze an organization’s email network (a type of social networks) through the data clustering and a clustered graph visualization.
Step 1: through data clustering, we can identify abnormal (implicit) network patterns that against the hierarchical structure of the organization,
Step 2: through a clustered graph visualization, we can visually read and quickly understand the data clustering output, including the abnormal network patterns.
The weight of this assignment is 40%.
Specification:
The following Figure shows the organization structure of TACME, sourced from http://www.tacme.com/corporate_structure.html
1|Page
A list of staff’s ID, Name and Position in TACME
1 Director CEO
3 Business Development Manager Business Support Manager
5 Business Control Manager Sales Department Leader
7 Product Department Leader Marketing Department Leader
9 Project Office Leader Professional Service Leader
11 QA Leader
Design Office Leader
13 Technical Support Office Leader Software Development Leader
15 Legal Office Leader Finance Office Leader
17 HR Office Leader
The Email communication detail in a particular month is shown below:
02
2
23
4
25
6
37
8
49
10
4 11
12
4 13
14
5 15
ID Name Position
0
James
Director
David
2
George
Ronald
4
John
Richard
6
Daniel
Kenneth
8
Anthony
Robert
10
Charles
Paul
12
Mark
Kevin
14
Edward
Joseph
16
2|Page
Michael
Jason
ID Emails per month Weight ID
0
5
1
1
6
1
1
5
1
25
2
2
36
2
53
3
3
150
4
213
212
5
3
298
5
345
6
4
123
4
5
4
453
7
156
4
4
278
5
300
5
5 78 3 16
5 256 5 17
6 145 4 8
9 34 2 10
6 78 3 7
7 139 4 8
9 134 4 11
9 546
9 145
10 222
7 12
4 14
5 12
9 23 2 13
10 256 5 11
10 190 4 13
10 56 3 14
11 112 4 13
15 88 3 16
16 238 5 17
16 15 2 6
16 54 3 8
16 23 2 11
16 13 2 14
Weight description:
11–50 2 3 101 – 200 4 5 301 – 400 6
11 78 3 12
12 98 3 14
15 128 4 17
17 5 1 7
16 23 2 7
16 18 2 9
16 41 2 13
16 27 2 10
Quantity Weight
<10
1
51–100
201 – 300
> 401 |
7 |
3|Page
General Requirement:
Students are required:
1) To draw (visualize) the original email network on the paper (or screen) with the satisfaction of the following Aesthetics Rules:
- Symmetrical Display,
- Minimization of Edge-Crossings and
- Maximization of Angular Resolution.
In addition, since each edge e in the graph is associated with a weight w(e), you need to map the w(e) to a graphical attribute, such as color, types of line, size or shapes, to enhance the readbility of the weight
2) To cluster this email network (or graph) into clustered structures by using Markov Clustering Algorithm. You need to produce two clustered structures 1) with the weight w(e), 2) without the weight w(e) .
3) Discuss the findings. If there is one (or more) abnormal network pattern(s) found, you need to describe them in details.
4) To draw (visualize) these two clustered graphs (one with w(e), another without w(e) ) on the paper (or screen). Using geometric rectangles (or circles) to bound the clusters in the drawing. Make sure that these regions are not overlapped. In addition, these drawings shall also satisfy the general graph drawing aesthetics.
4|Page