2
2.1 (Breadth First Search, BFS)
G s ππ[i] i π[s] = s jπ[j] = −1
算
Algorithm 1 Serial BFS
1:
2: 3: 4: 5: 6: 7: 8: 9:
10: 11: 12: 13:
forv∈V do
π [v] ← −1 end for
π[s]←s
F ← {s} whileF ̸=∅do
G←∅
for u, v such that u ∈ F, (u, v) ∈ E do
π [v] ← u
insert (v) → G end for
F←G end while
for 能 2.2 (Single Source Shortest Path, SSSP)
G s πdπ[i] i d[i] i π[s] = s d[s] = 0 jπ[j] = −1d[j]
1
-Bellman-Ford, BF算算Shortest Path Faster Algorithm, SPFA Djikstra算实算 F 算计算 for 次 H
Algorithm 2 Serial SSSP
1: 2: 3: 4: 5: 6: 7: 8: 9:
10: 11: 12: 13: 14: 15: 16: 17: 18: 19:
forv∈V do π [v] ← −1
d [v] ← inf end for
π[s]←s d[s]←0
F ← {s} whileF ̸=∅do
G←∅
H ← selectH
for u, v such that u ∈ H, (u, v) ∈ E do
if d[v] > d[u]+w[u,v] then π [v] ← u
d [v] ← d [u] + w [u, v]
insert (v) → G end if
end for
F ←(F\H)∪G end while
▷ BF:H = V , SPFA:H = F, Djikstra:H = {u|d[u] = minv∈F d[v]}
▷
BFS 1 SSSP BFS 1 SSSP 算第 10 SPFA 算 selectH BFS F SPFA 算 BFS 算
2.3
graph-sequential.c实计算算graph- load-balance.c 实 实 BFS SSSP graph-gpu-nvgraph.cu nvGraph BFS 能
1
2/7
v_pos
e_dst
e_weight
(a) 0123456
(b)
0
3
6
9
12
16
18
012345 0
1 2 3 4 5
1
4
5
0
2
4
1
3
4
2
4
5
0
1
2
3
0
3
*
1
2
3
1
4
5
4
6
7
6
8
9
2
5
7
8
3
9
*
(c) 1:
ve 32 v_pose_dst v_pos v+1e_dst e v_pos[0]=0v_pos[v]=e e_dst[v_pos[i]] e_dst[v_pos[i+1]]第 i e_weight e e_dst
0
3
6
9
1
4
5
0
2
4
1
3
4
*
1
2
3
1
4
5
4
6
7
*
012345 0
0123
v_pos e_dst
e_weight
123 145 467
689 2578
39
1 3456
2 3 4 5
v_pos
e_dst
e_weight
(b) 2:
123 145 467
689 2578
39
9
12
16
18
2
4
5
0
1
2
3
0
3
*
6
8
9
2
5
7
8
3
9
*
(a)
3/7
local_voffset_v offset_v off- set_v+local_v preprocesspreprocess graph-load-balance.c2 GPU
data run.sh run_gpu.sh srun -N (计算) -n () () () srun -p gpu () ()
1. make stat.csv计
2. 实 local_v offset_v 性 能
3. malloc cudaMalloc cpu_free gpu_free
4. GPU GPU算
CPU
5. BFS SSSP 次次
trivial
3
1. CPU 实
2. GPU 实
3. 实 1 2能 GPU 4. 作
4
1. 计算性能 GPU 计算 GPU 实 性能 data 20 MTEPS3 计算次 repetitions 64
2. C/C++ 实 C++ STL 能能 nvGraph /算 graph-*
3. CPU MPI+Threading 计算Flat MPI
4. 性能
5. 20 40 计 高算
4/7
Algorithm 3 Atomic Update
1: 2: 3: 4: 5: 6: 7: 8: 9:
10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21:
procedure non-atomic(ref) old ← load (ref )
if cond (old) then
value ← compute (old)
store (ref, value) end if
end procedure procedure atomic(ref)
success ← f alse
old ← load (ref ) while not success do
if cond (old) then
value ← compute (old)
new ← atomic-compare-and-swap (ref, old, value)
success ← (old = new)
old ← new else
success ← true end if
end while end procedure
▷ old ̸= new
7/7