ava-yy · Published Mar 10
Assignment 3
In this assignment we will visually analyze the outputs of a deep neural network, namely a model for generating images. A generative adversarial network has been trained to learn the distribution of a dataset of images, and you will visualize the outputs of the generator to help understand different aspects of the network. “Output” here means not only the images meant to mimic the distribution implied by the dataset, but also intermediary outputs produced by the network.
Provided are the weights of a neural network, and a means of generating an image using the network given some input. In particular, the network can be described as follows:
• The input is a vector $\mathbf{z} \in \mathbb{R}^{40}$. Typically, this vector is sampled from a multivariate normal distribution with zero mean and identity covariance.
• The input is fed through a linear layer $f_1$, parameterized as a matrix $M_0 \in \mathbb{R}^{25600 \times 40}$, followed by applying a rectified linear unit activation (ReLU). The output, a 25,600-dimensional vector, is then reshaped into a 3-tensor of shape $\mathbb{R}^{8 \times 8 \times 400}$, which you should think of as an $8 \times 8$ image with 400 feature maps, e.g. a 400-channel image.
• Each layer after this is a deconvolutional layer, which upsamples the image’s spatial resolution by a factor of 2, convolving the input with a set of filters to produce another multi-channel image. Specifically, the layers are defined as follows, with a ReLU applied after each layer:
◦ $f_2 : \mathbb{R}^{8 \times 8 \times 400} \rightarrow \mathbb{R}^{16 \times 16 \times 200}$
◦ $f_3 : \mathbb{R}^{16 \times 16 \times 200} \rightarrow \mathbb{R}^{32 \times 32 \times 100}$
◦ $f_4 : \mathbb{R}^{32 \times 32 \times 100} \rightarrow \mathbb{R}^{64 \times 64 \times 50}$
◦ $f_5 : \mathbb{R}^{64 \times 64 \times 50} \rightarrow \mathbb{R}^{128 \times 128 \times 25}$
• The last layer produces the actual image, e.g. a 3-channel RGB image, followed by a $\tanh$ activation which maps each output pixel to the range $[-1, 1]$ (a quick shape-check sketch follows this list):
◦ $f_6 : \mathbb{R}^{128 \times 128 \times 25} \rightarrow \mathbb{R}^{128 \times 128 \times 3}$

The dataset is a collection of cat images. Provided for you is code to display an array of images produced by the network.
Random Cats(ish)!
fresh_image = Array(128) [Array(128), Array(128), …]
(you may want to adjust the frequency at which images are sampled for performance purposes)
The objectives for this assignment are twofold:
• First, we would like to perform some analysis on the outputs produced at different layers in the network, given a sampling of the input space: a set of random (normally-distributed) samples provided as input, and the corresponding outputs produced at different layers.
• Second, we would like to visually encode the outputs of the analysis, and provide interactions to aid the user in their exploration.
n = 20
generate_sample = ƒ()
cur_sample = Array(20) [Array(7), …]
layers_info = Array(6) [Array(3), …]
max_spacial_pooling = ƒ(input_samples)
cur_sample_layers_maxpool = Array(20) [Array(7), …]
get_data_for_layer = ƒ(dataset, which_layer)
num_of_layers = 7
row_normalize = ƒ(input_samples)
cur_sample_layers_row_normalize = Array(7) [Array(20), …]
num_of_clusters = 4
pick_n_rand_num = ƒ(num)
find_closest = ƒ(cur, clusters)
k_means_clustering = ƒ(input)
compute_center = ƒ(input, assignments, picked_index)
reducer = ƒ(accumulator, currentValue)
perform_k_means = ƒ(input_samples)
n_c = Array(6) [40, 20, 10, 5, 3, 1]
cur_sample_layers_kmeans = Array(7) [Object, …]
generate_array_one = ƒ(input)
array_one = Array(140) [Object, …]
get_array_two = ƒ(input_samples, n_c)
array_two = Array(1580) [Object, …]
Analysis of Neural Network Layers
Here we will use the approach of Summit to extract features from layers of the neural network. We will perform the following for a given number of $n$ input samples:
• For each layer’s output, we will perform max spatial pooling to reduce the input tensor of size $\mathbb{R}^{w \times w \times c}$ to a vector of size $\mathbb{R}^{c}$; a small sketch of this step appears after this list. Hence, for a given layer $l$ with $c$ channels, this will provide a feature matrix $F_l \in \mathbb{R}^{n \times c}$. We will also store the resulting RGB image output for each of our samples (but do not extract features directly from the RGB images).
• We will then row-normalize the matrix, where we divide each row of $F_l$ by its sum, producing a new matrix $\bar{F}_l \in \mathbb{R}^{n \times c}$ in which each row sums to 1.
• Here we depart from Summit, as we do not have images that are associated with categories. We will take our matrix $\bar{F}_l \in \mathbb{R}^{n \times c}$, and perform k-means with respect to the rows of this matrix. This will group our samples into different clusters, per-layer.
• To gain some insight on the clustering, we will store, for each sample, its distance to the cluster center.
• Last, to help us understand the relationship between clusters, we will gather the top $n_c$ channels for a given cluster’s center.
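As a concrete reference for the pooling and normalization steps (the notebook’s own max_spacial_pooling and row_normalize cells appear above; the sketch below is only an illustration and assumes a single layer output given as a tf tensor of shape [w, w, c] and a feature matrix of shape [n, c]):

// Max spatial pooling: reduce one [w, w, c] activation tensor to a length-c vector
// by taking the maximum over the two spatial dimensions.
pool_one_layer = (layer_tensor) => tf.tidy(() => layer_tensor.max([0, 1]))

// Row-normalization: divide each row of an [n, c] feature matrix by its sum,
// so every row sums to 1 (ReLU outputs are nonnegative; an all-zero row would
// still need guarding in practice).
row_normalize_sketch = (F) => tf.tidy(() => F.div(F.sum(1, true)))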
At the end of the analysis, we will derive 2 new arrays. One array should contain basic information about the $n$ samples that we’ve generated in the network:
• layer: the layer that corresponds to the feature
• cluster: the cluster id for the given sample
• sample: an id to uniquely identify the sample
• center_dist: the distance to the center of the cluster
The other array should contain more detailed information about the samples, pertaining to their pooled activations (herein simply referred to as “activations”) with respect to the top $n_c$ highest-activated cluster channels:
• layer: the layer that corresponds to the feature
• cluster: the cluster id for the given sample
• channel: the channel id, one of the top $n_c$ channels for the cluster
• sample: an id to uniquely identify the sample
• activation: the activation at channel for sample.
Visual Analysis
Given the data produced in the analysis, we will then design a visualization to help us understand the results of the clustering. This will help us comprehend what cat-based features the network focuses on to synthesize images. There are three main modes of visual analysis we will consider:
• Intra-cluster analysis: what do the images within a cluster look like, and what can we say about the images in regards to their (feature) distance to the cluster center?
• Inter-cluster analysis: within a given layer, what are the similarities and differences between clusters in terms of their highest-activated channels?
• Inter-layer analysis: for a given cluster at a given layer, how are these images distributed across clusters in other layers?
To this end, we will develop a small-multiples unit visualization to visually encode clusters across layers. Further, we will use dot plots to visually encode individual sample activations. For reference, the visualization should look something like this:

[Figure: reference screenshot of the expected visualization (“CATS”)]
More specifically, the visualization should be designed as follows:
• Facet layer across columns
• Facet cluster across rows
• Within a cell, create a single square for each sample. The quantitative center_dist should be mapped to color, and the squares should be ordered based on center_dist – the square with the smallest distance should be positioned at the upper left, the square with the largest distance should be positioned at the lower right, and squares should be filled in going down, then right. You should determine the largest width/height of a square that will permit the tightest packing of squares across all cells (all layers and clusters). The color scale should be computed per-layer, rather than across all layers (see the sketch following this list).
• Also within a cell, to the left of the square unit visualization, the activation information should be plotted – one dot plot for each channel, where the activation value is mapped to the y-axis. Similarly, the scale used for the y-axis should be per-layer, and shared across clusters.
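For the per-layer color scales, one possible approach is sketched below. This is only a sketch: it assumes array_one from the analysis above, and d3.scaleSequential with d3.interpolateViridis is just one arbitrary choice of sequential scheme.

// One sequential color scale per layer, each spanning only that layer's range of center_dist.
layer_color_scales = (() => {
  const scales = new Map()
  for (const layer of new Set(array_one.map(d => d.layer))) {
    const dists = array_one.filter(d => d.layer === layer).map(d => d.center_dist)
    scales.set(layer, d3.scaleSequential(d3.interpolateViridis).domain(d3.extent(dists)))
  }
  return scales
})()

A square bound to a record d would then be filled with layer_color_scales.get(d.layer)(d.center_dist).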
Given these visual encodings, the visualization should support the following interactions:
• Brushing squares: a 2D brush should be associated with each region in which the square marks reside. When a user performs a brush, all squares that intersect the brush’s shape should be determined, and their visual appearance should be modified to indicate a selection. Furthermore, squares across other layers that correspond to the brushed samples should also have their visual appearance modified. This will help us understand relationships across layers. In addition, at the end of a brush, we will want to show the images that correspond to the selected samples. This can be easily achieved using views: treat the plot as a viewof cell, and within the brush’s end event, set the cell’s value to the list of corresponding images (sorted by distance to center) and emit an update (a sketch of this pattern appears after this list). A cell immediately following the plot may then reference the visualization’s cell directly, and simply invoke the provided image_grid function.
• Linked highlighting of channels: when a user clicks on any circle corresponding to a sample’s channel activation, each dot plot for this cluster – one for each channel – should be highlighted with a unique color. Furthermore, all dot plots corresponding to these channels across clusters should also be highlighted, using the channel-matched colors to indicate linking.
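A rough sketch of the brush-end pattern described above, using d3 v5’s d3.event and Observable’s viewof convention (set the node’s value, then dispatch an input event). The names node, squares_intersecting, and sample_images are hypothetical placeholders for your own plot node, hit-testing helper, and per-sample image store:

// Attach to the group that holds one cell's squares; d3.event.selection holds the brush extent.
make_brush = (node) => d3.brush().on('end', () => {
  const extent = d3.event.selection
  if (!extent) return                                      // brush was cleared
  const selected = squares_intersecting(extent)            // hypothetical: squares hit by the brush
  selected.sort((a, b) => a.center_dist - b.center_dist)
  node.value = selected.map(d => sample_images[d.sample])  // images, sorted by distance to center
  node.dispatchEvent(new CustomEvent('input'))             // tells Observable the viewof value changed
})

A cell immediately after the viewof plot can then pass that value straight to image_grid.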
Use the visualization for insight
You should provide some discussion on the insights that you find in using the visualization. There are a lot of interesting things to discover! You should describe how your visualization helped lead you to your conclusions.
I encourage experimentation in your analysis. In particular, max-pooling to derive features is but one way to aggregate. You may consider other ways to pool, or even not pooling at all! You may also want to modify how random samples are generated – for instance, you may want to store a given cluster at a given layer, and then when sampling, only accept samples that are sufficiently close to the cluster center.
Tips
Start small, with fake data
You may want to start off sampling thousands of images and visually encoding all of it. However, the performance of the deep network is highly dependent on your machine. So to begin, start small, say just 20 samples, a couple clusters, and just some layers. This will help you debug. You can then expand once you are confident that your analysis, and visualization, is correct.
On the visualization side, I recommend mocking up data first, since it could take some time to perform the analysis and provide the real data that you use for the visualization. Fake data will help you determine the specific requirements for creating the visualization, e.g. what data is necessary and, more importantly, how it is structured. Consider using d3.nest to hierarchically organize your data.
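For example, mocked records shaped like array_one can be nested by layer and then by cluster in a few lines (a sketch; the field names follow the array description above, and the mocked values are arbitrary):

// Fake array_one-style records, hierarchically grouped: layer -> cluster -> leaves.
fake_nested = d3.nest()
  .key(d => d.layer)
  .key(d => d.cluster)
  .entries(d3.range(40).map(i => ({
    layer: 1 + (i % 2),           // pretend there are two layers
    cluster: i % 4,               // and four clusters
    sample: i,
    center_dist: Math.random()
  })))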
tensorflow.js
You will want to use tensorflow.js for much of the analysis. Some things to be aware of:
• Tensors are immutable. This means you cannot directly assign values to a given index or slice of a tensor. So, rather than converting back and forth between JS Arrays and TF tensors, you should see how to accomplish what you intend to do solely with TF tensors. See the API for more details, and specifically, you might find the following functions useful:
◦ reshape for reshaping the size of tensor
◦ squeeze for removing a dimension(s) of the tensor if a dimension has size 1
◦ concat for concatenating a sequence of tensors into one tensor; this is useful if you find that you need to derive a new tensor, but you cannot use TF’s functions
◦ gather access slices of the tensor along given dimensions (gives a full slice)
◦ slice a more fine-grained way of performing tensor slicing
◦ topk grab the top-k values, and corresponding indices, along the last dimension of your tensor
◦ squaredDifference computes entry-wise squared difference between 2 tensors
◦ max useful for max-pooling
• Tensors need to be freed from memory. TF does not automatically perform memory management, so every time you create a tensor, or derive a new tensor from a TF function, you need to free it from memory. You can do so explicitly via dispose, but a more handy way of doing this is through tidy.
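A minimal illustration of tidy (not tied to any particular cell above): every tensor allocated inside the callback is released automatically, and only the returned tensor survives.

example_pooled = tf.tidy(() => {
  const t = tf.randomNormal([8, 8, 400])   // stand-in for one layer's activation tensor
  return t.max([0, 1])                     // only this length-400 tensor outlives the tidy
})
// once the surviving tensor itself is no longer needed: example_pooled.dispose()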
D3
You may find the following D3 functions useful:
• each: a function of a Selection, this will allow you to iterate over every element in the selection, where D3 will populate each element’s data for you in the function you specify.
• filter: a function of a Selection, you specify a function that will determine whether or not an element in the selection should be filtered out, where D3 will populate the element’s data for a data-driven determination. This will return a new Selection of those elements that passed your test.
• raise and lower: helpful for controlling the layering of elements.
• Local variables: you may find this useful for handling child/parent element relationships.
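As a small illustration of local variables for child/parent relationships (a generic sketch using detached elements, not this assignment’s actual plot structure):

// Stash a parent's datum on its node with d3.local, then read it back from a child.
local_demo = (() => {
  const cluster_of = d3.local()
  const parent = document.createElement('div')                      // stands in for a cell's <g>
  const child = parent.appendChild(document.createElement('div'))   // stands in for a square
  d3.select(parent).datum({layer: 2, cluster: 3})
    .each(function(d) { cluster_of.set(this, d.cluster) })          // store on the parent node
  return cluster_of.get(child)                                      // looks up ancestors -> 3
})()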
Model
Layer 1: Fully Connected, in: 40-dimensional vector, out: 400-channel 8×8 image
layer1_mat = tf.Tensor (float32, rank 2, size 1024000)
apply_layer1 = ƒ(in_vector)
Layer 2: Transposed Convolution, in: 400-channel 8×8 image, out: 200-channel 16×16 image
layer2_filters = tf.Tensor (float32, rank 4, size 1280000)
apply_layer2 = ƒ(in_tensor)
Layer 3: Transposed Convolution, in: 200-channel 16×16 image, out: 100-channel 32×32 image
layer3_filters = tf.Tensor (float32, rank 4, size 320000)
apply_layer3 = ƒ(in_tensor)
Layer 4: Transposed Convolution, in: 100-channel 32×32 image, out: 50-channel 64×64 image
layer4_filters = tf.Tensor (float32, rank 4, size 80000)
apply_layer4 = ƒ(in_tensor)
Layer 5: Transposed Convolution, in: 50-channel 64×64 image, out: 25-channel 128×128 image
layer5_filters = tf.Tensor (float32, rank 4, size 20000)
apply_layer5 = ƒ(in_tensor)
Layer 6: Convolution, in: 25-channel 128×128 image, out: RGB 128×128 image
layer6_filters = tf.Tensor (float32, rank 4, size 675)
apply_layer6 = ƒ(in_tensor)
Utils
Use this function to plot a list of images. Note: each image in all_images is expected to be in the range of [0,1].
image_grid = ƒ(all_images, images_per_col)
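For example (a usage sketch; random_image is defined below and already returns pixel values in [0, 1]):

// Show six freshly generated cats, two images per column.
image_grid([1, 2, 3, 4, 5, 6].map(() => random_image()), 2)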
This function will generate a random image. You should inspect this function to see how to use the individual layers of the network.
random_image = ƒ()

Array(128) [Array(128), Array(128), …]
Imports
d3 = Object {event: null, format: ƒ(t), formatPrefix: ƒ(t, n), …}
tf = Object {ENV: t, Rank: Object, Reduction: Object, data: Object, version: Object, …}
Assessment
Data Processing (45%)
• Ability to draw samples from the generative model, and extract per-layer features via pooling. (20%)
• Perform k-means with respect to each layer. (15%)
• Process clustering results: extract the highest-activated channels per-cluster, organize data for downstream processing (e.g. nest the data). (10%)
Visualization (45%)
• Unit visualization (e.g. the squares) – per layer, per cluster. (10%)
• Dot plot – per layer, per cluster. (10%)
• Linked brushing of squares. (15%)
• Linked highlighting of per-cluster channels. (10%)
Analysis (10%)
Description of insights found using your visualization, detailing how you used your visualization to discover various features of the generative model.

// define the number of samples we want to sample
n = 20

cur_sample = generate_sample()

layers_info = [[400, 8, 8], [200, 16, 16], [100, 32, 32], [50, 64, 64], [25, 128, 128], [3, 128, 128]]

cur_sample_layers_maxpool = max_spacial_pooling(cur_sample)

num_of_layers = 7

cur_sample_layers_row_normalize = row_normalize(cur_sample_layers_maxpool)

num_of_clusters = 4

k_means_clustering = (input) => {
  let res = tf.tidy(() => {
    // randomly initialize k cluster centers from the input rows
    let picked_indices = pick_n_rand_num(num_of_clusters)
    let cur_clusters = []
    for (var i = 0; i < picked_indices.length; i++) {
      cur_clusters.push(input[picked_indices[i]])
    }

    let assignments = []
    let assigned_index = []
    // run k-means for 5 iterations
    for (var j = 0; j < 5; j++) {
      // assignment step: attach each sample to its closest center
      assignments = []
      assigned_index = []
      for (var i = 0; i < n; i++) {
        let closest = find_closest(input[i], cur_clusters)
        assignments.push(picked_indices[closest])
        assigned_index.push(closest)
      }
      // update step: recompute each cluster center from its assigned samples
      for (var i = 0; i < picked_indices.length; i++) {
        cur_clusters[i] = compute_center(input, assignments, picked_indices[i])
      }
    }

    // squared distance of each sample to its assigned cluster center
    let distances = []
    for (var i = 0; i < n; i++) {
      let a = tf.tensor1d(cur_clusters[assigned_index[i]])
      let b = tf.tensor1d(input[i])
      distances.push(a.squaredDifference(b).sum().arraySync())
    }
    return {'cur_clusters': cur_clusters, 'distances': distances, 'assigned_clusters': assigned_index, 'samples': input}
  })
  return res
}

perform_k_means = (input_samples) => {
  let sample_after_kmeans = tf.tidy(() => {
    let res = []
    // run k-means independently on each layer's feature matrix
    for (var i = 0; i < num_of_layers; i++) {
      let cur_layer = input_samples[i]
      let kmeans_res = k_means_clustering(cur_layer)
      res.push(kmeans_res)
    }
    return res
  })
  return sample_after_kmeans
}

cur_sample_layers_kmeans = perform_k_means(cur_sample_layers_row_normalize)

generate_array_one = (input) => {
  let res = []
  // one record per (sample, layer): the cluster the sample fell into and its distance to the center
  for (var i = 0; i < n; i++) {
    for (var j = 1; j <= num_of_layers; j++) {
      res.push({'layer': j, 'cluster': input[j-1].assigned_clusters[i], 'sample': i, 'center_dist': input[j-1].distances[i]})
    }
  }
  return res
}

array_one = generate_array_one(cur_sample_layers_kmeans)

get_array_two = (input_samples, n_c) => {
  let sample_after_topk = tf.tidy(() => {
    let res = []
    for (var j = 0; j < n; j++) {
      for (var i = 1; i <= num_of_layers; i++) {
        let cur_layer = input_samples[i - 1]
        // this sample's cluster at this layer, and that cluster's center
        let cur_cluster = cur_layer.assigned_clusters[j]
        let cur_center = cur_layer.cur_clusters[cur_cluster]
        // top n_c channels of the cluster center
        let {indices} = tf.topk(tf.tensor1d(cur_center), n_c[i - 1])
        let cur_indices = indices.arraySync()
        for (var k = 0; k < n_c[i - 1]; k++) {
          res.push({'layer': i, 'cluster': cur_cluster, 'channel': cur_indices[k], 'sample': j, 'activation': cur_layer.samples[j][cur_indices[k]]})
        }
      }
    }
    return res
  })
  return sample_after_topk
}

array_two = get_array_two(cur_sample_layers_kmeans, n_c)

random_image = () => {
 let out_img = tf.tidy(() => {
   let rand_vector = tf.randomNormal([1,40])
   let out_1 = apply_layer1(rand_vector)
   let out_2 = apply_layer2(out_1)
   let out_3 = apply_layer3(out_2)
   let out_4 = apply_layer4(out_3)
   let out_5 = apply_layer5(out_4)
   let out_6 = apply_layer6(out_5)
   return tf.squeeze(tf.scalar(.5).mul(out_6.add(tf.scalar(1))))
})
 let the_img = out_img.arraySync()
 tf.dispose(out_img)
 return the_img
}

random_image()
