Econ 5121 Problem Set 3 (Due Dec 5 for Section C and Dec 6 for Section B)

We encourage team collaboration on this problem set. Each team can consist of at most 5 students, and each team submits only one copy of its answers. All answers should be machine-typed. The top 3 teams that provide the most precise predictions will be awarded course bonus points. The winning teams have to prepare a 10-minute presentation (with PowerPoint slides shown on screen) in the tutorial to show the class how they specified the model and how they made their predictions.

I. Predicting Samsung Note II adoption (download the data sets “adoption” and “prediction” from the course blackboard system)
You are given a data set of customers from a telecommunication company in 2012. There are 2,621 customers who belong to 20 groups. Using this data set, we are interested in knowing which factors affect a customer’s adoption of the Samsung cellphone, which is a binary variable with values 1 (adopt) or 0 (not adopt), denoted by Y. The data set provides customer information including age, a gender dummy (1 for man and 0 for woman), and a dummy for being a smartphone user (prior to the release of the Samsung Note II). In addition to this customer information, you are also given the communication network constructed from the call detail records for each group. The communication network takes the form of an adjacency matrix (W). Each element of W, W_ij, equals one if customers i and j are connected and zero otherwise.
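
To make the adjacency-matrix notation concrete, here is a minimal sketch in plain Python of how group size, degree, density, and a clustering coefficient can be read off W. The 4-customer matrix and the function names are hypothetical and are not taken from the course data. The sketch assumes an undirected network (symmetric W with a zero diagonal) and uses the global clustering coefficient (transitivity, the share of connected triples that are closed). The average of local clustering coefficients is another common definition and can give a different number.

```python
# Toy adjacency matrix for one hypothetical group of 4 customers (illustration only).
# W[i][j] == 1 means customers i and j are connected. The network is assumed
# undirected, so W is symmetric with zeros on the diagonal.
W = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]


def group_size(W):
    """Number of customers (nodes) in the group."""
    return len(W)


def degrees(W):
    """Degree of each customer: the row sums of W."""
    return [sum(row) for row in W]


def density(W):
    """Observed edges divided by the n*(n-1)/2 possible undirected edges."""
    n = len(W)
    if n < 2:
        return 0.0
    edges = sum(degrees(W)) / 2  # each undirected edge is counted twice in W
    return edges / (n * (n - 1) / 2)


def global_clustering(W):
    """Transitivity: closed connected triples / all connected triples."""
    n = len(W)
    closed = 0
    triples = 0
    for i in range(n):
        nbrs = [j for j in range(n) if j != i and W[i][j] == 1]
        k = len(nbrs)
        triples += k * (k - 1) // 2  # pairs of neighbours centred on i
        for a in range(k):
            for b in range(a + 1, k):
                if W[nbrs[a]][nbrs[b]] == 1:
                    closed += 1  # the two neighbours are themselves connected
    return closed / triples if triples else 0.0


print(group_size(W), degrees(W), density(W), global_clustering(W))
```

In this toy group there are 4 edges out of 6 possible, and 3 of the 5 connected triples are closed.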

  1. Please compute the group size, network density, and network clustering coefficient of each group and report which group has the highest density and which group has the highest clustering coefficient.
  2. Please use regression analysis to comment on whether group size, network density, and network clustering coefficient affect the number of Samsung Note II adoptions in each group.
  3. Please pick one group and use the R package “igraph” to visualize its communication network. Use a different color to mark the nodes that adopted the Samsung Note II, and use node size to reflect each node’s degree centrality.
  4. You are encouraged to incorporate network statistics at the individual level (e.g., degree centrality) or the global level (e.g., density) into your Logit model to study customers’ adoption of the Samsung Note II. Please specify your model and report your estimation results (including coefficients and standard errors).

  5. Based on your estimated model, predict Samsung Note II adoptions for another 20 groups (indexed from 21 to 40) in the data set “prediction.” The data set “prediction” contains information on customer characteristics and networks, but no information on Samsung Note II adoption. Based on your predictions, please rank the top 10 groups by the number of predicted adoptions in each group.
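
The prediction step in question 5 can be sketched as follows, in plain Python. Everything numeric here is made up for illustration: the coefficient values in `beta`, the covariate names, and the six toy customers are hypothetical and do not come from the course data or from any fitted model. The sketch also assumes that the "number of predicted adoptions" in a group is the sum of predicted probabilities. Counting customers whose probability exceeds a cutoff such as 0.5 is an alternative and may rank groups differently.

```python
import math
from collections import defaultdict

# Hypothetical Logit coefficients (NOT estimates from the course data).
beta = {"intercept": -2.0, "age": -0.01, "male": 0.3, "smartphone": 0.8, "degree": 0.05}

# Hypothetical customers from the "prediction" groups (illustration only).
customers = [
    {"group": 21, "age": 30, "male": 1, "smartphone": 1, "degree": 12},
    {"group": 21, "age": 45, "male": 0, "smartphone": 0, "degree": 3},
    {"group": 21, "age": 25, "male": 0, "smartphone": 1, "degree": 20},
    {"group": 22, "age": 50, "male": 1, "smartphone": 0, "degree": 2},
    {"group": 23, "age": 35, "male": 1, "smartphone": 1, "degree": 8},
    {"group": 23, "age": 28, "male": 0, "smartphone": 0, "degree": 5},
]


def predict_prob(x, beta):
    """Logit probability P(Y = 1 | x) = 1 / (1 + exp(-x'beta))."""
    z = beta["intercept"] + sum(beta[k] * x[k] for k in beta if k != "intercept")
    return 1.0 / (1.0 + math.exp(-z))


def rank_groups(customers, beta, top=10):
    """Sum predicted probabilities within each group and sort in descending order."""
    expected = defaultdict(float)
    for c in customers:
        expected[c["group"]] += predict_prob(c, beta)
    return sorted(expected.items(), key=lambda kv: kv[1], reverse=True)[:top]


ranking = rank_groups(customers, beta)
for group, expected_adoptions in ranking:
    print(group, round(expected_adoptions, 3))
```

With real data, `beta` would be replaced by the coefficients estimated on groups 1 to 20, and any network covariates such as degree would be computed from each prediction group's own adjacency matrix.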
