MSBAPM Research Project – Dr. Ramesh Shankar (Updated – 9/16/2018)
On March 11th, 2011, StackOverflow made a significant change to their leaderboard. They now show users with top reputation gained in the past week, rather than all-time high reputation users. The rationale from Jeff Atwood, senior StackOverflow executive, was (in this blog post: https://stackoverflow.blog/2011/03/01/redesigned-users-page/) that they wanted to make the users- leaderboard much more dynamic and more interesting as a destination. They also wanted to showcase the efforts of any active users, not just the all-time top users. This shows that the company wanted to encourage regular users to contribute more.
A causal question of interest to the company, and to designers of leaderboards, is: did this change cause newer users to put in more effort? After all, even before the change, people could see the weekly top users, it was just not the default display. By converting this to a default, however, the website was making it easier for regular users to gain visibility (there is one less link to click, to view them now).
Some things we might expect to see, after the change compared to before the change:
➢ Regular users might attract more viewers to their profile
➢ Regular users might put in more effort, so that more regular users show up on the weekly
leaderboard consecutive weeks
➢ Regular users might put in more effort the week after they show up on the weekly leaderboard
– this may be true to some extent even if a regular user just showed up on the leaderboard – etc.
What is the impact of recognition on effort? Does recognition lead to more or less effort in subsequent period? How does this vary with time?
- – Impact of showing up on leaderboard on subsequent performance o Weeklyleaderboard
o Monthlyleaderboard
- – How long does impact last?
- – Is the impact positive (more effort) or negative (less effort)?
- – Does it result in higher or lower quality?
- – Does the user become more risk averse (sticking to core areas) or more adventurous (venturing
out to newer areas)?
Your goal:
Retrieve and analyze data from https://data.stackexchange.com/ to answer the above questions. Prepare a 10-minute presentation of your findings.
*Updates* (also see next page)
Data schema:
https://meta.stackexchange.com/questions/2677/database-schema-documentation-for-the-public- data-dump-and-sede
You may need to compute the reputation points earned by candidates in particular time periods (week, month or year). To compute points for a particular user, you can get (count of number of upvotes their questions or answers got) * (10 points per answer-upvote or 5 points for question-upvote) + (count of number of downvotes their questions or answers got) * (-2 points per downvote – either for question or answer).
For a count of number of upvotes or downvotes, you need to join the following three tables:
- Users table with Posts table on Users.id = Posts.OwnerUserId
- Posts table with Votes table on Posts.id = Votes.PostId