Title: Speed Dating – How to Win Your Love
Abstract: We are interested in what makes a dating experience successful. For this project, we focus on one specific type of dating: speed dating. Using the speed dating dataset from Columbia Business School together with our own NLP analysis of Twitter data, we plan to explore this question at the worldwide, regional, and individual levels:
● Worldwide-level: How does the speed dating success rate differ from region to region? Here “region” means the city and country a participant originally comes from. Differences may be related to differing personalities across regions.
● Worldwide-level: Does the sentiment towards speed dating differ from region to region, and how does it relate to the participation rate in this project? For example, we expect regions with more positive comments on the topic to have more people participating in the project. We would also like to explore how sentiment relates to the speed dating success rate.
● Regional-level: For the same region, are there any differences in dating success rates across the three waves (2002, 2003, 2004)? We hope to use some form of time-series analysis to answer this.
● Individual-level: For each individual, what determines whether they find a match? For example, is their “declared goal for the evening” related to whether they find a match? What about “income”, “regular dating rate”, and “frequency of going out”?
● Individual-level: We would like to analyze expectations vs. reality by comparing the importance that one group believes potential suitors place on personal attributes with the importance those suitors actually report. For instance, do females think males value attractiveness in a date more than intelligence or ambition, and how does that compare with how males actually rate the importance of each of these attributes?
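The expectations-versus-reality comparison in the last bullet can be sketched as a difference of per-attribute means. This is a minimal Python sketch for illustration only, not the project's implementation (the proposal's own plotting tools are R packages). The attribute keys, the function names, and the assumption that each response maps every attribute to a numeric importance rating are all hypothetical.

```python
from statistics import mean

# Hypothetical attribute keys; the real dataset's column names may differ.
ATTRIBUTES = ["attractive", "sincere", "intelligent",
              "fun", "ambitious", "shared_interests"]


def mean_importance(responses):
    """Average each attribute's importance rating across respondents."""
    return {attr: mean(r[attr] for r in responses) for attr in ATTRIBUTES}


def perception_gap(perceived, actual):
    """Perceived minus actual mean importance, per attribute.

    perceived: ratings one group *believes* the other group would give.
    actual:    ratings the other group actually gave.
    A positive gap means the attribute's importance was overestimated.
    """
    perceived_mean = mean_importance(perceived)
    actual_mean = mean_importance(actual)
    return {attr: perceived_mean[attr] - actual_mean[attr]
            for attr in ATTRIBUTES}
```

Calling `perception_gap(female_beliefs, male_ratings)` with two lists of such response dictionaries would give one number per attribute, which is the quantity a point chart or side-by-side bar chart would display.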
Techniques: ggplot2, ggmap, interaction, Shiny, NLP text mining
Data Description:
● Dataset: https://data.world/annavmontoya/speed-dating-experiment
○ The data comes from speed dating events involving 550 people between 2002 and 2004.
○ During the events, each attendee had a 4-minute “first date” with every other participant of the opposite sex. When the 4 minutes ended, attendees were asked whether they would like to see their date again; we label a “yes” as a successful date.
○ The dataset also includes demographics, dating habits, self-perception across key attributes, beliefs about what others find valuable in a partner, and lifestyle information.
● Twitter Data
○ We also plan to scrape Tweets containing keywords such as “speed dating” and “dating”, queried by region, and extract sentiment data from them. This will give us additional information on the dating experience in different places. Given the popularity of Twitter, this dataset will complement the one above with a larger sample to study. Since we cannot follow the exact experiment subjects in the dataset above, this analysis will focus on aggregate-level results, aiming to identify region-specific trends and likes from Tweets.
○ The final visualization will be an interactive ggmap showing the 10 most frequent attributes scraped from Tweets by state or country, displayed as word clouds.
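The aggregate-level Tweet analysis described above (sentiment per region plus the most frequent words per region) could look roughly like the following. This is only a sketch under stated assumptions: it uses a simple word-count lexicon approach rather than any particular NLP library, the tiny word lists are hypothetical placeholders, and Tweets are assumed to arrive as (region, text) pairs that have already been collected.

```python
import re
from collections import Counter, defaultdict

# Hypothetical mini-lexicons for illustration; a real analysis would use
# an established sentiment lexicon and stopword list.
POSITIVE = {"fun", "great", "love", "lovely", "nice"}
NEGATIVE = {"awkward", "boring", "awful", "bad", "hate"}
STOPWORDS = {"a", "the", "was", "is", "and", "i", "it", "so", "to", "at", "my"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Count of positive tokens minus count of negative tokens."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def summarize_by_region(tweets, top_n=10):
    """Return mean sentiment and the top_n most frequent words per region.

    tweets: iterable of (region, text) pairs.
    """
    scores = defaultdict(list)
    words = defaultdict(Counter)
    for region, text in tweets:
        scores[region].append(sentiment_score(text))
        words[region].update(t for t in tokenize(text) if t not in STOPWORDS)
    return {
        region: {
            "mean_sentiment": sum(s) / len(s),
            "top_words": words[region].most_common(top_n),
        }
        for region, s in scores.items()
    }
```

The `top_words` list per region is the input a word cloud would need, and `mean_sentiment` is the value that could be compared against regional participation and success rates.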
Visualizations:
● Map: We would use ggmap to illustrate differences in speed dating success rate between regions across the world. We plan to use color depth to represent the success rate.
● Word cloud: For different regions, we could get the most popular words about speed dating through web scraping and plot them in a single chart for comparison.
● Line chart (time-series analysis): For the same region, we hope to illustrate whether the dating success rate changes over time for different groups of individuals (grouped by some attributes).
● Bar chart: For different groups of individuals (by race, gender, age, or income level), we hope to compare dating success rates to see whether there is substantial divergence across groups.
● Point chart: For each attribute (e.g., attractive, sincere, intelligent, fun, ambitious, and shared interests), we would plot the average importance ranking that males believe females assign to it in a male partner, alongside the average ranking that females actually assign. A side-by-side bar chart is another option for this plot.
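Several of the charts above (map, line chart, bar chart) rest on the same quantity: the share of dates labeled “yes” within a group. A minimal sketch of that aggregation follows, assuming each date is a record with a hypothetical `decision` field holding "yes" or "no" and a grouping field such as `wave`, `region`, or `gender`. These field names are illustrative and are not taken from the dataset.

```python
from collections import defaultdict


def success_rate_by(records, key):
    """Share of dates labeled "yes", grouped by the value of record[key]."""
    yes = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[key]
        total[group] += 1
        if rec["decision"] == "yes":
            yes[group] += 1
    return {group: yes[group] / total[group] for group in total}
```

Grouping by `wave` would give the points of the line chart, grouping by a demographic field would give the bar heights, and grouping by region would give the values mapped to color depth.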