Assignment – Social Media Data Extraction – Online
Through this assignment, you will scrape tweets from FedEx US and FedEx Latin America and analyze your results.
Directions to extract Twitter data are as follows:
Create a new Twitter app:
https://developer.twitter.com/en/apps
(Generate and save your consumer and access keys/secrets for use in the Python script.)
When creating your app, you may use the following as an app description and app usage: “As part of an in-class training, I would like to scrape and analyze a set of tweets. The exercise will enable me to evaluate social media data and its power in deriving business insights.”
For the Website URL and Callback URL, you can use:
“https://test.com”
The remaining fields on the app creation form are optional.
Once you create a Twitter app, click on Keys and Tokens to access your unique set of consumer_key, consumer_secret, access_token, and access_token_secret (you will need these values for data extraction). Note: after logging in to the developer site above with my own Twitter account, I should have my own unique set of strings.
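A minimal sketch of one way to keep the four values out of the script itself, assuming they are stored in environment variables. The variable names (TWITTER_CONSUMER_KEY and so on) and the helpers load_credentials and build_api are hypothetical and are not part of the provided script. The tweepy calls assume the OAuth 1.0a handler as found in tweepy 3.x.

```python
import os

# Hypothetical environment variable names; any names work as long as
# they match what you set on your machine.
REQUIRED = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}


def build_api(creds):
    """Create an authenticated tweepy client (requires tweepy)."""
    import tweepy  # imported here so the loader above works without it

    auth = tweepy.OAuthHandler(
        creds["TWITTER_CONSUMER_KEY"], creds["TWITTER_CONSUMER_SECRET"]
    )
    auth.set_access_token(
        creds["TWITTER_ACCESS_TOKEN"], creds["TWITTER_ACCESS_TOKEN_SECRET"]
    )
    return tweepy.API(auth)
```

Calling build_api(load_credentials()) would then give a client object without any secret appearing in the .py file you submit.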
1. Download and install Python 3.7.3 on your machine.
2. Install pip (for automated package download and installation) using the get-pip.py file.
Hint: To install pip, save the get-pip.py file on your machine (C:\), and then run it (double-click).
3. Install tweepy using pip as follows.
Open the command prompt and run the following commands:
cd C:\Users\${username}\AppData\Local\Programs\Python\Python37
python -m pip install tweepy
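To confirm that the installation worked, a short check like the following can be run with the same Python interpreter. This is only a sketch; the helper name is_installed is hypothetical.

```python
import importlib.util
import sys


def is_installed(package_name):
    """Return True if this interpreter can find the named package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("tweepy installed:", is_installed("tweepy"))
```

If the second line prints False, pip most likely installed tweepy into a different Python installation than the one running the check.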
4. Extract the tweet range assigned to you (to prevent overlap).
Note: my assigned range is 2600-2700.
Note: the instructor provided sample code, and I have already circled the parts that need to be changed.
Hints:
For this step, you may update and use the code provided in the attached file FedEx Twitter Scrapper.py.
To edit the Python file, you can use the Python editor (IDLE). Right-click on the .py file and select Edit with IDLE.
First, update the code to use your own unique set of consumer_key, consumer_secret, access_key, and access_secret.
Second, update the “keyword” to include your assigned tweet range.
5. Pre-process your dataset in preparation for analysis
Remove any extra characters
Filter out any tweets not needed for the analysis, and justify each exclusion
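The two pre-processing steps above could be sketched as follows. The specific rules here (strip URLs and symbols, drop retweets and tweets that end up empty) are illustrative assumptions, not rules required by the assignment, and the function names are hypothetical. The pattern uses \w so that accented letters in Spanish or Portuguese tweets are kept.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Anything other than word characters, whitespace, and basic punctuation.
EXTRA_RE = re.compile(r"[^\w\s.,!?'@#-]")
SPACE_RE = re.compile(r"\s+")


def clean_text(text):
    """Remove URLs and stray symbols, then collapse repeated whitespace."""
    text = URL_RE.sub("", text)
    text = EXTRA_RE.sub("", text)
    return SPACE_RE.sub(" ", text).strip()


def filter_reason(text):
    """Return the reason a tweet is dropped, or None if it is kept."""
    if text.startswith("RT @"):
        return "retweet (duplicates another tweet)"
    if not clean_text(text):
        return "empty after cleaning"
    return None


def preprocess(texts):
    """Split raw tweet texts into cleaned keepers and (text, reason) drops."""
    kept, dropped = [], []
    for text in texts:
        reason = filter_reason(text)
        if reason is None:
            kept.append(clean_text(text))
        else:
            dropped.append((text, reason))
    return kept, dropped
```

Recording a reason for every dropped tweet gives you the justification text for the dataset description with no extra work.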
Assignment deliverables:
Description of your dataset, along with any rules used for data pre-processing
Supporting code (Note: only the extraction code and the extracted CSV file are needed)
Resulting dataset in a comma-separated values (csv) or spreadsheet (e.g., Excel) file
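For the csv deliverable, a minimal sketch using Python's built-in csv module is shown below. The column names (id, created_at, text), the helper name write_tweets, and the file name tweets.csv are assumptions; use whatever fields your extraction script actually collects.

```python
import csv

# Hypothetical column names for the output file.
FIELDNAMES = ["id", "created_at", "text"]


def write_tweets(rows, handle):
    """Write an iterable of dicts keyed by FIELDNAMES to an open text file."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Example usage with a hypothetical list of row dicts named rows:
# with open("tweets.csv", "w", newline="", encoding="utf-8") as f:
#     write_tweets(rows, f)
```

Opening the file with newline="" lets the csv module control line endings, and the writer quotes commas and quotation marks inside tweet text so they do not break the columns.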