If you are a regular Twitter user, you may find yourself wanting to collect all your tweets for one reason or another. I was looking to do some sort of sentiment analysis on a large dataset, and it made sense to go to Twitter … but you can only get 1500 tweets at a time with Twitter’s API.
After looking around forums I couldn’t find a reasonable solution. So one way to tackle this is to build up a database over time: just store the tweets you want locally. Since I was going to use this with R, I wanted to collect data with R.
To do this, we can use the twitteR package to communicate with Twitter and save our data.
Once you have the script in place, you can run a cron job to execute the script every day, week, hour or whatever you see fit.
We will be using three libraries in our R script. Let’s load them into our environment:

    library(twitteR)
    library(RCurl)
    library(ROAuth)

We will need to set up an SSL certificate (especially if you are on Windows). We do this using the following line of code:

    options(RCurlOptions = list(
      cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

Let’s set up our API variables to connect to Twitter.

    reqURL    <- "https://api.twitter.com/oauth/request_token"
    accessURL <- "https://api.twitter.com/oauth/access_token"
    authURL   <- "http://api.twitter.com/oauth/authorize"

You will need to get an API key and API secret from Twitter’s developer site https://apps.twitter.com/app/new. Make sure you give read, write, and direct message permissions to your newly created application.

    apiKey    <- "YOUR-KEY"
    apiSecret <- "YOUR-SECRET"

Let’s put everything together and try to authorize our application.

    twitCred <- OAuthFactory$new(
      consumerKey    = apiKey,
      consumerSecret = apiSecret,
      requestURL     = reqURL,
      accessURL      = accessURL,
      authURL        = authURL)

Now let’s connect by doing a handshake …

    twitCred$handshake()

You will get a message to follow a link and get a confirmation code to input into your R console. The message will look like this:

    To enable the connection, please direct your web browser to:
    http://api.twitter.com/oauth/authorize?oauth_token=4Ybnjkljkljlst5cvO5t3nqd8bhhGqTL3nQ
    When complete, record the PIN given to you and provide it here:

Now we can save our credentials for next time!

    registerTwitterOAuth(twitCred)
    save(list = "twitCred", file = "credentials")

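For unattended runs (for example from cron), the saved credentials can be loaded back instead of repeating the handshake. Below is a minimal sketch of that save-then-load pattern in base R. The helper name `load_or_create()` and the stand-in credential object are assumptions made for illustration; they are not part of twitteR or ROAuth. In the real script, the `create` step would be the handshake shown above, and you would still call registerTwitterOAuth() on whatever comes back.

```r
# Hypothetical helper: reuse a saved object if the file exists,
# otherwise build it once with `create()` and save it for next time.
load_or_create <- function(path, create) {
  if (file.exists(path)) {
    env <- new.env()
    load(path, envir = env)          # restores the object named "twitCred"
    return(env$twitCred)
  }
  twitCred <- create()
  save(list = "twitCred", file = path)
  twitCred
}

# Stand-in for the OAuth object so the sketch runs without network access.
# In the real script, `create` would build twitCred and call twitCred$handshake().
cred_file <- file.path(tempdir(), "credentials-demo")
unlink(cred_file)                    # start from a clean state

first  <- load_or_create(cred_file, function() list(consumerKey = "YOUR-KEY"))
second <- load_or_create(cred_file, function() stop("handshake should not run again"))
```

The second call never reaches `create`, which is the behaviour a scheduled job needs: the PIN is entered once by hand, and later runs only read the file.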
Now that we are connected we can put in our queries:
I chose key words related to Kuwait.

    query  <- "kuwaiti,kuwait,#kuwait,#q8"
    query  <- unlist(strsplit(query, ","))
    tweets <- list()

Now we are ready to ask Twitter for tweets on our key words. In the following block of code we loop through our vector of key words; in this case we loop 4 times for our 4 key words.
We use twitteR’s function searchTwitter(), which takes the query as a parameter. We also supply additional parameters: n, the number of tweets, and geocode, a latitude, longitude and radius (in our example we search within an 80 mile radius of Kuwait City).

    for (i in seq_along(query)) {
      result <- searchTwitter(query[i], n = 1500,
                              geocode = '29.3454657,47.9969453,80mi')
      tweets <- c(tweets, result)
      tweets <- unique(tweets)
    }

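The geocode argument is a single string of the form latitude,longitude,radius, with the radius given in "mi" or "km". As a small sketch, a helper like the one below builds that string from its parts so the coordinates and radius are easier to change. The name `make_geocode()` is hypothetical and not part of twitteR.

```r
# Hypothetical helper: assemble the "lat,lon,radius<unit>" string
# expected by the geocode argument.
make_geocode <- function(lat, lon, radius, unit = "mi") {
  unit <- match.arg(unit, c("mi", "km"))
  paste(lat, lon, paste0(radius, unit), sep = ",")
}

# Same search area as in the loop above: 80 miles around Kuwait City
geocode <- make_geocode(29.3454657, 47.9969453, 80)
```

The result can then be passed along as `geocode = geocode` in the call to searchTwitter().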
That’s it, we have our data. All that needs to be done now is save it.
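write.csv() cannot append to an existing CSV file (it ignores the append argument), so what we will do is read the existing file if there is one, merge it with our new tweets, drop the duplicates and write everything back:

    # Create a placeholder for the file
    file <- NULL
    # Check if tweets.csv exists; read ids as text so long tweet ids are not rounded
    if (file.exists("tweets.csv")) {
      file <- read.csv("tweets.csv", colClasses = c(id = "character"))
    }
    # Merge the data in the file with our new tweets
    df <- do.call("rbind", lapply(tweets, as.data.frame))
    df <- rbind(df, file)
    # Remove duplicates
    df <- df[!duplicated(df$id), ]
    # Save
    write.csv(df, file = "tweets.csv", row.names = FALSE)
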
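Two details of this step are easy to trip over: tweet ids are longer than a double can hold exactly, so they are safer read back as text, and the merge has to cope with the file not existing yet. The sketch below replays the same read, merge, de-duplicate, write cycle on made-up data frames in a temporary file. The function name `merge_tweets()` and the sample rows are assumptions for illustration only.

```r
# Hypothetical helper: merge new rows into the CSV at `path`,
# keeping one row per tweet id.
merge_tweets <- function(new_df, path) {
  old_df <- NULL
  if (file.exists(path)) {
    # Read ids as text so long ids are not rounded to the nearest double
    old_df <- read.csv(path, colClasses = c(id = "character"),
                       stringsAsFactors = FALSE)
  }
  df <- rbind(new_df, old_df)          # rbind() with NULL returns new_df unchanged
  df <- df[!duplicated(df$id), ]
  write.csv(df, file = path, row.names = FALSE)
  df
}

# Made-up data standing in for two scheduled runs that overlap by one tweet
path <- tempfile(fileext = ".csv")
run1 <- data.frame(id = c("450000000000000001", "450000000000000002"),
                   text = c("first", "second"), stringsAsFactors = FALSE)
run2 <- data.frame(id = c("450000000000000002", "450000000000000003"),
                   text = c("second", "third"), stringsAsFactors = FALSE)

merge_tweets(run1, path)
all_tweets <- merge_tweets(run2, path)
```

After the second call, the file holds three rows: the overlapping tweet is stored once, which is what lets the database grow over repeated runs.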
For your convenience, the code in one block:

    # Load libraries
    library(twitteR)
    library(RCurl)
    library(ROAuth)

    # SSL Certificate
    options(RCurlOptions = list(
      cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

    # API URLs
    reqURL    <- "https://api.twitter.com/oauth/request_token"
    accessURL <- "https://api.twitter.com/oauth/access_token"
    authURL   <- "http://api.twitter.com/oauth/authorize"

    # API Keys from https://apps.twitter.com/app/new
    apiKey    <- "YOUR-KEY"
    apiSecret <- "YOUR-SECRET"

    # Connect to Twitter to get credentials
    twitCred <- OAuthFactory$new(
      consumerKey    = apiKey,
      consumerSecret = apiSecret,
      requestURL     = reqURL,
      accessURL      = accessURL,
      authURL        = authURL)

    # Twitter Handshake - you will need to get the PIN after this
    twitCred$handshake()

    # Optionally save credentials for later
    registerTwitterOAuth(twitCred)
    save(list = "twitCred", file = "credentials")

    # Set up the query
    query  <- "kuwaiti,kuwait,#kuwait,#q8"
    query  <- unlist(strsplit(query, ","))
    tweets <- list()

    # Loop through the keywords and store results
    for (i in seq_along(query)) {
      result <- searchTwitter(query[i], n = 1500,
                              geocode = '29.3454657,47.9969453,80mi')
      tweets <- c(tweets, result)
      tweets <- unique(tweets)
    }

    # Create a placeholder for the file
    file <- NULL

    # Check if tweets.csv exists; read ids as text so long tweet ids are not rounded
    if (file.exists("tweets.csv")) {
      file <- read.csv("tweets.csv", colClasses = c(id = "character"))
    }

    # Merge the data in the file with our new tweets
    df <- do.call("rbind", lapply(tweets, as.data.frame))
    df <- rbind(df, file)

    # Remove duplicates
    df <- df[!duplicated(df$id), ]

    # Save
    write.csv(df, file = "tweets.csv", row.names = FALSE)

    # Done!

Hello,
I want to know if you have the same script but to use the REST API to take the last 7 days of tweets with keywords.
Many thanks.
Best Regards,
Vasco
Hi!
I have not adapted the script for the REST API but you can find a solution here perhaps:
http://www.joyofdata.de/blog/talking-to-twitters-rest-api-v1-1-with-r/
Let me know if you find anything useful. If not I’ll rewrite the script for you 🙂
Good luck!
Salem
hello sir
sir i am trying to Run your code in R, as i enter the code
"twitCred$handshake()" it gives an error of "Error: Authorization Required". kindly help me out in solving this problem…
many thanks
Make sure your API Keys are correct ~
Hey, thanks for sharing. I’m not able to use a cron job because it requires manual authentication each time. Any clues?
my bad, it works well
SM,
Very lucid piece of writing! I enjoyed reading through your posts and I was able to reverse engineer a portion of your code for some Tweets analysis. Thank you so much for making it accessible to beginners of data analytics using R.
Hi Can you please let me know how can i extract a list of users tweets and would like to save into a file???
Thank you for your contribution. I am able to download the tweets as described in some links. Now, I would like to build a database of tweets to be continuously enriched so as to overcome the constraint of the 18,000 tweets and the 7 days in the free API. Can you help me?