Twitter Datasets and Derivative Data

Tweets to Donald Trump (@realDonaldTrump)

59,261,490 tweet ids for tweets directed at Donald Trump (@realDonaldTrump), collected with Documenting the Now's twarc. Tweets can be “rehydrated” with Documenting the Now’s twarc, or Hydrator.

twarc hydrate to_realdonaldtrump_ids.txt > to_realdonaldtrump.jsonl
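
Hydrating a set this large takes a while. The sketch below gives a rough lower bound on how long, under assumed rate limits of 100 ids per lookup request and 900 requests per 15-minute window. Those numbers and the function name are assumptions for illustration, not figures from this dataset's documentation, and they may not match the limits that apply to your account.

```python
import math


def estimate_hydration_hours(n_ids, ids_per_request=100,
                             requests_per_window=900, window_minutes=15):
    """Rough lower bound on hydration time under the assumed rate limits."""
    # Each lookup request can carry at most ids_per_request ids.
    requests = math.ceil(n_ids / ids_per_request)
    # Requests are capped per rate-limit window.
    windows = requests / requests_per_window
    return windows * window_minutes / 60


if __name__ == "__main__":
    hours = estimate_hydration_hours(59_261_490)
    print(f"about {hours:.0f} hours ({hours / 24:.1f} days)")
```

Deleted or protected tweets, network errors, and restarts will all change the real figure, so treat the result as a planning estimate only.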

From May 7, 2017 to June 21, 2017, tweets were collected with a combination of the Filter (Streaming) API and the Search API. The Filter API failed on June 21, 2017. From June 23, 2017 onward, only the Search API has been used, with collection run every 5 days by a cron job.
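
When a collection is built from repeated scheduled runs, successive runs may return some of the same tweets. The source does not say how overlaps were handled for this dataset, so the following is only a minimal sketch of one way to merge per-run id files into a single de-duplicated list. The function name and file names are hypothetical.

```python
def merge_id_files(paths, out_path):
    """Write the union of tweet ids from several files, one per line,
    keeping the order in which each id was first seen."""
    seen = set()
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as f:
                for line in f:
                    tweet_id = line.strip()
                    # Skip blank lines and ids already written.
                    if tweet_id and tweet_id not in seen:
                        seen.add(tweet_id)
                        out.write(tweet_id + "\n")
    return len(seen)


# Hypothetical usage:
# total = merge_id_files(["run_2017-06-23.txt", "run_2017-06-28.txt"], "merged_ids.txt")
```

This keeps every id in memory, which is fine for a few million ids. For tens of millions, an on-disk approach such as sort -u may be a better fit.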

Collection is ongoing, and this dataset will be periodically updated with additional tweet id sets.

#JeffSessions tweets

2,278,757 tweet ids for #JeffSessions collected with Documenting the Now's twarc. Tweets can be “rehydrated” with Documenting the Now’s twarc, or Hydrator.

twarc hydrate to_realdonaldtrump_ids.txt > to_realdonaldtrump.jsonl

#paradisepapers tweets November 3-25, 2017

1,797,260 tweet ids for #paradisepapers collected with Documenting the Now’s twarc from November 5-26, 2017. Tweets can be “rehydrated” with Documenting the Now’s twarc (https://github.com/DocNow/twarc), using twarc.py hydrate paradisepapers_ids.txt > paradisepapers.json, or with Documenting the Now’s Hydrator (https://github.com/DocNow/hydrator).

#climatemarch tweets April 19-May 3, 2017

681,668 tweet ids for #climatemarch collected with Documenting the Now’s twarc from April 19-May 3, 2017. Tweets can be “rehydrated” with Documenting the Now’s twarc (https://github.com/DocNow/twarc). twarc.py hydrate climatemarch_tweet_ids.txt > climatemarch.json.

#MarchForScience tweets April 12-26, 2017

1,276,220 tweet ids for #MarchForScience collected with Documenting the Now’s twarc from April 12-26, 2017. Tweets can be “rehydrated” with Documenting the Now’s twarc (https://github.com/DocNow/twarc). twarc.py hydrate MarchForScience_tweet-ids.txt > MarchForScience.json.

#WomensMarch tweets January 12-28, 2017

14,478,518 tweet ids for #WomensMarch collected with Documenting the Now’s twarc from January 21-28, 2017. Tweets can be “rehydrated” with Documenting the Now’s twarc (https://github.com/DocNow/twarc). twarc.py --hydrate WomensMarch_tweet_ids.txt > WomensMarch.json. Also included are the log files for the Filter API and Search API queries. The Filter API log captures the cumulative number of dropped tweets.

The fall of Aleppo tweets; Aleppo 2016-12-13 through 2016-12-29

8,595,589 tweet ids for aleppo tweets captured during the fall of Aleppo in December 2016. Tweets can be “rehydrated” with Documenting the Now’s twarc (https://github.com/DocNow/twarc). twarc.py --hydrate aleppo_tweet_ids.txt > aleppo.json

Tweet ids for final Tragically Hip concert

228,086 tweet ids for “TheHip, hipinkingston” captured during the Tragically Hip’s final concert in Kingston, Ontario in August 2016. Tweets can be “rehydrated” with Documenting the Now’s twarc (https://github.com/DocNow/twarc). twarc.py --hydrate th_final_concert_kingston_tweet_ids.txt > th_final_concert_kingston.json

#YMMfire tweets

Tweet ids for #YMMfire tweets. Tweets can be “hydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate ymmfire-ids.txt > ymmfire-tweets.json. Hydrating will recreate the original tweet(s) in json format, provided the content is still available on Twitter.
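
Because hydration only returns tweets that still exist, it can be useful to check which ids did not come back. The following is a minimal sketch, assuming the hydrated file holds one JSON tweet object per line and that each object carries an id_str field. The function name is hypothetical.

```python
import json


def missing_ids(ids_path, hydrated_path):
    """Return the ids in ids_path that have no tweet in the hydrated file."""
    # Collect the ids of every tweet that was successfully hydrated.
    with open(hydrated_path, encoding="utf-8") as f:
        found = {json.loads(line)["id_str"] for line in f if line.strip()}
    # Keep the original order of the id file for the ids that are absent.
    with open(ids_path, encoding="utf-8") as f:
        stripped = (line.strip() for line in f)
        return [tid for tid in stripped if tid and tid not in found]


# Example usage with the file names from the command above:
# gone = missing_ids("ymmfire-ids.txt", "ymmfire-tweets.json")
# print(len(gone), "tweets are no longer available")
```

Ids are compared as strings so that large tweet ids are never converted to numbers.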

#jcdl2016 tweets

Tweet ids for #jcdl2016 tweets. Tweets can be “hydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate jcdl2016-tweet-ids.txt > jcdl2016-tweets.json. Hydrating will recreate the original tweet(s) in json format, provided the content is still available on Twitter.

#thechalkening tweets

Tweet ids for #thechalkening tweets. Tweets can be “hydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate thechalkening-ids-20160412.txt > thechalkening-20160412-tweets.json. Hydrating will recreate the original tweet(s) in json format, provided the content is still available on Twitter.

#panamapapers tweets

Tweet ids for #panamapapers tweets. Tweets can be “hydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate panamapapers-ids-20160413.txt > panamapapers-20160413-tweets.json. Hydrating will recreate the original tweet(s) in json format, provided the content is still available on Twitter.

#NDP2016 tweets

#MakeDonaldDrumpfAgain tweets

Derivative data for #MakeDonaldDrumpfAgain tweets. Tweets can be “hydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate MakeDonaldDrumpfAgain-tweet-ids.txt > MakeDonaldDrumpfAgain.json. Hydrating will recreate the original tweet(s) in json format, provided the content is still available on Twitter. This dataset is the combination of hydrated http://hdl.handle.net/10864/11310 tweet ids, and http://hdl.handle.net/10864/11270.

#paris #Bataclan #parisattacks #porteouverte tweets

Tweet ids for #paris #Bataclan #parisattacks #porteouverte tweets. Tweets can be “hydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate paris-tweet-ids.txt > paris-tweets.json. Hydrating will recreate the original tweet(s) in json format, provided the content is still available on Twitter.

#elxn42 tweets (42nd Canadian Federal Election)

Tweet ids for #elxn42 tweets. Tweets can be “hydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate elxn42-tweet-ids.txt > elxn42-tweets.json. Hydrating will recreate the original tweet(s) in json format, provided the content is still available on Twitter. This dataset is the combination of hydrated http://hdl.handle.net/10864/11310 tweet ids, and http://hdl.handle.net/10864/11270.

#JeSuisCharlie, #JeSuisAhmed, #JeSuisJuif, #CharlieHebdo tweets

Tweet ids for #JeSuisCharlie, #JeSuisAhmed, #JeSuisJuif, #CharlieHebdo tweets. Tweets can be “rehydrated” with Ed Summers’ twarc (https://github.com/edsu/twarc). twarc.py --hydrate %23JeSuisCharlie-ids-20150112.txt > %23JeSuisCharlie-tweets-20150112.json
