Preliminary stats for #JeSuisCharlie, #JeSuisAhmed, #JeSuisJuif, #CharlieHebdo

#JeSuisAhmed

$ wc -l *json
    148479 %23JeSuisAhmed-20150109103430.json
     94874 %23JeSuisAhmed-20150109141746.json
      5885 %23JeSuisAhmed-20150112092647.json
    249238 total
$ du -h
2.7G	.

#JeSuisCharlie

$ wc -l *json
    3894191 %23JeSuisCharlie-20150109094220.json
    1758849 %23JeSuisCharlie-20150109141730.json
     226784 %23JeSuisCharlie-20150112092710.json
         15 %23JeSuisCharlie-20150112092734.json
    5879839 total
$ du -h
32G	.

#JeSuisJuif

$ wc -l *json
    23694 %23JeSuisJuif-20150109172957.json
    50603 %23JeSuisJuif-20150109173104.json
     5941 %23JeSuisJuif-20150110003450.json
    42237 %23JeSuisJuif-20150112094500.json
     5064 %23JeSuisJuif-20150112094648.json
   127539 total
$ du -h
671M    .

#CharlieHebdo

$ wc -l *json
    4444585 %23CharlieHebdo-20150109172713.json
        108 %23CharlieHebdo-20150109172825.json
    1164717 %23CharlieHebdo-20150109172844.json
    1068074 %23CharlieHebdo-20150112094427.json
      69446 %23CharlieHebdo-20150112094446.json
     185263 %23CharlieHebdo-20150112155558.json
    6932193 total
$ du -h
39G     .

Total

These numbers are preliminary and not yet deduplicated: roughly 74.4G of data and 13,188,809 tweets after 5.5 days of capturing the four hashtags.
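As a quick sanity check, the totals can be re-derived from the per-hashtag `wc -l` and `du -h` figures listed above. This is only a sketch: the variable names are made up for illustration, and it assumes that `du -h` is reporting in powers of 1024, so 671M is treated as 671/1024 G.

```shell
#!/bin/sh
# Per-hashtag line counts, copied from the "total" rows of the wc -l output above.
# Variable names are illustrative only.
ahmed=249238
charlie=5879839
juif=127539
hebdo=6932193

# Sum the tweet (line) counts with shell integer arithmetic.
total_tweets=$((ahmed + charlie + juif + hebdo))

# Sum the du -h sizes; awk handles the decimals.
# Assumption: 671M is converted to G by dividing by 1024.
total_size=$(awk 'BEGIN { printf "%.1f", 2.7 + 32 + 671/1024 + 39 }')

echo "tweets: $total_tweets"
echo "size:   ${total_size}G"
```

Since the captures are not yet deduplicated, both figures are upper bounds on the number of unique tweets and the space they occupy.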

Nick Ruest
Associate Librarian