Metadata-Version: 2.1
Name: fch
Version: 1.0.12
Summary: A Python library to find historical Twitter follower count using the web archives
Home-page: https://github.com/oduwsdl/FollowerCountHistory
Author: Mohammed Nauman Siddique
Author-email: [email protected]
License: MIT
Description: # Twitter Follower Count History via Web Archives
Follower Count History is a Python module that collects historical Twitter follower counts for a given Twitter handle from the web archives using [MemGator](https://github.com/oduwsdl/MemGator). The module extracts the follower count by matching CSS selectors against the follower count element on archived Twitter pages, covering almost every major overhaul of the page layout. By default, the program collects all of the memento data points.
[1] Mohammed Nauman Siddique. 2020. Historical Twitter Follower Count Via Web Archives. (August 2020). Retrieved August 05, 2020 from https://ws-dl.blogspot.com/2020/08/2020-08-05-historical-twitter-follower.html
[2] Miranda Smith. 2018. Twitter Follower Count History via the Internet Archive. (March 2018). Retrieved July 25, 2020 from https://ws-dl.blogspot.com/2018/03/2018-03-14-twitter-follower-count.html
## Installation and Usage
### Dependencies
* Python 3
* bs4
* warcio
* requests
* R (optional; needed only to create graphs)
### Usage
```shell
$ git clone https://github.com/oduwsdl/FollowerCountHistory.git
$ cd FollowerCountHistory
$ pip install -r requirements.txt
$ ./__main__.py [-h] [--st] [--et] [--freq] [-f] <Twitter handle/ Twitter URL>
```
#### Install from pypi
```shell
$ pip install fch
$ fch [-h] [--st] [--et] [--freq] [-f] <Twitter handle/ Twitter URL>
```
To create only the graph from an existing CSV file:
```shell
$ Rscript twitterFollowerCount.R <CSV file path>
```
### Docker
We have published a Docker image at [oduwsdl/fch](https://hub.docker.com/r/oduwsdl/fch) with the tag <b>2.0</b>, which can be used to run this tool as follows:
```
$ docker container run --rm -it -v <Output Directory>:/app -u $(id -u):$(id -g) oduwsdl/fch:2.0 [options] <Twitter Handle>
```
Example of output being mapped to the current directory
```
$ docker container run --rm -it -v $PWD:/app -u $(id -u):$(id -g) oduwsdl/fch:2.0 --out --st=20200101000000 --et=20200331000000 --freq=2592000 joebiden
```
Example of docker command for generating follower graph
```
$ docker container run --rm -it -v $PWD:/app -u $(id -u):$(id -g) --entrypoint /bin/bash oduwsdl/fch:2.0
I have no name!@736a209b64d6:/app$ ./fch/__main__.py --freq=2592000 joebiden| Rscript twitterFollowerCount.R
```
### Options
```
Follower Count History (fch)
positional arguments:
thandle Enter a Twitter handle/ URL
optional arguments:
-h, --help show this help message and exit
--st Memento start datetime (YYYYMMDDHHMMSS)
--et Memento end datetime (YYYYMMDDHHMMSS)
--freq Sampling frequency of mementos (in seconds)
-f Output file path (Supported Extensions: JSON and CSV)
-v, --version Report the version of fch
```
* --st: Defaults to Twitter's birth date (2006-03-21 12:00:00). It accepts the memento datetime in the [ISO 8601](https://www.iso.org/iso-8601-date-and-time-format.html) fourteen-digit variation (YYYYMMDDHHMMSS).
* --et: Defaults to the current datetime. It accepts the memento datetime in the [ISO 8601](https://www.iso.org/iso-8601-date-and-time-format.html) fourteen-digit variation (YYYYMMDDHHMMSS).
* --freq: Default is set to download all the mementos
* -f: Accepts JSON and CSV file paths for output. If no value is provided, output is returned to stdout in CSV format.
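The `--st`, `--et`, and `--freq` values can be prepared programmatically. The following is a minimal sketch using only the Python standard library; the helper names `to_memento_timestamp` and `from_memento_timestamp` are hypothetical and are not part of fch.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y%m%d%H%M%S"  # fourteen-digit form: YYYYMMDDHHMMSS


def to_memento_timestamp(dt):
    """Format a datetime as a fourteen-digit string (hypothetical helper)."""
    return dt.strftime(TS_FORMAT)


def from_memento_timestamp(ts):
    """Parse a fourteen-digit string back into a datetime (hypothetical helper)."""
    return datetime.strptime(ts, TS_FORMAT)


start = to_memento_timestamp(datetime(2020, 1, 1))
end = to_memento_timestamp(datetime(2020, 3, 31))

# --freq is expressed in seconds; a 30-day sampling interval is 30 * 24 * 3600.
freq = int(timedelta(days=30).total_seconds())

# Print the command line rather than running it.
print(f"fch --st={start} --et={end} --freq={freq} joebiden")
```

The printed command matches the form of the invocations shown elsewhere in this document.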
## Output
The program can generate output in JSON and CSV formats. The -f option writes the output to the supplied file path as CSV or JSON. By default, the module writes the output to stdout in CSV format.
### Output Fields
Field| Description
---------|------------
MementoDatetime | memento datetime in the [ISO 8601](https://www.iso.org/iso-8601-date-and-time-format.html) fourteen-digit variation
URIM | link to the memento (URI-M)
FollowerCount | follower count from the URI-M
AbsGrowth | follower count increase/decrease w.r.t. the first memento
RelGrowth | follower count increase/decrease w.r.t. the previous memento
AbsPerGrowth | percentage increase/decrease in follower count w.r.t. the first memento
RelPerGrowth | percentage increase/decrease in follower count w.r.t. the previous memento
AbsFolRate | daily Twitter follower growth rate w.r.t. the first memento
RelFolRate | daily Twitter follower growth rate w.r.t. the previous memento
### Sample Outputs
JSON Output
```json
[{
"MementoDatetime": "20200101001959",
"URIM": "https://web.archive.org/web/20200101001959/https://twitter.com/JoeBiden",
"FollowerCount": 4048208
}, {
"MementoDatetime": "20200131120028",
"URIM": "https://web.archive.org/web/20200131120028/https://twitter.com/joebiden",
"FollowerCount": 4142510
}, {
"MementoDatetime": "20200301001210",
"URIM": "https://web.archive.org/web/20200301001210/https://twitter.com/JoeBiden/",
"FollowerCount": 4202148
}]
```
CSV Output
```csv
MementoDatetime,URIM,FollowerCount,AbsGrowth,RelGrowth,AbsPerGrowth,RelPerGrowth,AbsFolRate,RelFolRate
20200101001959,https://web.archive.org/web/20200101001959/https://twitter.com/JoeBiden,4048208,0,0,0,0,0,0
20200131120028,https://web.archive.org/web/20200131120028/https://twitter.com/joebiden,4142510,94302,94302,2.33,2.33,0.0358,0.0358
20200301001210,https://web.archive.org/web/20200301001210/https://twitter.com/JoeBiden/,4202148,153940,59638,3.8,1.44,0.0297,0.02339
```
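The derived columns in the sample CSV above can be reproduced from the timestamps and follower counts alone. The sketch below is one possible reading, not the fch implementation: the function name `derive_growth` is hypothetical, and the rounding is chosen to match the sample. Note one assumption in particular: the sample values for AbsFolRate and RelFolRate appear consistent with growth divided by elapsed seconds, so the sketch divides by seconds.

```python
from datetime import datetime

TS_FORMAT = "%Y%m%d%H%M%S"  # fourteen-digit memento datetime


def derive_growth(samples):
    """Compute growth columns for (MementoDatetime, FollowerCount) pairs.

    Hypothetical helper; `samples` must be ordered oldest first and the
    first follower count must be non-zero.
    """
    first_ts, first_count = samples[0]
    prev_ts, prev_count = samples[0]
    rows = []
    for ts, count in samples:
        now = datetime.strptime(ts, TS_FORMAT)
        abs_secs = (now - datetime.strptime(first_ts, TS_FORMAT)).total_seconds()
        rel_secs = (now - datetime.strptime(prev_ts, TS_FORMAT)).total_seconds()
        abs_growth = count - first_count  # w.r.t. the first memento
        rel_growth = count - prev_count   # w.r.t. the previous memento
        rows.append({
            "MementoDatetime": ts,
            "FollowerCount": count,
            "AbsGrowth": abs_growth,
            "RelGrowth": rel_growth,
            "AbsPerGrowth": round(100 * abs_growth / first_count, 2),
            "RelPerGrowth": round(100 * rel_growth / prev_count, 2),
            # Assumption: rate = growth / elapsed seconds (0 for the first row).
            "AbsFolRate": round(abs_growth / abs_secs, 5) if abs_secs else 0,
            "RelFolRate": round(rel_growth / rel_secs, 5) if rel_secs else 0,
        })
        prev_ts, prev_count = ts, count
    return rows


rows = derive_growth([
    ("20200101001959", 4048208),
    ("20200131120028", 4142510),
    ("20200301001210", 4202148),
])
for row in rows:
    print(row)
```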
### Output to stdout
```shell
$ fch --st=20200101000000 --et=20200331000000 --freq=2592000 joebiden
```
### Output to files
**Command to return output to the file path**
```shell
$ fch --st=20200101000000 --et=20200331000000 --freq=2592000 -f=output/joebiden.csv joebiden
$ fch --st=20200101000000 --et=20200331000000 --freq=2592000 -f=output/joebiden.json joebiden
```
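Once a CSV file has been written, it can be loaded for further analysis with the standard library. The following is a minimal sketch, assuming the column names shown in the sample CSV output; `load_rows` is a hypothetical helper, and the sample text is inlined here so the example runs without a file on disk.

```python
import csv
import io

# Sample rows copied from the CSV output shown in this document.
sample_csv = """MementoDatetime,URIM,FollowerCount,AbsGrowth,RelGrowth,AbsPerGrowth,RelPerGrowth,AbsFolRate,RelFolRate
20200101001959,https://web.archive.org/web/20200101001959/https://twitter.com/JoeBiden,4048208,0,0,0,0,0,0
20200131120028,https://web.archive.org/web/20200131120028/https://twitter.com/joebiden,4142510,94302,94302,2.33,2.33,0.0358,0.0358
20200301001210,https://web.archive.org/web/20200301001210/https://twitter.com/JoeBiden/,4202148,153940,59638,3.8,1.44,0.0297,0.02339
"""


def load_rows(handle):
    """Read fch-style CSV rows and convert FollowerCount to int (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(handle):
        row["FollowerCount"] = int(row["FollowerCount"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(sample_csv))

# Find the memento with the highest follower count.
peak = max(rows, key=lambda r: r["FollowerCount"])
print(peak["MementoDatetime"], peak["FollowerCount"])
```

To read a file produced with `-f`, pass an open file handle instead, for example `load_rows(open("output/joebiden.csv", newline=""))`.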
**Command to create graphs for each handle**
```shell
$ Rscript twitterFollowerCount.R <file path>
```
* List of Graphs for each Twitter handle:
File Name| Description
---------|------------
`<Twitterhandle>`-follower-count.jpg| shows Twitter follower growth over time
`<Twitterhandle>`-follower-growth-relative.jpg| shows Twitter follower growth w.r.t. previous memento
`<Twitterhandle>`-follower-growth.jpg| shows absolute and percentage Twitter follower growth w.r.t. the first memento
`<Twitterhandle>`-follower-perc-growth-relative.jpg| shows Twitter follower growth over time w.r.t. previous memento in percentage
`<Twitterhandle>`-follower-rate-relative.jpg| shows new followers added per day w.r.t. previous memento
`<Twitterhandle>`-follower-rate.jpg| shows new followers added per day w.r.t. first memento
## Examples
* Command to find Twitter follower count for a Twitter handle from all the mementos since the account creation up until today
* Output to stdout as CSV
```shell
$ fch joebiden
```
* Output as CSV file
```shell
$ fch -f=joebiden.csv joebiden
```
* Command to find Twitter follower count for a Twitter handle with a monthly sampling of the mementos since the account creation up until today
```
Frequency = 3600*24*30
Frequency = 2592000
```
* Output to stdout as CSV
```shell
$ fch --freq=2592000 joebiden
```
* Output as CSV file
```shell
$ fch -f=joebiden.csv --freq=2592000 joebiden
```
* Command to find Twitter follower count for a Twitter handle with a monthly sampling of the mementos within a specified start and end timestamp
* Output to stdout as CSV
```shell
$ fch --st=20200101000000 --et=20200331000000 --freq=2592000 joebiden
```
* Output as CSV file
```shell
$ fch -f=joebiden.csv --st=20200101000000 --et=20200331000000 --freq=2592000 joebiden
```
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: System :: Archiving
Classifier: Topic :: System :: Archiving :: Backup
Provides: fch
Description-Content-Type: text/markdown