
Merge pull request #43 from andrlik/master
Mastodon support from @andrlik. Closes #42
tommeagher authored Feb 12, 2018
2 parents 1519bc9 + 467443f commit 80d51cd
Showing 4 changed files with 142 additions and 63 deletions.
47 changes: 33 additions & 14 deletions README.md
@@ -7,20 +7,22 @@ This project should work in the latest releases of Python 2.7 and Python 3. By d
## Setup

1. Clone this repo
2. Create a Twitter account that you will post to.
3. Sign into https://dev.twitter.com/apps with the same login and create an application. Make sure that your application has read and write permissions to make POST requests.
4. Make a copy of the `local_settings_example.py` file and name it `local_settings.py`
5. Take the consumer key (and secret) and access token (and secret) from your Twitter application and paste them into the appropriate spots in `local_settings.py`.
2. Make a copy of the `local_settings_example.py` file and name it `local_settings.py`
3. If posting to Twitter, create a Twitter account that you will post to.
4. Sign into https://dev.twitter.com/apps with the same login and create an application. Make sure that your application has read and write permissions to make POST requests.
5. Set `ENABLE_TWITTER_SOURCES` and `ENABLE_TWITTER_POSTING` to `True`. Take the consumer key (and secret) and access token (and secret) from your Twitter application and paste them into the appropriate spots in `local_settings.py`.
6. In `local_settings.py`, add the handle of the Twitter user you want your _ebooks account to be based on to `TWITTER_SOURCE_ACCOUNTS`. To make your tweets go live, change the `DEBUG` variable to `False`. (A sample sketch of these settings follows this list.)
7. Create an account at Heroku, if you don't already have one. [Install the Heroku toolbelt](https://devcenter.heroku.com/articles/quickstart#step-2-install-the-heroku-toolbelt) and set your Heroku login on the command line.
8. Type the command `heroku create` to generate the _ebooks Python app on the platform that you can schedule.
9. The only Python requirement for this script is [python-twitter](https://github.com/bear/python-twitter), the `pip install` of which is handled by Heroku automatically.
9. `git commit -am 'updated the local_settings.py'`
10. `git push heroku master`
11. Test your upload by typing `heroku run worker`. You should either get a response that says "3, no, sorry, not this time" or a message with the body of your post. If you get the latter, check your _ebooks Twitter account to see if it worked.
12. Now it's time to configure the scheduler. `heroku addons:create scheduler:standard`
13. Once that runs, type `heroku addons:open scheduler`. This will open up a browser window where you can adjust the time interval for the script to run. The scheduled command should be `python ebooks.py`. I recommend setting it at one hour.
14. Sit back and enjoy the fruits of your labor.
7. If you also want to include Mastodon as a source, set `ENABLE_MASTODON_SOURCES` to `True` (and `ENABLE_MASTODON_POSTING` to `True` if you want to post there). You'll need to create a Mastodon account to post to on an instance like [botsin.space](https://botsin.space).
8. After creating the Mastodon account, open a Python prompt in your project directory and follow the [directions below](#mastodon-setup). Update your `local_settings.py` file with the filenames of the generated client secret and user credential secret files.
9. Create an account at Heroku, if you don't already have one. [Install the Heroku toolbelt](https://devcenter.heroku.com/articles/quickstart#step-2-install-the-heroku-toolbelt) and set your Heroku login on the command line.
10. Type the command `heroku create` to generate the _ebooks Python app on the platform that you can schedule.
11. The only Python requirements for this script are [python-twitter](https://github.com/bear/python-twitter), Mastodon.py, and BeautifulSoup; Heroku handles the `pip install` of these automatically.
12. `git commit -am 'updated the local_settings.py'`
13. `git push heroku master`
14. Test your upload by typing `heroku run worker`. You should either get a response that says "3, no, sorry, not this time" or a message with the body of your post. If you get the latter, check your _ebooks Twitter account to see if it worked.
15. Now it's time to configure the scheduler. `heroku addons:create scheduler:standard`
16. Once that runs, type `heroku addons:open scheduler`. This will open up a browser window where you can adjust the time interval for the script to run. The scheduled command should be `python ebooks.py`. I recommend setting it at one hour.
17. Sit back and enjoy the fruits of your labor.
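
For reference, here is a minimal sketch of the `local_settings.py` values touched in steps 5-8. The keys, URLs, and filenames are placeholders, not working values:

```python
# Twitter (steps 5-6)
ENABLE_TWITTER_SOURCES = True      # fetch tweets from TWITTER_SOURCE_ACCOUNTS
ENABLE_TWITTER_POSTING = True      # post the generated status to Twitter
MY_CONSUMER_KEY = 'your consumer key'
MY_CONSUMER_SECRET = 'your consumer secret'
MY_ACCESS_TOKEN_KEY = 'your access token key'
MY_ACCESS_TOKEN_SECRET = 'your access token secret'
TWITTER_SOURCE_ACCOUNTS = ["some_account"]   # the account(s) to imitate
DEBUG = True                       # flip to False to post live

# Mastodon (steps 7-8)
ENABLE_MASTODON_SOURCES = False
ENABLE_MASTODON_POSTING = False
MASTODON_API_BASE_URL = "https://botsin.space"   # your instance
CLIENT_CRED_FILENAME = 'clientcred.secret'       # whatever filename you passed to create_app
USER_ACCESS_FILENAME = 'usercred.secret'         # whatever filename you passed to log_in
MASTODON_SOURCE_ACCOUNTS = ["@[email protected]"]
```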


## Configuring
@@ -72,8 +74,25 @@ After that, commit the change and `git push heroku master`. Then run the command

If you want to avoid hitting the Twitter API and instead want to use a static text file, you can do that. First, create a text file containing a Python list of quote-wrapped tweets. Then set the `STATIC_TEST` variable to `True`. Finally, specify the name of the text file using the `TEST_SOURCE` variable in `local_settings.py`.
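
For example, a hypothetical `TEST_SOURCE` file (say, `test_tweets.txt`) could contain a single line like this; the included `testcorpus.txt` follows the same format:

```python
["Just setting up my ebooks bot", "Another sample tweet to seed the Markov chain", "One more short sample tweet"]
```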

## Mastodon Setup

You only need to do this once!

```python
>>> from mastodon import Mastodon
>>> Mastodon.create_app('pytooterapp', api_base_url='YOUR INSTANCE URL', to_file='YOUR_FILENAME_HERE')
```

Then, create a user credential file. NOTE: Your bot has to follow your source account.

```python
>>> mastodon = Mastodon(client_id='YOUR_FILENAME_HERE', api_base_url='YOUR INSTANCE URL')
>>> mastodon.log_in('[email protected]', 'incrediblygoodpassword', to_file='YOUR USER FILENAME HERE')
```

Commit those two files to your repository and you can toot away.
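
If you want to sanity-check the credentials before wiring them into the bot, a quick test from the same Python prompt might look like the following (the filenames and instance URL are the placeholders from above); this mirrors how `ebooks.py` connects and toots:

```python
>>> from mastodon import Mastodon
>>> mastodon = Mastodon(client_id='YOUR_FILENAME_HERE',
...                     access_token='YOUR USER FILENAME HERE',
...                     api_base_url='YOUR INSTANCE URL')
>>> mastodon.toot('test toot from my ebooks bot')
```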

## Credit
As I said, this is based almost entirely on [@harrisj's](https://twitter.com/harrisj) [iron_ebooks](https://github.com/harrisj/iron_ebooks/). He created it in Ruby, and I wanted to port it to Python. All the credit goes to him. As a result, all of the blame for the clunky implementation in Python falls on me.

Many thanks to the [many folks who have contributed](CONTRIBUTORS.md) to the development of this project since it was open sourced in 2013. If you see ways to improve the code, please fork it and send a [pull request](https://github.com/tommeagher/heroku_ebooks/pulls), or [file an issue](https://github.com/tommeagher/heroku_ebooks/issues) for me, and I'll address it.
140 changes: 94 additions & 46 deletions ebooks.py
@@ -2,6 +2,7 @@
import re
import sys
import twitter
from mastodon import Mastodon
import markov
from bs4 import BeautifulSoup
try:
@@ -16,11 +17,15 @@
from local_settings import *


def connect():
return twitter.Api(consumer_key=MY_CONSUMER_KEY,
def connect(type='twitter'):
if type == 'twitter':
return twitter.Api(consumer_key=MY_CONSUMER_KEY,
consumer_secret=MY_CONSUMER_SECRET,
access_token_key=MY_ACCESS_TOKEN_KEY,
access_token_secret=MY_ACCESS_TOKEN_SECRET)
elif type == 'mastodon':
return Mastodon(client_id=CLIENT_CRED_FILENAME, api_base_url=MASTODON_API_BASE_URL, access_token=USER_ACCESS_FILENAME)
return None


def entity(text):
@@ -34,6 +39,8 @@ def entity(text):
pass
else:
guess = text[1:-1]
if guess == "apos":
guess = "lsquo"
numero = n2c[guess]
try:
text = chr(numero)
@@ -42,17 +49,18 @@
return text


def filter_tweet(tweet):
tweet.text = re.sub(r'\b(RT|MT) .+', '', tweet.text) # take out anything after RT or MT
tweet.text = re.sub(r'(\#|@|(h\/t)|(http))\S+', '', tweet.text) # Take out URLs, hashtags, hts, etc.
tweet.text = re.sub('\s+', ' ', tweet.text) # collapse consecutive whitespace into single spaces.
tweet.text = re.sub(r'\"|\(|\)', '', tweet.text) # take out quotes.
tweet.text = re.sub(r'\s+\(?(via|says)\s@\w+\)?', '', tweet.text) # remove attribution
htmlsents = re.findall(r'&\w+;', tweet.text)
def filter_status(text):
text = re.sub(r'\b(RT|MT) .+', '', text) # take out anything after RT or MT
text = re.sub(r'(\#|@|(h\/t)|(http))\S+', '', text) # Take out URLs, hashtags, hts, etc.
text = re.sub('\s+', ' ', text) # collapse consecutive whitespace into single spaces.
text = re.sub(r'\"|\(|\)', '', text) # take out quotes.
text = re.sub(r'\s+\(?(via|says)\s@\w+\)?', '', text) # remove attribution
text = re.sub(r'<[^>]*>', '', text)  # strip out HTML tags from Mastodon posts
htmlsents = re.findall(r'&\w+;', text)
for item in htmlsents:
tweet.text = tweet.text.replace(item, entity(item))
tweet.text = re.sub(r'\xe9', 'e', tweet.text) # take out accented e
return tweet.text
text = text.replace(item, entity(item))
text = re.sub(r'\xe9', 'e', text) # take out accented e
return text


def scrape_page(src_url, web_context, web_attributes):
@@ -96,7 +104,7 @@ def grab_tweets(api, max_id=None):
if user_tweets:
max_id = user_tweets[-1].id - 1
for tweet in user_tweets:
tweet.text = filter_tweet(tweet)
tweet.text = filter_status(tweet.text)
if re.search(SOURCE_EXCLUDE, tweet.text):
continue
if tweet.text:
@@ -105,6 +113,20 @@
pass
return source_tweets, max_id

def grab_toots(api, account_id=None,max_id=None):
if account_id:
source_toots = []
user_toots = api.account_statuses(account_id)
max_id = user_toots[len(user_toots)-1]['id']-1
for toot in user_toots:
if toot['in_reply_to_id'] or toot['reblog']:
pass #skip this one
else:
toot['content'] = filter_status(toot['content'])
if len(toot['content']) != 0:
source_toots.append(toot['content'])
return source_toots, max_id

if __name__ == "__main__":
order = ORDER
guess = 0
@@ -116,18 +138,18 @@ def grab_tweets(api, max_id=None):
sys.exit()
else:
api = connect()
source_tweets = []
source_statuses = []
if STATIC_TEST:
file = TEST_SOURCE
print(">>> Generating from {0}".format(file))
string_list = open(file).readlines()
for item in string_list:
source_tweets += item.split(",")
source_statuses += item.split(",")
if SCRAPE_URL:
source_tweets += scrape_page(SRC_URL, WEB_CONTEXT, WEB_ATTRIBUTES)
if SOURCE_ACCOUNTS and len(SOURCE_ACCOUNTS[0]) > 0:
source_statuses += scrape_page(SRC_URL, WEB_CONTEXT, WEB_ATTRIBUTES)
if ENABLE_TWITTER_SOURCES and TWITTER_SOURCE_ACCOUNTS and len(TWITTER_SOURCE_ACCOUNTS[0]) > 0:
twitter_tweets = []
for handle in SOURCE_ACCOUNTS:
for handle in TWITTER_SOURCE_ACCOUNTS:
user = handle
handle_stats = api.GetUser(screen_name=user)
status_count = handle_stats.statuses_count
@@ -141,53 +163,79 @@ def grab_tweets(api, max_id=None):
print("Error fetching tweets from Twitter. Aborting.")
sys.exit()
else:
source_tweets += twitter_tweets
source_statuses += twitter_tweets
if ENABLE_MASTODON_SOURCES and len(MASTODON_SOURCE_ACCOUNTS) > 0:
source_toots = []
mastoapi = connect(type='mastodon')
max_id=None
for handle in MASTODON_SOURCE_ACCOUNTS:
accounts = mastoapi.account_search(handle)
if len(accounts) != 1:
pass # Ambiguous search
else:
account_id = accounts[0]['id']
num_toots = accounts[0]['statuses_count']
if num_toots < 3200:
my_range = int((num_toots/200)+1)
else:
my_range = 17
for x in range(my_range)[1:]:
source_toots_iter, max_id = grab_toots(mastoapi,account_id, max_id=max_id)
source_toots += source_toots_iter
print("{0} toots found from {1}".format(len(source_toots), handle))
if len(source_toots) == 0:
print("Error fetching toots for %s. Aborting." % handle)
sys.exit()
source_statuses += source_toots
if len(source_statuses) == 0:
print("No statuses found!")
sys.exit()
mine = markov.MarkovChainer(order)
for tweet in source_tweets:
if not re.search('([\.\!\?\"\']$)', tweet):
tweet += "."
mine.add_text(tweet)

for status in source_statuses:
if not re.search('([\.\!\?\"\']$)', status):
status += "."
mine.add_text(status)
for x in range(0, 10):
ebook_tweet = mine.generate_sentence()
ebook_status = mine.generate_sentence()

# randomly drop the last word, as Horse_ebooks appears to do.
if random.randint(0, 4) == 0 and re.search(r'(in|to|from|for|with|by|our|of|your|around|under|beyond)\s\w+$', ebook_tweet) is not None:
if random.randint(0, 4) == 0 and re.search(r'(in|to|from|for|with|by|our|of|your|around|under|beyond)\s\w+$', ebook_status) is not None:
print("Losing last word randomly")
ebook_tweet = re.sub(r'\s\w+.$', '', ebook_tweet)
print(ebook_tweet)
ebook_status = re.sub(r'\s\w+.$', '', ebook_status)
print(ebook_status)

# if a tweet is very short, this will randomly add a second sentence to it.
if ebook_tweet is not None and len(ebook_tweet) < 40:
if ebook_status is not None and len(ebook_status) < 40:
rando = random.randint(0, 10)
if rando == 0 or rando == 7:
print("Short tweet. Adding another sentence randomly")
newer_tweet = mine.generate_sentence()
if newer_tweet is not None:
ebook_tweet += " " + mine.generate_sentence()
newer_status = mine.generate_sentence()
if newer_status is not None:
ebook_status += " " + mine.generate_sentence()
else:
ebook_tweet = ebook_tweet
ebook_status = ebook_status
elif rando == 1:
# say something crazy/prophetic in all caps
print("ALL THE THINGS")
ebook_tweet = ebook_tweet.upper()
ebook_status = ebook_status.upper()

# throw out tweets that match anything from the source account.
if ebook_tweet is not None and len(ebook_tweet) < 110:
for tweet in source_tweets:
if ebook_tweet[:-1] not in tweet:
if ebook_status is not None and len(ebook_status) < 210:
for status in source_statuses:
if ebook_status[:-1] not in status:
continue
else:
print("TOO SIMILAR: " + ebook_tweet)
print("TOO SIMILAR: " + ebook_status)
sys.exit()

if not DEBUG:
status = api.PostUpdate(ebook_tweet)
print(status.text.encode('utf-8'))
else:
print(ebook_tweet)

elif not ebook_tweet:
print("Tweet is empty, sorry.")
if ENABLE_TWITTER_POSTING:
status = api.PostUpdate(ebook_status)
if ENABLE_MASTODON_POSTING:
status = mastoapi.toot(ebook_status)
print(ebook_status)

elif not ebook_status:
print("Status is empty, sorry.")
else:
print("TOO LONG: " + ebook_tweet)
print("TOO LONG: " + ebook_status)
17 changes: 14 additions & 3 deletions local_settings_example.py
@@ -2,14 +2,25 @@
Local Settings for a heroku_ebooks account.
'''

# Twitter API configuration
# Configuration for Twitter API
ENABLE_TWITTER_SOURCES = True # Fetch twitter statuses as source
ENABLE_TWITTER_POSTING = True # Tweet resulting status?
MY_CONSUMER_KEY = 'Your Twitter API Consumer Key'
MY_CONSUMER_SECRET = 'Your Consumer Secret Key'
MY_ACCESS_TOKEN_KEY = 'Your Twitter API Access Token Key'
MY_ACCESS_TOKEN_SECRET = 'Your Access Token Secret'

# Sources (Twitter, local text file or a web page)
# Configuration for Mastodon API
ENABLE_MASTODON_SOURCES = False # Fetch mastodon statuses as a source?
ENABLE_MASTODON_POSTING = False # Toot resulting status?
MASTODON_API_BASE_URL = "" # an instance url like https://botsin.space
CLIENT_CRED_FILENAME = '' # the MASTODON client secret file you created for this project
USER_ACCESS_FILENAME = '' # The MASTODON user credential file you created at installation.

# Sources (Twitter, Mastodon, local text file or a web page)
SOURCE_ACCOUNTS = [""] # A list of comma-separated, quote-enclosed Twitter handles of account that you'll generate tweets based on. It should look like ["account1", "account2"]. If you want just one account, no comma needed.
TWITTER_SOURCE_ACCOUNTS = [""] # A list of comma-separated, quote-enclosed Twitter handles of accounts that you'll generate tweets based on. It should look like ["account1", "account2"]. If you want just one account, no comma needed.
MASTODON_SOURCE_ACCOUNTS = [""] # A list, e.g. ["@[email protected]"]
SOURCE_EXCLUDE = r'^$' # Source tweets that match this regexp will not be added to the Markov chain. You might want to filter out inappropriate words for example.
STATIC_TEST = False # Set this to True if you want to test Markov generation from a static file instead of the API.
TEST_SOURCE = ".txt" # The name of a text file of a string-ified list for testing. To avoid unnecessarily hitting Twitter API. You can use the included testcorpus.txt, if needed.
@@ -22,4 +33,4 @@
ORDER = 2 # How closely do you want this to hew to sensical? 2 is low and 4 is high.

DEBUG = True # Set this to False to start Tweeting live
TWEET_ACCOUNT = "" # The name of the account you're tweeting to.
1 change: 1 addition & 0 deletions requirements.txt
@@ -1,2 +1,3 @@
python-twitter
Mastodon.py
beautifulsoup4
