twitwatch

Twitwatch is a tool to listen to Twitter's streaming API and store tweets.

Requirements

Twitwatch needs a recent version of Python 2 and the Python requests library.

Usage

Copy settings_example.py to settings.py, then change the username, password, filter, and the directory in which files are stored. To run multiple crawlers, create additional dictionaries in settings.py. Start a crawler by running crawl.py followed by the name of the settings dictionary to use. With the example settings, this command starts the link crawler:

./crawl.py link
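As a rough illustration of the setup described above, a settings dictionary and the lookup by name might look like the sketch below. The key names, the placeholder values, and the helper get_crawler_settings are assumptions made for this example; settings_example.py remains the reference for the real layout.

```python
# Hypothetical contents of settings.py: one dictionary per crawler.
# The key names and values are assumptions; check settings_example.py.
link = {
    "username": "your_twitter_username",
    "password": "your_twitter_password",
    "filter": "http",                     # placeholder filter term
    "directory": "/tmp/twitwatch/link",   # placeholder output directory
}


def get_crawler_settings(namespace, name):
    """Return the settings dictionary called `name` from `namespace`.

    `namespace` is a mapping of names to values, for example
    vars(settings) after `import settings`.
    """
    value = namespace.get(name)
    if not isinstance(value, dict):
        raise KeyError("no settings dictionary named %r" % name)
    return value
```

With this layout, a second crawler would only need another dictionary in settings.py, and its name would be passed to crawl.py in place of link.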

Notes

The crawler automatically kills and restarts its crawling process every 15 minutes. This is a crude way to make sure that crawling resumes even if the connection to Twitter breaks down.
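The periodic restart described above can be sketched as a small supervisor loop. This is not taken from crawl.py: the function run_with_restarts, its parameters, and the use of subprocess are assumptions chosen to illustrate the idea.

```python
import subprocess
import sys
import time

RESTART_INTERVAL = 15 * 60  # seconds, matching the 15 minutes mentioned above

# Hypothetical command line; adjust to however the crawler is started.
CRAWL_COMMAND = [sys.executable, "crawl.py", "link"]


def run_with_restarts(command, interval=RESTART_INTERVAL, max_runs=None):
    """Run `command`, kill it after `interval` seconds, and start it again.

    Loops forever unless `max_runs` is given. Returns the number of runs.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        worker = subprocess.Popen(command)
        deadline = time.time() + interval
        # Poll instead of blocking so the loop notices an early exit.
        while time.time() < deadline and worker.poll() is None:
            time.sleep(0.1)
        if worker.poll() is None:
            worker.kill()  # still running after the interval: force a restart
        worker.wait()      # reap the child before starting the next one
        runs += 1
    return runs

# Example call (runs until interrupted):
#     run_with_restarts(CRAWL_COMMAND)
```

In this sketch a child that exits early is restarted right away, so a command that fails immediately would be relaunched in a fast loop; a short delay between runs would be a reasonable addition.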

This is based on @bde's TwitterStreamSaver, but it uses the Requests library instead of libcurl. The requests code is based on an example from Requests' documentation.
