TwitterUserBackfill

A roughly optimal (within reason) Perl script that pulls all tweets for a given user and writes them to the console.

Code overview

Designed for anyone who needs a one-off mass retrieval of historical tweets for Twitter users, since the standard API provides no easy access to anything beyond a user's latest tweets. The code first requests the latest batch of tweets, determines the lowest ID in it, and then requests the remaining tweets sequentially, from highest ID to lowest. Twitter's API doesn't always return data in order, but the date and timestamps included with the results make it possible to sort properly after retrieval.
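The paging loop described above can be sketched as follows. This is a minimal illustration, not the script's actual code: fetch_page and @fake_timeline are hypothetical stand-ins for the real user_timeline request, and the sketch assumes tweet IDs fit in Perl's native integers.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical in-memory timeline, newest first, standing in for Twitter.
my @fake_timeline = map { +{ id => $_, text => "tweet $_" } } reverse 1 .. 10;

# Hypothetical stand-in for one API request: returns up to $count statuses
# whose id is <= $max_id (or the newest ones when $max_id is undefined).
sub fetch_page {
    my ($count, $max_id) = @_;
    my @eligible = grep { !defined $max_id || $_->{id} <= $max_id } @fake_timeline;
    my $last = $#eligible < $count - 1 ? $#eligible : $count - 1;
    return @eligible[0 .. $last];
}

# Request the latest batch, find its lowest ID, then keep asking for
# everything below that ID until a request comes back empty.
sub backfill {
    my ($count) = @_;
    my (@all, $max_id);
    while (1) {
        my @page = fetch_page($count, $max_id);
        last unless @page;
        push @all, @page;
        $max_id = min(map { $_->{id} } @page) - 1;
    }
    return @all;
}

my @statuses = backfill(4);
print scalar(@statuses), " statuses retrieved\n";
```

Subtracting 1 from the lowest ID keeps the next request from returning the boundary tweet a second time.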

For anyone needing output other than the default (tweet text to STDOUT), the code in the custom_action subroutine can be modified to select the required fields and write them in any format to any location, database or storage device supported by Perl. For a list of keys accessible through $status->{key}, see the Twitter API page: user_timeline
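As an illustration of such a modification, the sketch below prints three fields as one tab-separated line per tweet. It assumes custom_action receives a single decoded status as a hash reference, which the original script may or may not do in exactly this way. The sample status passed at the end is made up for the example.

```perl
use strict;
use warnings;
use feature 'say';

# Assumed signature: one decoded status (a hash reference) per call.
sub custom_action {
    my ($status) = @_;

    # Collapse tabs and line breaks so each tweet stays on one output line.
    (my $text = $status->{text} // '') =~ s/[\t\r\n]+/ /g;

    my $line = join "\t", $status->{id} // '', $status->{created_at} // '', $text;
    say $line;
    return $line;
}

# Made-up status for demonstration only.
custom_action({
    id         => 1234,
    created_at => 'Wed Aug 27 13:08:45 +0000 2008',
    text       => "Example tweet\nwith a line break",
});
```

Replacing the say call with a database insert or a file write would follow the same pattern.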

Complies with API use recommendations
When a request fails, the code waits an increasing number of seconds before retrying. If the error is unrecoverable (any code other than 502), it stops retrying immediately.
Minimal API requests
The default is 125 results per request. This can be raised closer to 200, at the cost of potential retries when Twitter's API responds slowly. Line 29: my $posts_per_request = 125;
User-chosen output/post-retrieval steps
The code places no restriction on how retrieved statuses are processed. Users can override the default routine (Line 108: sub custom_action {) to output data in whatever format or storage method is required. Default: tweet content only, to STDOUT.
Exposes all data Twitter's API provides
Any data returned by the API can be accessed in post-retrieval steps through $status->{field}, e.g. Line 113: say $status->{text}. See the Twitter API docs for further field descriptions.
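The retry behaviour listed under "Complies with API use recommendations" might look roughly like the sketch below. The names with_retries, max_tries and pause are hypothetical, and the linear wait (1 second, then 2, and so on) is an assumption, since the text only says the wait increases. The request is simulated so that the example runs without network access.

```perl
use strict;
use warnings;

# $request is a code ref returning (http_code, payload); hypothetical interface.
sub with_retries {
    my ($request, %opt) = @_;
    my $max_tries = $opt{max_tries} // 5;
    my $pause     = $opt{pause}     // sub { sleep $_[0] };

    for my $attempt (1 .. $max_tries) {
        my ($code, $payload) = $request->();
        return $payload if $code == 200;

        # Anything other than 502 is treated as unrecoverable: stop at once.
        die "Unrecoverable HTTP $code\n" if $code != 502;

        # Wait longer after each failed attempt (assumed linear growth).
        $pause->($attempt) if $attempt < $max_tries;
    }
    die "Giving up after $max_tries attempts\n";
}

# Simulated API: two 502 responses, then success. The pause is recorded
# instead of slept so the example finishes instantly.
my @codes = (502, 502, 200);
my @waits;
my $result = with_retries(
    sub { my $c = shift @codes; return ($c, $c == 200 ? 'ok' : undef) },
    pause => sub { push @waits, $_[0] },
);
print "result=$result waits=@waits\n";
```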