Note: This guide is intended for the 9/20/2012 ONA practical programming class and is a work in progress. Feel free to add comments or visit the GitHub repo. Follow me at @dancow for updates.
Table of Contents
- The problem: Programming is too hard and filled with too many technical details.
  The solution: First learn why programming relates to your goals and ideas. Then learn programming.
- The problem: How to gather and analyze U.S. Congressmembers' tweets.
  The solution: Use programming to avoid tedious, mind-numbing data entry.
- The problem: What do we use to write code?
  The solution: We’ll learn how to access our systems’ command line and install one of the free and excellent text editors ideal for writing code.
- The problem: How do we install Ruby and run Ruby programs?
  The solution: With Google, StackOverflow, and a little patience.
- The problem: How do we even put URLs into our programs?
  The solution: Use the String class to represent text characters.
- The problem: The URLs all share common patterns. How do we combine these together?
  The solution: Strings can be added to other strings. But not all types of data can be combined together.
- The problem: How do we tell numbers and strings apart?
  The solution: Use the dot operator to access methods and attributes of data objects.
- The problem: Adding strings together with plus signs is annoying and hard to read.
  The solution: Use the File.join method, which assembles a filename string from multiple strings.
- The problem: A lot of the text we’re using is repeated over and over.
  The solution: Use variables to store data for later reference in our programs.
- The problem: How do we actually download files off the Internet?
  The solution: The HTTParty gem provides methods to easily download from URLs.
- The problem: How do we deal with JSON data files?
  The solution: Use Ruby’s JSON library.
- The problem: How does Ruby represent the data found for a Twitter account?
  The solution: The Hash is a data object that lets us access the attributes contained in the Twitter data.
- The problem: How do we work with the JSON for a page of tweets?
  The solution: Because a page of tweets can contain many tweets, they are stored as arrays.
- The problem: How to read from hundreds of tweets without writing hundreds of commands.
  The solution: Use a loop to repeat commands as many times as we need.
- The problem: We want to get tweets from more than one Congressmember.
  The solution: Use another loop to read from a list of Congressmembers.
- The problem: We want to find the Congressmember with the most followers.
  The solution: Use the if statement to add logic to your program.
- The problem: All these simple steps are adding up to many lines of intimidating code.
  The solution: Wrap up routines into a method call.
- The problem: We’re writing too much boilerplate code over and over.
  The solution: Move the code into a separate text file and begin using our text-editor.
- The problem: We’re having trouble understanding the code we’re reusing.
  The solution: Wrap these routines in easy-to-read method calls.
- The problem: How do we loop through all the tweet pages on the local drive?
  The solution: Change our get_data_file method to check for file existence.
- The problem: The data source just up and changed its location!
  The solution: Thanks to how we abstracted and organized our code, we need to modify just a few details to adapt.
- The problem: How to read data from a spreadsheet-like file, such as CSV?
  The solution: Use the FasterCSV library and Hash objects to simplify reading the datafile.
- The problem: How can we easily collect just one attribute from a collection of Twitter account info or tweets?
  The solution: Use collection methods to transform collections to a particular value.
- The problem: How do I filter for just the Congressmembers of a given state? Or above a certain age?
  The solution: Use a collection’s select method.
- The problem: My coworkers want to work with the data but none of them can read JSON.
  The solution: Just loop and transform the data objects into anything you want, including strings.
- The problem: How does Twitter activity correlate with age and political prominence?
  The solution: Use methods and logic to filter on any attribute you’re interested in.
- The problem: Who uses proper English when tweeting?
  The solution: Use simple text-pattern matching to filter tweets by content.
- The problem: The short-links used by Twitter obscure the actual website URLs.
  The solution: Let’s build our own dataset of short-links and their destinations.
- The problem: How do we publish our findings for the Web?
  The solution: Creating data-backed webpages is just more loops and methods.
- The problem: How do we make interactive webpages?
  The solution: First, learn about HTML and Javascript. Then, break down your data-processing tasks like before.
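To make the outline above a little more concrete, here is a minimal sketch of how a few of the listed solutions (File.join, variables, the JSON library, Hash and Array objects, a loop with an if statement, and select) might fit together in Ruby. The base URL, screen names, and follower counts are all made up for illustration, and nothing is downloaded; treat it as a rough preview under those assumptions rather than the code the chapters actually build.

```ruby
require 'json'

# Hypothetical base URL and screen names, invented for this sketch
base_url = "http://example.com/tweets"
screen_names = ["rep_alpha", "sen_beta"]

# File.join assembles one string from several pieces, adding the slashes for us
urls = screen_names.map { |name| File.join(base_url, "#{name}.json") }

# Invented JSON text standing in for downloaded account data
raw = '[{"screen_name":"rep_alpha","followers_count":1200},
        {"screen_name":"sen_beta","followers_count":45000}]'

# JSON.parse turns the text into an Array of Hash objects
accounts = JSON.parse(raw)

# A loop plus an if statement finds the account with the most followers
top = nil
accounts.each do |account|
  if top.nil? || account["followers_count"] > top["followers_count"]
    top = account
  end
end

# select filters a collection; map collects one attribute from each item
popular = accounts.select { |a| a["followers_count"] > 10_000 }
popular_names = popular.map { |a| a["screen_name"] }

puts urls
puts "Most followed: #{top['screen_name']}"
puts "Over 10,000 followers: #{popular_names.join(', ')}"
```

Swapping the invented JSON text for data fetched from a real URL is the part the HTTParty entry in the list refers to.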