Lesson content is currently in draft form.
When we were retrieving data online, I carefully avoided the task of fetching more than one page of tweets per user.
This is because, for the sake of brevity, I didn’t want to explain the many different ways a web server could respond to a request for an invalid file.
But now that the files are all on our hard drive, this is an easier problem to deal with in the abstract.
This chapter will be a refresher on the if statement.
File.exists?
The File class (again, part of Ruby’s core) has a handy exists? method (spelled File.exist? in Ruby 3.2 and later, where the exists? alias was removed). It returns true or false depending on whether a file actually exists at a given filename:
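As a minimal sketch of that check, the snippet below creates a throwaway file in a temporary directory and then tests one path that exists and one that does not. The method name demo_file_check and the file names are made up for illustration; they are not part of the earlier chapters.

```ruby
require 'tmpdir'

# Returns a two-element array: the result of checking a file we just
# created, and the result of checking a file that was never created.
# Both file names are hypothetical.
def demo_file_check
  Dir.mktmpdir do |dir|
    real_path    = File.join(dir, "tweets-page-1.json")
    missing_path = File.join(dir, "tweets-page-99.json")
    File.write(real_path, "[]")

    # File.exist? is the spelling that works on every Ruby version;
    # the File.exists? alias was removed in Ruby 3.2.
    [File.exist?(real_path), File.exist?(missing_path)]
  end
end

p demo_file_check   # prints [true, false]
```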
Note: You may be wondering, how does a method name have a question mark? I thought only alphanumeric characters could be used? Yes, generally, but this is a Ruby-specific convention. You have the option of ending a method name with ? to indicate to users that the method returns true or false, though it’s just a convention: you could be a jerk and return whatever you want. Also, the ! is allowed in method names. I won’t explain that right now.
How to loop through numbers?
The pattern for tweet pages is:
So we just need to change the argument corresponding to the page number. So how do we loop from 1 to 5? You could think of it as looping through an Array of numbers:
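Here is a rough sketch of the Array approach. The method name page_messages and the message text are invented for this example; the point is only that each visits every number typed into the array.

```ruby
# Loop through a hand-typed Array of page numbers and build one
# message per page. The message text is only for illustration.
def page_messages(page_numbers)
  messages = []
  page_numbers.each do |page_number|
    messages << "Getting page #{page_number}"
  end
  messages
end

puts page_messages([1, 2, 3, 4, 5])
```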
Ugh, but who wants to enter each number into the array? Luckily, Ruby has another collection object called Range. Here’s one way it can be used:
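A small sketch of the same loop written with a Range. The helper name pages_in_range is hypothetical; the (first..last) syntax is standard Ruby and includes both endpoints.

```ruby
# A Range written as (first..last) includes both endpoints,
# so (1..5) walks through 1, 2, 3, 4 and 5.
def pages_in_range(first_page, last_page)
  pages = []
  (first_page..last_page).each do |page_number|
    pages << page_number
  end
  pages
end

p pages_in_range(1, 5)   # prints [1, 2, 3, 4, 5]
```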
The else
So it’s easy to set a Range from 1 to whatever number we need. But what if there are fewer tweet pages than the endpoint of our range?
Sounds like a use case for File.exists?
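One way this could look, as a sketch only: the method below walks a Range of page numbers and reports every page whose file is missing. The "tweets-page-N.json" naming scheme and the method name missing_pages are assumptions made for this example, not the names used in the earlier chapters.

```ruby
require 'tmpdir'

# Returns the page numbers in 1..last_page that have no file in dir.
# The "tweets-page-N.json" naming scheme is hypothetical.
def missing_pages(dir, last_page)
  missing = []
  (1..last_page).each do |page_number|
    fname = File.join(dir, "tweets-page-#{page_number}.json")
    # File.exist? (not exists?) runs on every Ruby version.
    if !File.exist?(fname)
      puts "#{fname} does not exist"
      missing << page_number
    end
  end
  missing
end

# Try it against a temporary directory that holds only pages 1 and 2.
Dir.mktmpdir do |dir|
  [1, 2].each { |n| File.write(File.join(dir, "tweets-page-#{n}.json"), "[]") }
  p missing_pages(dir, 5)   # prints [3, 4, 5] after the three messages
end
```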
FYI, that exclamation mark ! serves to negate the expression. So in the above case, if File.exists? returns false (because the file doesn’t exist), the ! makes it true, causing the if branch to execute.
That’s admittedly confusing. So instead, we can use else, which allows us to define what happens when the if condition fails:
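A sketch of the if/else version, under the same made-up file naming scheme ("tweets-page-N.json"). The method name page_statuses is also hypothetical. Notice that the condition no longer needs the ! because the "file is missing" case lives in the else branch.

```ruby
require 'tmpdir'

# Returns one status message per page number in 1..last_page.
# The file naming scheme is hypothetical.
def page_statuses(dir, last_page)
  statuses = []
  (1..last_page).each do |page_number|
    fname = File.join(dir, "tweets-page-#{page_number}.json")
    if File.exist?(fname)
      statuses << "Page #{page_number} exists"
    else
      # Runs only when the if condition is false
      statuses << "Page #{page_number} does not exist"
    end
  end
  statuses
end

# Only page 1 exists in this temporary directory.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "tweets-page-1.json"), "[]")
  puts page_statuses(dir, 3)
end
```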
Break a loop
It seems wasteful to continue a loop if we know the condition is going to be false after the first non-existent file is found.
The break keyword allows us to break out of the current loop:
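As a sketch, here is the same loop with a break in the else branch, so it stops at the first missing page instead of checking the rest of the Range. The method name pages_until_missing and the file naming scheme are assumptions for illustration.

```ruby
require 'tmpdir'

# Collects page numbers starting at 1 and stops at the first missing file.
# The "tweets-page-N.json" naming scheme is hypothetical.
def pages_until_missing(dir, last_page)
  found = []
  (1..last_page).each do |page_number|
    fname = File.join(dir, "tweets-page-#{page_number}.json")
    if File.exist?(fname)
      found << page_number
    else
      puts "Page #{page_number} is missing, so we stop looking"
      break   # leave the each loop right away
    end
  end
  found
end

# Pages 1, 2 and 4 exist; page 3 is the first gap.
Dir.mktmpdir do |dir|
  [1, 2, 4].each { |n| File.write(File.join(dir, "tweets-page-#{n}.json"), "[]") }
  p pages_until_missing(dir, 5)   # prints [1, 2]; page 4 is never checked
end
```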
A more careful data-getting method
We could test for file existence in a lot of places in a program. One place where it seems logical is in our get_data_file method, which currently will crash the program if an invalid file name is passed in.
Try to modify it so that it returns false if the given filename does not exist:
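The sketch below shows roughly what that modification might look like. It is a stand-in, not the method from the earlier chapter: it assumes get_data_file takes a full file path and simply returns the file’s contents as a string, and the real version may build the path or process the contents differently.

```ruby
require 'tmpdir'

# Hypothetical stand-in for get_data_file. Assumption: the method takes a
# path and returns the file's contents; the real method may do more.
def get_data_file(fname)
  if File.exist?(fname)
    File.read(fname)
  else
    # Signal "no such file" to the caller instead of crashing
    false
  end
end

Dir.mktmpdir do |dir|
  fname = File.join(dir, "tweets-page-1.json")   # made-up file name
  File.write(fname, "[]")

  p get_data_file(fname)                          # prints "[]"
  p get_data_file(File.join(dir, "nope.json"))    # prints false
end
```

Any caller now has to check for false before using the result, which is exactly the complication discussed next.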
This is actually problematic because get_twitter_user and get_tweets_page depend on get_data_file, so they would also need to be modified to deal with a returned false value.
Sound like a dilemma? Welcome to the joy of professional programming.
For now, I vote to not modify get_data_file to have this logic. This will affect the answer to the exercise below.
Exercise
We now know enough to collect all the tweet pages (on our hard drive) for a given user. Using the existing get_tweets_page method, write a new method called get_all_user_tweets.
- It should take in only one argument.
- It should not loop wastefully once the last page has been reached.
- It should return an array containing all the tweets collected across the pages.
FYI, this is how you add two arrays together:
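A quick sketch of array addition, using made-up strings in place of real tweets. The + operator returns a new array holding the elements of both, and += adds onto an existing variable. The method name add_arrays_demo is only there to wrap the example.

```ruby
# Demonstrates adding arrays with + and +=.
# The "tweets" here are placeholder strings, not real data.
def add_arrays_demo
  page_one = ["tweet a", "tweet b"]
  page_two = ["tweet c"]

  # + builds a new array; page_one and page_two are left unchanged
  all_tweets = page_one + page_two

  # += is shorthand for all_tweets = all_tweets + [...]
  all_tweets += ["tweet d"]
  all_tweets
end

p add_arrays_demo   # prints ["tweet a", "tweet b", "tweet c", "tweet d"]
```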