Lesson content is currently in draft form.
In the last chapter, we learned how to download a file using HTTParty and save its contents to a variable.
But when you examine the results (you can visit the show.json file in your browser), you’ll find a strange arrangement of text.
At second glance, it’s not too hard to figure out what’s going on here. This JSON file contains a listing of attributes for Rep. Nancy Pelosi’s Twitter account. For example:
"followers_count": 226259
We can guess that this line indicates that Rep. Pelosi has 226,259 followers.
What is JSON?
JSON stands for “JavaScript Object Notation”. It is a lightweight data format and a common way for web services to pass data around.
The actual details of JSON’s structure aren’t important to learn right now. We just want to know: how can we turn this text file into usable data?
If you’re following the steps from the beginning of this chapter, you have a downloaded_file variable that contains the downloaded JSON. What exactly is contained in downloaded_file?
(Remember that downloaded_file is actually an object with a class of HTTParty::Response, so use its body method.)
Data is organized text
As you get more experienced in collecting digital data, one fundamental concept to understand is that data is all just text (actually 1s and 0s, but no need to go that deep).
The JSON file we just downloaded is just a large string.
Getting useful data out of this text – whether it’s comma-delimited files, Excel spreadsheets, JSON, or SQL – is just a matter of finding a pattern in the file and breaking it apart.
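To make "finding a pattern and breaking it apart" concrete, here is a minimal sketch for the simplest case, a line of comma-delimited text. The line itself is made up for illustration (it reuses the follower count shown earlier), and the variable names are assumptions, not part of the lesson's code.

```ruby
# A made-up line of comma-delimited text, for illustration only
line = "Nancy Pelosi,226259"

# The pattern here: fields are separated by commas, so split on the comma
name, followers = line.split(",")

puts name                # prints Nancy Pelosi
puts followers.to_i + 1  # prints 226260 (to_i turns the text "226259" into a number)
```

JSON follows a more involved pattern than this, which is why it helps to lean on a library instead of splitting the text by hand.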
We’ll figure out this pattern later. For now, let’s see if someone has already written the code to interpret JSON files for us.
The json library
Ruby’s standard library contains the JSON module, which has exactly what we need. Include it in your code environment as you would a gem, with require 'json'.
Given a string of text in JSON format, the JSON.parse method will convert it into one of two Ruby collection classes: a Hash or an Array.
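As a small sketch of what that looks like, the snippet below parses two short JSON strings written by hand (they are illustrative stand-ins, not the downloaded file) and checks the class of each result.

```ruby
require 'json'

# Two hand-written JSON strings, for illustration only
object_text = '{"followers_count": 226259}'
array_text  = '[1, 2, 3]'

parsed_object = JSON.parse(object_text)
parsed_array  = JSON.parse(array_text)

puts parsed_object.class               # prints Hash
puts parsed_array.class                # prints Array
puts parsed_object["followers_count"]  # prints 226259
```

Text wrapped in curly braces comes back as a Hash, and text wrapped in square brackets comes back as an Array.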
The next two chapters will cover the basic parts of handling hashes and arrays.
Exercise
Write a routine that downloads a JSON file at a given URL and parses it with the JSON module. Check the class of what’s returned by the parsing.
Answer
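One possible way to write such a routine is sketched below. The names download_and_parse and HTTPARTY_DOWNLOADER are hypothetical, and so is the example URL. The downloader is passed in as an argument (an assumption made here, not something the lesson requires) so that the routine can be tried out with a stand-in that needs no network connection.

```ruby
require 'json'

# Default downloader: HTTParty.get returns an HTTParty::Response,
# and its body method returns the raw text as a String.
# The gem is required inside the lambda, so it is only needed for real downloads.
HTTPARTY_DOWNLOADER = lambda do |url|
  require 'httparty'
  HTTParty.get(url).body
end

# Hypothetical helper: download the text at url and parse it as JSON.
def download_and_parse(url, downloader = HTTPARTY_DOWNLOADER)
  json_text = downloader.call(url)  # a String of JSON text
  JSON.parse(json_text)             # a Hash or an Array
end

# Offline check with a stand-in downloader and a made-up URL
fake_downloader = ->(_url) { '{"followers_count": 226259}' }
result = download_and_parse("http://example.com/show.json", fake_downloader)
puts result.class  # prints Hash
```

With the HTTParty gem installed, calling download_and_parse with only a real URL would use the default downloader instead of the stand-in.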
We’re beginning to stack complex methods atop complex methods now, so it’s worth reviewing what happens at each step:
- We start with a url, which is just a String
- Passing url into HTTParty.get gets us an HTTParty::Response object
- The body method of the HTTParty::Response object returns a String
- The JSON.parse method accepts a String, which it converts to a Ruby object (either a Hash or an Array)
Note that JSON.parse is expecting a specially formatted string. Try passing in a normal string to that method, such as "Hello world".
Again, in this tutorial, our world is safe and narrow. You can expect all data files to be what they say they are: in this case, properly formatted JSON. In the real world, this won’t always be the case, and the JSON parser may choke on what is ostensibly a JSON file.
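When the parser does choke, JSON.parse raises a JSON::ParserError. The sketch below shows one way to guard against that; the helper name parse_json_or_nil is hypothetical, and returning nil on failure is just one possible design choice.

```ruby
require 'json'

# Hypothetical helper: return the parsed data, or nil if the text is not valid JSON.
def parse_json_or_nil(text)
  JSON.parse(text)
rescue JSON::ParserError => e
  puts "Could not parse JSON: #{e.class}"
  nil
end

good = parse_json_or_nil('{"followers_count": 226259}')
bad  = parse_json_or_nil("Hello world")

puts good.class   # prints Hash
puts bad.inspect  # prints nil
```

Rescuing only JSON::ParserError, rather than every possible error, keeps other problems (such as a failed download) visible instead of silently hiding them.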