Transcript
[0:00] Music.
[0:08] Well, it's that time of the week again. It's time for Chit Chat Across the Pond.
This is episode number 779 for November 25th, 2023, and I'm your host, Allison Sheridan.
This week, our guest is Bart Busschots, back with another installment of Programming By Stealth. I believe we're on number 156. How are you doing today, Bart?
I am doing good. It feels like it was only last week we were talking. Oh, wait, it was.
So we take three months off and then we're back twice in a row.
Well, what are you going to do?
Yeah, we're the bosses of podcasting. You wait for one for ages and then two come along at once. But anyway, hopefully they're two fun ones.
[0:41] All right, we're going to continue with jq now. Yeah, so last time I gave you a taste, and I showed you some examples that I explicitly said I wouldn't explain yet, because they'd make your head explode right now, to show you that it can be used to pretty print things.
And we went into that in detail, in fairness.
And then I said we can also use it to pull pieces of information out of things.
And then I said its final magic power will be to actually transform and build derivative information from the JSON we feed into it.
Now, that's where we're getting to. We're not going to get there all today, but we are going to make some substantial progress on the pulling information out piece.
[1:22] In particular, we're going to extract specific pieces.
The next thing we're going to do after today will be querying, which is a different thing, because then you're not looking for this exact thing, you're looking for something like this, which is a whole different kettle of fish. But anyway, we need to build up.
So building up we shall do.
And I think we'll roll back to the very start of how the JQ command works.
Just a little refresher here.
So I showed you last time that one of the very popular uses is to take the input from standard in as a way of getting JSON information into JQ.
And a very common place in the world that JSON data comes from is the web.
So we use the curl command to pipe our json into jq.
And then I also said that the first argument to jq is the filter, and any arguments after the first arguments are input files.
So you could have arbitrarily many, you know, a second argument, third argument, fourth argument, fifth argument.
But obviously, if you don't have a filter...
Then you can't have a second argument. So, something I didn't say last time, but I'm going to say very explicitly now, is that in jq, the period symbol, on its own, represents the entire input.
[2:41] So it's like your placeholder for the thing I've been asked to process.
So if you need to ask jq to pretty print a file, it's jq space period space name of file, which is the jq command, the filter, show me everything, followed by the file to get the JSON from.
Does that make sense? Yeah, yeah it does.
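A minimal sketch of that command shape, assuming a throwaway file (the name sample.json and its contents are made up for illustration and are not one of the show's example files):

```shell
# Create a small, hypothetical JSON file to work with.
tmpdir=$(mktemp -d)
printf '{"name":"demo","version":"1.0.0"}' > "$tmpdir/sample.json"

# jq, then the filter (a lone period means the entire input), then the input file.
pretty=$(jq '.' "$tmpdir/sample.json")
echo "$pretty"

# Clean up the temporary folder.
rm -r "$tmpdir"
```

The compact one-line input comes back pretty printed, one key per line.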
So the period means everything, but the period is also our anchor for descending into the file.
So period is like the root directory on your file system.
So when you say it on its own, you mean the entire file system.
Well, when you say period on its own in jq, you mean the entire JSON object, or piece of JSON.
So if you want to get, if the piece of JSON you have is a dictionary, to get into a specific key in that dictionary, you just use the name of the key.
So we will be using the same example files from last time, one of which is a package.json file from the this-ti.me open source web page thing that I wrote. And so it has a property named name.
So we can pull that property out by saying jq and then the filter .name and then the filename this-ti.me-package.json.
[4:04] And that will reach into the dictionary at the top level of that file, and then go to the name property and give you its value.
[4:12] So it's jq single quote dot name. How do we know to put it in single quotes?
So remember I said last time that jq filters use a lot of special characters, and I said just always put it in single quotes because otherwise sooner or later it will go terribly horribly wrong.
So I am doing as I said.
OK, if we were just doing JQ dot, we wouldn't put it inside single quotes, though, correct? I would.
Oh, you would? Yeah, I'm an absolutist. When I say get into the habit of always quoting your JQ filters because sooner or later it'll bite you on the backside, I just do it.
I know you don't have to. I know you don't. No, but I like absolutist ways of doing it.
When it's a question, then I sit there going, I don't know, single quotes, double quotes, no quotes, I don't know. Yeah, it saves me thinking, and that's easier.
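As a rough illustration of extracting a top-level key, here is a sketch that pipes in a stand-in for a package.json file. The JSON values are invented; only the .name filter and the single-quoting habit come from the discussion above.

```shell
# Hypothetical stand-in for a package.json file; the values are invented.
json='{"name":"demo-package","version":"1.0.0"}'

# The filter .name descends into the top-level dictionary and returns that key's value.
# The filter is single-quoted so the shell never interprets any of its characters.
name=$(echo "$json" | jq '.name')
echo "$name"   # a JSON string, so it is printed with its double quotes
```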
[5:01] So it's probably no surprise that if you use the name of a key to descend into the top-level dictionary, if the key you descend into is also a dictionary, you just keep putting more names, separated by periods.
So if you look inside that file, it has a key named bugs, which is itself a dictionary, and one of the keys inside that dictionary is URL.
So to descend into the top-level dictionary is .bugs, and then .url will take us one deeper.
And then we are now all the way into the second-level value.
So jq, our single quote, .bugs, .url, close single quote, and then the name of the file. Yeah, that makes sense. Yeah.
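A small sketch of descending two levels, again with made-up data (the URL is a placeholder, not the real project's bug tracker):

```shell
# Hypothetical JSON with a nested dictionary under the key "bugs".
json='{"name":"demo-package","bugs":{"url":"https://example.com/issues"}}'

# .bugs enters the nested dictionary, .url picks the value one level deeper.
bug_url=$(echo "$json" | jq '.bugs.url')
echo "$bug_url"
```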
[5:43] So this syntax is very simple, and it's also very JavaScript-y.
So it's kind of nice for us. But what if the name of the key,
the key in the key-value pair, has special characters inside it?
Well then, we use JSON syntax for a string, and so we double-quote the key with the funny characters.
So inside a package.json file, you will have a top-level key named dependencies, which is itself a dictionary.
And there's an entry in there for every package your package needs, and it's very, very, very common to put minus signs, or dashes, as a separator in package names.
So the minus symbol also means subtraction. We haven't looked at arithmetic yet, but jq can do arithmetic, so the minus symbol...
Exactly. So in order to get at is-it-check, which is the name of one of the packages that this-ti.me relies on, we must double-quote that name when we try to use it as a key.
So jq and then the filter .dependencies."is-it-check", that is, dot dependencies, dot,
[6:54] open double quote, is-it-check, close double quote. And all of what you just said was inside single quotes? Yeah.
And similarly, a lot of things also use the at symbol and the forward slash.
You're just on a hiding to nowhere there with all those special characters, and a dash, now that I look at it.
So again, you just stick it inside your double quotes and you're away.
So, .dependencies, dot, open your double quote, @fortawesome/fontawesome-free, close your double quote, will
[7:24] allow you to get the @fortawesome/fontawesome-free key out of the dependencies key in the top-level dictionary.
Okay, that was a lot of words that are probably hard to listen to in just audio, but when you see it written in Bart's show notes, it's pretty clear.
How do you know when to do that? Do you do it without those double quotes first and then go, whoa, that's wrong, and then you put the quotes on?
Is that the robust process?
My brain generally says that if there's anything in there other than letters, numbers, or an underscore, I'm just going to double quote it because it'll always work. You could double-quote anything. You know, you can double-quote URL and it'll be absolutely fine.
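To make the quoting rule concrete, here is a sketch with a hypothetical dependencies dictionary. The package names are real npm names, but the version strings are invented for the example.

```shell
# Hypothetical dependencies dictionary; the version numbers are made up.
json='{"dependencies":{"is-it-check":"^1.0.0","@fortawesome/fontawesome-free":"^6.0.0"}}'

# Keys containing dashes, @ or / are written as double-quoted JSON strings,
# all inside the single quotes that protect the filter from the shell.
check_version=$(echo "$json" | jq '.dependencies."is-it-check"')
icons_version=$(echo "$json" | jq '.dependencies."@fortawesome/fontawesome-free"')

echo "$check_version"
echo "$icons_version"
```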
But if we're querying for something, we don't know what it's going to look like.
We don't know it's going to have minuses in it.
Okay, but that's what I'm saying. This week, we are not querying.
This week, we're extracting. So I very carefully named the show Extracting Data.
What is the difference between extracting and querying, Bart?
[8:22] Extracting is pulling something you know out. Oh, okay, okay.
And in this case, we know it's is-it-check that we're going to be looking for, to find out what the value is that goes with that key. We know that the thing we're extracting has the dashes in it.
That's why we know to put it in double quotes.
Exactly. So querying slash searching is for next time.
And it's really important, right? Because, well, you'd be surprised how often you're extracting.
Because if you're trying to make use of a config file inside your script, so imagine that you're writing a script to interact with something and you need to pull some information out of a config file.
Well, you know the information you're trying to pull, right?
You're trying to pull the username and the password out of the database object or something.
So, extracting is used a lot, right?
If you get the Nobel Prizes from a data file, we know that the top-level thing has a key called prizes.
So we know that that's where the prizes are. So a lot of times you're extracting.
Okay, that makes sense. And then there are times when you're querying.
Yeah, okay. So as I say, querying involves some extra concepts that I'm not going to dump on us today because these show notes are not that short today and I still have a few concepts to pour into your brain.
[9:48] So dictionaries, that kind of takes care of dictionaries. You just use the key value pair names, basically. And if you need to quote them, you quote them, and you separate it with period symbols.
So that's not too bad.
Happily, arrays are also very JavaScript-like.
[10:03] So we're going to switch to our NobelPrizes.json file now for our examples, because it contains a top-level key named prizes, which is an array.
So if we want to experiment with pulling things out of an array, hey presto, we have ourselves an array there.
So to get at a specific element in an array, it's exactly the JavaScript syntax.
Open square bracket, the numeric index of the value you want, close square brackets.
Being a programming language, very sanely, it counts from zero.
So it's just like you're used to in JavaScript.
You know, name of array, open square bracket, zero, close square bracket to get the first element, open square bracket 1 to get the second element, and so on and so forth.
So the file contains the Nobel Prizes in reverse chronological order.
So the most recently given-out prize is at the filter, so dot for the root of our piece of JSON,
[10:58] prizes, so we descend down into the dictionary element which is prizes, and that is an array. Dictionary?
Dictionary or array? Okay, so if you open the file, at its very, very top level, it is a dictionary which contains the key prizes, and prizes is an array.
Okay, so the entire file is a dictionary with prizes as its key?
Yes. In fact, it's a dictionary with one key.
Yeah, right. That's what I said, or meant. And then inside that is the array.
Yeah, so the value of prizes is an array.
So prizes is an array.
That's sort of how we'd say it. Got you. Okay.
And so prizes, open square bracket, zero, close square bracket, is the first element of that array.
Prizes one is the second element, and so on and so forth. And that's nice.
JQ has a really nice extra feature I love. You can also go from the back.
[11:54] So minus one is the last element of the array.
Minus 2 is the second last element.
So if 0 gives us the most recent Nobel Prize, then .prizes open square bracket minus 1 close square bracket will give us the oldest Nobel Prize.
Oh, that's kind of fun. So you can see it was in 1901. That's pretty recent.
[12:17] Well, I mean, Nobel was working fairly close, like Nobel was, wasn't that, you know, dynamite wasn't that old at World War I.
And that's where it started?
Yeah. So Nobel was trying to save lives by inventing an explosive that unlike black powder didn't kill more miners and tunnel builders than, you know, like when they were building the American transcontinental, it was horrific trying to get through the Rockies because the explosives were killing more workers than they were destroying rock.
And so Nobel had this amazing idea that I should make an explosive you can carry on a horse and cart and it won't explode.
And the governments of the world went, that's fantastic, I know what we'll do with that: we'll blow up everyone else.
And it became the most deadly weapon in World War One. And Nobel was horrified by it. He was horrified before World War One.
And the fact that his idea for saving lives turned into one of the most effective murder weapons of his generation.
Out of guilt, he started the Nobel Prize. I did not know any of that.
He made a fortune out of dynamite.
[13:23] Fortune. And then he poured it into the prizes. So that was his that was his Carnegie Hall. You know.
Anyway, sorry. Yeah. Nobels are fun. The reason there isn't an official Nobel Prize for maths is because he didn't think it was valuable, which is why we have a few of them.
Anyway, his prejudices live on in strange ways. Right. So we can go into the front of the array or the back of the array. Great.
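A sketch of indexing from the front and the back of an array. The three-entry prizes array here is a tiny invented stand-in for the real Nobel Prizes file, and chaining .year after the index simply combines the array syntax with the key syntax from earlier.

```shell
# Hypothetical miniature of the prizes file: newest first, oldest last.
json='{"prizes":[{"year":"2023"},{"year":"2022"},{"year":"1901"}]}'

# Index 0 is the first element, 1 the second, and so on.
newest=$(echo "$json" | jq '.prizes[0].year')
# Negative indexes count from the back: -1 is the last element, -2 the second last.
oldest=$(echo "$json" | jq '.prizes[-1].year')
second_last=$(echo "$json" | jq '.prizes[-2].year')

echo "$newest $second_last $oldest"
```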
We can also take a slice of an array.
So JavaScript is not unusual in having a function .slice, which allows you to take a piece of an array.
And in JQ, you can do the same. To get a slice of an array, the magic character is the colon, and then you put an index before and after the colon, although as we'll see momentarily, those indexes are optional if you want to have a shorthand for start of and end of.
[14:20] But basically the colon symbol is the key here. So we're still working inside square brackets, but instead of just giving a single number, we're going to have one or more numbers with a colon between them.
And that tells JQ that what we want is a piece of this array.
We don't want you to take out a single value. We want you to give us a new array that contains a subset of where we started.
[14:46] And to show this in action, what I'm going to do is echo an array into JQ, and that array is going to be the numbers zero, one, two, three, four, five, so that the content of the array matches the index.
So the zeroth element is the number zero, the first element is the number one, the second element is the number two, because that makes it easy to see what the heck this syntax is doing.
Does that make sense? I understand conceptually, you want to tell them what you're going to type to do that?
Yeah, so the echo command sends a string to standard out. So I am echoing to standard out the JSON for an array of those six numbers.
So echo inside quotation marks, square bracket 0, 1, 2, 3, 4, 5, close our square bracket, close our string, and then we pipe the output of the echo command as the input to jq, and then we can use our filters to mess around with it.
So before we start using our filters, let's just talk about how jq defines those two numbers on either side of that colon.
[15:52] So the first number is the array index where you want to start.
So if you want to start at the first item in the array, you can start with a 0.
If you want to start at the second item, you start with a 1.
If you want to start at the third item, you start with a 2. So far so straightforward.
The second number is not the index of the last item you want to include.
[16:17] It's the index of the first item you don't want.
And I don't understand this at all. It made my head hurt so much.
Because initially I thought maybe it was a length and that would be sane.
But I'm going to prove to you it's not a length. So we have our array 0, 1, 2, 3, 4, 5.
And if we pipe that to jq with the filter period, open square bracket, 0, colon, 3, close square bracket, we get back a new array which contains 0, 1, and 2.
So the first index is 0, great. The second index is 1, we still got it. The third index is 2.
And then 3 is 1 higher than 2, so it doesn't bother with the fourth thing, which is at index 3.
[17:06] So it's sort of like if you're doing a for loop, for i equals 0, i plus plus, i is less than 3.
Yeah, yes, that's exactly what it's like.
As soon as you get to the third one, then it's like, okay, now I can bounce out.
Yeah, and I don't give you it. I bounce out and I don't give you the last one you mentioned.
That made my head hurt, and for a moment I thought, maybe it's a length, maybe it's where you start and how long you continue for.
So, to test that theory, I put the same array into the filter, dot open square bracket one colon four close square bracket, and I figured if it's a length, then I should get one two three four.
No, I get 1, 2, 3. So it's starting at the thing with the index of 1, which is our number 1, and then it's doing 2, 3, and then it's stopping and not giving me the thing at index 4.
I 100% guarantee you I'm going to ask you about this later and say you never told us, just so you know.
I'm probably going to do that as well, because it makes no sense to me, and things that make no sense to me get forgotten. Yeah.
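The two slices just described can be tried directly. In this sketch the -c flag (compact output) is an addition of mine, used only to keep each resulting array on one line.

```shell
# The array whose values match their indexes.
array='[0, 1, 2, 3, 4, 5]'

# Start at index 0, stop before index 3.
first_three=$(echo "$array" | jq -c '.[0:3]')
# Start at index 1, stop before index 4: the second number is not a length.
one_to_three=$(echo "$array" | jq -c '.[1:4]')

echo "$first_three"
echo "$one_to_three"
```

If the second number were a length, the second slice would have four elements; it has three.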
[18:11] Now, thankfully, if you want to get the end of an array, you don't have to know how long the array is.
If you leave off, if you just omit the second number, then it will go to the end.
So if you would like everything from the third item on, it will be two colon.
Okay. Yeah, that makes sense.
We can also give our end value with a negative number.
So if we would like to stop one before the end we give minus one.
If we would like to stop two before the end we give minus two.
So turns out that you can also leave off, if you want to say start at zero, you can leave off the first number.
So if you would like everything in the array except for the last two digits, it's colon minus two.
That's actually sensible. So two colon and colon minus two, they both make sense to me, but none of the rest does.
So, if you said 1 colon 4, it would mean do 1, 2, 3, and don't do 4.
But if you do minus 2, it means minus 2.
Yeah, I know. That also seems inconsistent. It's not like minus 3.
I guess it's because there's no minus 0. I don't know. I don't know.
Yeah, like I say, my brain can't make it consistent, but I can prove to you with this array 0, 1, 2, 3, 4, 5, that this is what it does.
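A sketch of the open-ended slices, with one possible way to reconcile the negative end value: a negative index counts back from the end of the array, so in a six-element array minus 2 refers to index 4, and the slice still stops before that index. The -c flag is again only there to keep the output on one line.

```shell
array='[0, 1, 2, 3, 4, 5]'

# Omit the end: from index 2 to the end of the array.
from_third=$(echo "$array" | jq -c '.[2:]')
# Omit the start: from the beginning, stopping before the second-last element.
all_but_last_two=$(echo "$array" | jq -c '.[:-2]')
# In a six-element array, -2 is the same position as index 4.
same_with_four=$(echo "$array" | jq -c '.[:4]')

echo "$from_third"
echo "$all_but_last_two"
echo "$same_with_four"
```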
[19:41] So as I say, you can slice an array and the magic symbol is the colon.
[19:47] Now, something you may or may not have noticed while we were working around, particularly in the first examples where we were outputting our URL and outputting the name of the package, it was always coming out in JSON syntax.
So, the JSON syntax for a boolean is just true or false, so that's bare text, which is kinda what we usually want, and the JSON syntax for a number is just the digits.
But the JSON syntax for a string is a double-quoted string.
Which is why, when you tried to get the name, you got the name inside double quotes, and when you tried to get the URL for the bugs, you got the URL inside double quotes, because jq outputs JSON syntax.
[20:32] And the idea is that you can use JQ to sit between something that talks JSON and something that expects JSON, and transform the JSON between the talker and the receiver, which is very useful.
But if you're using JQ to sit between something that talks JSON and a shell script, or a terminal command, or something you show to a human being, you don't want those superfluous quotation marks.
You want the actual string.
Well, they thought of that. So you can stop that from happening with the minus minus raw minus output flag (--raw-output), or its shorter friend, minus lowercase r (-r).
So if we go back to our package.json file, we can say jq .author, which
[21:20] gets you the author of the package. And that will print out my name inside double quotes.
And we can use that inside a terminal command.
So we could say echo, check out this cool tool by, and then use the dollar open round bracket syntax to execute a terminal command inside our string and then include the results of that terminal command in the output.
And what we execute is jq .author and then the name of the file, which then prints out, check out this cool tool by, quote, Bart Busschots.
So what it looks like you're doing is insulting me with those air quote things we do.
Yeah, this cool tool by this Bart Busschots guy. This Bart Busschots.
Everybody knows Bart Busschots.
Exactly. It was ridiculous. So if we do the same thing but we throw in the minus r flag in the jq command, we get a much more respectful, check out this cool tool by Bart Busschots.
This totally normal guy. And dash R again means raw output? Yes. Okay.
[22:23] So beyond being funny from the sarcasm standpoint, we need to do that because we want to be able to pipe it into something else, like a shell script doesn't expect to see the value in quotes.
Exactly. And like I say, if you need it in JSON format, that's what you get by default.
But if you want it in human format, minus minus raw minus output (--raw-output), or just minus r (-r). Okay.
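A sketch of the difference raw output makes inside a command substitution. The one-key JSON here is a made-up stand-in for the package.json file; the real file's author field may look different.

```shell
# Hypothetical stand-in for the package.json file.
json='{"author":"Bart Busschots"}'

# Default output is JSON, so the string keeps its double quotes.
quoted="Check out this cool tool by $(echo "$json" | jq '.author')"
# With -r (short for --raw-output) the string is emitted without the quotes.
raw="Check out this cool tool by $(echo "$json" | jq -r '.author')"

echo "$quoted"
echo "$raw"
```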
Now, I need to tell you something I've been conveniently hiding from you by being careful not to accidentally run a command that causes this to happen.
But jq is designed from the ground up to do things in parallel.
So if you give jq one piece of json, it will apply its filter to that one piece of json and give you one output.
If you give jq two pieces of json, it won't complain.
It will apply the filter to both pieces of JSON and give you both outputs.
That's a good thing, right? Oh, it's a fantastic thing, but I haven't said that's what it does. So you may not have expected it to do that.
So if we pipe two JSON files at it, then we can get it to print out the content of both of those files.
So, I have in this month, in this installment's folder, or zip file, I have two extra files we didn't have last time. One of them is called ip-bartb.json.
[23:52] And the other one is called ip-podfeet.json.
[23:56] And they contain some information about the IP addresses of both of our web servers that I got from a web services API that spits JSON out at you when you tell it an IP address. I think we used it as one of our examples last time, actually.
No, we did. I definitely know we did, because I opened the show notes from last week to copy it out.
[24:17] So those two files contain information about two IP addresses.
So if I say cat space IP star, then it will print both files to standard out.
And if I then pipe that into jq, then jq receives two pieces of json.
And if you do that just without a filter, it will print you out the content of both files.
It doesn't make it into one JSON array. It gives you one piece of json, then a newline, and then another piece of json.
There is no separator between the two dictionaries it prints out, because it's printing out one value, newline, another value.
I thought you always had to have a filter with jq?
If you don't give it a filter that is equivalent to... You need to have a filter if you specify the files, because the files are the second argument.
But since we piped those into JQ, as part of standard in, to JQ...
They were standard out of the cat command, which became standard into jq, right?
Did I say that properly? You said that perfectly. And so that means that jq is default.
Well, no, no, it's good. Picture perfect. I couldn't out-pedant you if I tried.
[25:34] So then with the two files now as the standard into jq, since we're not going to be saying the file after we say jq, then we don't need to give it a filter?
Yeah, and no filter at all is equivalent to the filter period.
[25:52] That's like the default value of the first argument is period.
You can't skip giving it that filter if you want to give it a filename at the end.
Exactly, because it has to go second. And how do you go second if there is no first?
Who's on second? Oh wait, no, let's not do that.
And if we ask JQ to do a filter on those two files, we can just specify our filter.
So those two files each contain a dictionary, and one of the key-value pairs in the dictionary is named continent code.
So if I say cat IP star pipe that to jq with the filter period continent code, it gives me two strings as outputs, EU and AM, because my server is in Europe and yours is in America.
And it gives them one on each line, because it's giving us the first output, new line, the second output, and because I didn't specify a minus r flag it has outputted those as quoted strings.
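A sketch of one filter applied to two inputs. The two files are created on the fly as hypothetical stand-ins for the show's IP files, and the key names city and continent_code are assumptions about what the real files contain.

```shell
# Two hypothetical files standing in for the show's IP address files.
tmpdir=$(mktemp -d)
printf '{"city":"Amsterdam","continent_code":"EU"}\n' > "$tmpdir/ip-a.json"
printf '{"city":"San Francisco","continent_code":"AM"}\n' > "$tmpdir/ip-b.json"

# cat sends both files to standard out; jq applies the filter once per JSON value.
codes=$(cat "$tmpdir"/ip-*.json | jq '.continent_code')
echo "$codes"   # two quoted strings, one per line

rm -r "$tmpdir"
```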
[26:54] Okay. Now, it saw those two inputs as two different things to be processed separately.
Sometimes you want to apply the JSON filter once to the sum total of all of your inputs.
And the way you do that is that you tell JQ to build an array on the fly and turn your multiple inputs into one array of those values.
And that is what... Yeah.
And JSON calls that, or JQ calls that slurping.
So you can either use minus minus slurp or minus S, which is the shorthand for minus slurp. Or minus minus slurp. Okay.
Remind me, we need to talk after we're done so that nobody gets confused, that the minus minuses are turning into an N-dash?
They are in one or two places, and I've just fixed that one, and I fixed a bunch of them earlier. Yeah.
Okay, good. I need to teach this Mac. I know that's one of the banes of my existence when the Mac goes, Oh, let me just fix that for you.
Yeah, on most of my Macs I've gotten around to fixing it. Yeah.
On most of the Macs I turn off. I go through and I type really slowly.
I type dash, and I'm going, wait a minute. Now I'll give you the second one.
If you wait long enough, it'll sometimes let you do two in a row.
Well, I go into System Settings Keyboard and turn off the bloody Replace.
[28:19] I also turn off the Smart Quotes. Time saver, right?
Oh, it drives me nuts. Okay, so now I've kind of stomped over what we've learned here.
Okay, so we're going to use the slurp, or minus minus slurp, or minus S, in order to say I want to smash these two JSON files into one, and then apply the filter with JQ.
Yeah, and the way it smashes them into one is it turns them into an array.
So if you just run that there, if you pipe those into jq minus s without a filter, what you'll see is that instead of seeing two dictionaries, you see an array that contains two dictionaries. So it's now become one piece of JSON.
So, let's see, so JQ, so what we've got in the show notes, cat IP star, so we just said, you know, output the two files and then pipe it to JQ, but JQ minus S is going to say, I want to put these, make these an array with these two JSON objects, JSON dictionaries now in it. Okay.
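A sketch of slurping, using two made-up JSON dictionaries sent on standard in rather than the real files. The -c flag is only there to keep the resulting array on one line, and continent_code is an assumed key name.

```shell
# Two hypothetical JSON values, one per line, as cat would deliver them.
first='{"continent_code":"EU"}'
second='{"continent_code":"AM"}'

# Without -s the filter runs twice and there are two outputs.
separate=$(printf '%s\n' "$first" "$second" | jq -c '.')
# With -s (short for --slurp) the inputs are wrapped in one array and the filter runs once.
slurped=$(printf '%s\n' "$first" "$second" | jq -c -s '.')

echo "$separate"
echo "$slurped"
```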
[29:22] Okay. Okay, so now if we want to do anything in addition to that, like let's say we wanted to keep going and actually extract the continent code with that, does the filter go before or after the minus s? It has to go before, right?
Actually the minus s can go anywhere, because jq is one of those commands that will take its options anywhere, so if you give it the option anywhere in the argument list it'll be happy.
Which is nice of it. I don't like that idea. I don't like it.
I think it should be after the filter.
[29:56] Well, whatever works for you, because it doesn't care. So you do you, and it'll do its own thing. It'll be happy.
Okay. But it's just like, I'm getting it in my head that you've got to have the filter first, and then the stuff after.
So we've got to try it that way. Okay. Well, if that works, then do that.
[30:15] The only reason you would do this is because you have a reason that you want to turn that input into an array.
You don't want it to have multiple... You don't want the filter applied multiple times. You want the filter applied once.
So we haven't learned how, but if you need a filter to count how many elements there are, how many JSON objects are there in those five or six files, well, then you could take that array and count the elements in the array.
And that will tell you, oh, actually, this folder contains the JSON representing 500 IP addresses, that was good to know.
But you couldn't do that as easily. That's kind of a silly example.
But you could do things like average together.
Well, look, you can do cool things with arrays.
So there are times when you want it as an array. I can't tell you stuff I can't tell you yet. Very annoying.
[31:05] So I have a question here. We just talked about doing the extraction before or after the minus s.
I just tried to do jq and then, in single quotes, dot continent code, and then minus s, and it barfed all over me, probably because it's an array.
So I need to do square bracket zero dot continent code maybe?
Something along those lines? Right, but that would be, so what you're doing there is you're saying, I don't want to deal with these individually, I want to deal with these as a giant big collective.
And then you're saying, but I want the specific piece of information from one of them. So you wouldn't, if the problem to be solved is that you want to pull out the continent code, that's what you get by default, so you wouldn't use the slurp.
[31:49] Okay. Right? If you don't need an array, don't turn it into an array.
You can do things to arrays. You can do things to arrays, and so sometimes you want arrays.
I do. Yeah. I mean, as a programmer, it doesn't take much imagination to imagine how powerful it can be to have something as an array, and to be able to just get it automatically. I know how to loop over them.
Right, and that is something JQ can do for you as well.
But not today, not today. Okay.
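A sketch of what happens to filters once the inputs are slurped, with made-up data and an assumed key name. The length function used to count the elements is a jq builtin that has not been covered yet; it appears here only to illustrate the counting idea mentioned above.

```shell
first='{"continent_code":"EU"}'
second='{"continent_code":"AM"}'

# After slurping, the input is an array, so a bare key filter fails.
if ! printf '%s\n' "$first" "$second" | jq -s '.continent_code' >/dev/null 2>&1; then
  key_on_array="failed"
fi

# Index into the array first, then ask for the key.
first_code=$(printf '%s\n' "$first" "$second" | jq -s '.[0].continent_code')
# Count the slurped inputs (length is a jq builtin not yet covered in the series).
count=$(printf '%s\n' "$first" "$second" | jq -s 'length')

echo "$key_on_array $first_code $count"
```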
Okay, so JQ is happy to work in parallel, but it's not only happy to work in parallel, it doesn't care if you have more than one output for every one input.
So you can make branches, effectively. So think of it like this: four pieces of information come in, and you could have eight come out, or 80 come out.
You can explode one piece of it. You can explode one JSON object into many outputs, which is cool.
There are lots of ways of doing this, depending on the problem you're trying to solve. We're going to look at two of them today.
So the first thing you may often want to do is you may actually want two specific pieces of information from your JSON object.
So if you want both the name and the version from the this-ti.me package.json file, well,
[33:16] that's two pieces of information. And the way you do that is you say the filter for one, comma, the filter for the other, all inside your single quotes, so it is in fact two filters in one filter.
Oh, yeah, yeah, yeah, well, and now it's even making me happier that they're inside single quotes, because you'd certainly need that in order to put the comma in there.
That works really well. Okay. Yeah. That's cool.
Now, I've told you that it can take any number of inputs and do its thing, and then it will give you the appropriate number of outputs.
So in this case, I gave you one dictionary as the input, and I asked for two values, and so you get out the name followed by the version.
If I give you 3 inputs and I ask you for 3 things, do I get 1 1 1 2 2 2 3 3 3, or do I get the 3 outputs from the first input, followed by the 3 outputs from the second input, followed by the 3 outputs from the third input? Well, the answer is the second thing I said.
[34:17] So if we take our two files about IP addresses, and we stick those into the filter .cityname, comma, dot continent, comma, dot continent code, we are asking it for three pieces of information for every one input.
And what we get back is Amsterdam Europe EU, San Francisco Americas AM.
Which is very clearly, my city, my continent, my continent code, then your city, your continent, your continent code.
So it's all three outputs for the first input, followed by all three outputs for the second input.
That really makes a lot of sense to me, that it would take the first input, loop over it, and then take the second input and loop over that.
It would really bother me if it hadn't come out that way.
Yeah, me too. I'm glad it didn't try to do... What's that thing you call it on the printer?
Interleaving or something? No, that's interleaving is on the screen.
Either way, I'm glad it didn't collate the pages wrong.
I like it, right?
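A sketch of the comma and the order of its outputs. The data is invented and the key names (name, version, city, continent, continent_code) are assumptions for the example; -r is used so the strings print without quotes.

```shell
# One input, two filters separated by a comma: two outputs.
pkg='{"name":"demo-package","version":"1.0.0"}'
name_and_version=$(echo "$pkg" | jq -r '.name, .version')

# Two inputs, three filters: all three outputs for the first input,
# then all three for the second.
first='{"city":"Amsterdam","continent":"Europe","continent_code":"EU"}'
second='{"city":"San Francisco","continent":"Americas","continent_code":"AM"}'
details=$(printf '%s\n' "$first" "$second" | jq -r '.city, .continent, .continent_code')

echo "$name_and_version"
echo "$details"
```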
Now, another thing you may want to do is the opposite of the minus minus slurp flag, is you may want to take an array and split it into separate pieces.
[35:34] Programmers colloquially call that exploding an array. I don't know if it's a technical term, but it's so commonly used it may as well be a technical term.
It's a jargon term anyway.
So the syntax to explode an array in jq is simply to have two square brackets with nothing in them.
And that will explode every element in the array into a separate output.
So inside our this-time.me package.json file, there is a key in the top-level dictionary called keywords, which is an array.
And that array contains JavaScript and time zones. So it's just an array of two strings.
And so if we use the filter .keywords[] we get two outputs, JavaScript and time zones.
It's not the array with two elements, it's JavaScript time zones.
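A small sketch of the difference, assuming a cut-down, hypothetical package.json with just a name and a two-element keywords array (the real file has many more keys):

```shell
# Hypothetical, cut-down stand-in for the package.json file.
pkg='{"name":"example-package","keywords":["javascript","timezones"]}'

# Without the empty square brackets: one output, the array itself.
whole=$(printf '%s\n' "$pkg" | jq -c '.keywords')
printf '%s\n' "$whole"

# With the empty square brackets: the array is exploded into
# one separate output per element.
exploded=$(printf '%s\n' "$pkg" | jq '.keywords[]')
printf '%s\n' "$exploded"
```

The first command prints a single array on one line; the second prints each string as its own output.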
[36:34] Okay, so I'm trying to think of why you would want to do that, but maybe that gets back to when you're looping over an array, maybe you want to get all of the values.
Yeah, so I'm going to give you a little sneak peek into the future.
We're not going there today, but it is coming right up next week.
So remember in our examples last week, we had all of these pipes inside the single quotes, because inside JQ pipes are very important.
JQ is designed to be chained. So you would tend to do something like, get me all of the inputs I want to do something to, and then send them on to another filter which is going to do something to them.
So you may want to go find a specific array somewhere in a giant big dictionary, explode it out into its values, and then do something to all of those values.
And so what would immediately follow those two square brackets would be the pipe symbol, and then another filter.
And that second filter would happen, not once, but that second filter would happen once for everything in the array. Okay, okay.
So that is why exploding arrays is very important. We will be doing it a lot.
And it turns one input into many outputs.
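As a rough preview of that chaining, here is a sketch that explodes an array and pipes each element into a second filter. The sample data is hypothetical, and length is just a convenient built-in jq function chosen for illustration, not necessarily what the series uses next:

```shell
# Hypothetical, cut-down stand-in for the package.json file.
pkg='{"name":"example-package","keywords":["javascript","timezones"]}'

# Explode the array, then pipe every element into a second filter.
# The filter after the pipe runs once per element, here giving the
# number of characters in each keyword.
lengths=$(printf '%s\n' "$pkg" | jq '.keywords[] | length')
printf '%s\n' "$lengths"
```

One input goes in, and the second filter produces one output for each element of the exploded array.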
[37:47] What if you don't want it all? What if you only want two very specific parts of an array?
Well, we can do that by using a comma inside the square brackets.
So if I would like two values from the Nobel Prizes array, I would like the first... Ooh, I said first in my show notes and I put a one instead of a zero.
Don't make a silly programmer mistake. I'll be fixing the show notes before the humans see it.
That'll never happen again. That will never happen. Bart never makes that mistake of counting from one instead of zero, even though he's been programming since 1997.
If you would like the first and last Nobel Prize, then you can say dot prizes open square bracket zero comma minus one, because you can count from the back as well as from the front.
[38:39] And you could put comma minus two to get the second last one.
You know, you can put as many commas in there as you want, and you will get back exactly those elements as separate outputs.
You could put square bracket 1 comma 37 comma minus 1.
And you would get the second, and the 38th, and the last one.
Yeah. That's it exactly.
Yeah. And they can be in any order. You're dead right. You don't have to give them in any particular order. So if you need the second last one first, just put minus two first, you know, you can have them in any order you like, just separate them with commas, and you're all good.
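A short sketch of picking specific elements, assuming a tiny hypothetical prizes array with only a year in each entry (the real Nobel Prizes data set is far larger and richer):

```shell
# Hypothetical four-entry stand-in for the Nobel Prizes data.
data='{"prizes":[{"year":2023},{"year":2022},{"year":2021},{"year":2020}]}'

# First and last element: 0 counts from the front, -1 from the back.
first_last=$(printf '%s\n' "$data" | jq -c '.prizes[0, -1]')
printf '%s\n' "$first_last"

# The indexes can be given in any order; outputs follow that order.
reordered=$(printf '%s\n' "$data" | jq -c '.prizes[-2, 1]')
printf '%s\n' "$reordered"
```

Each index inside the square brackets produces its own separate output, in the order the indexes were written.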
[39:21] Now, very conveniently for me, you trod in a very, very common error, which is when you try to do something to an array that isn't an array, or that an array can't do, or vice versa, you try to do something array-like, to something that isn't an array, and jq gets all cranky at you.
You may want it to be tolerant of that error and, instead of shouting at you, just give no output.
And that can be a very valuable thing to do.
If you imagine you have five or six filters separated by commas, and some of them are sometimes not going to exist in your input, and you're perfectly happy to have them not exist, you don't want errors all over the place if they don't exist.
You just want them not in your output if they don't exist.
Then you can stick a question mark at the end of the thing that's going to cause you trouble, and that will suppress the error.
[40:18] So, the most common example of this, the one I can make happen on demand lots of times, is to treat something that isn't an array as if it's an array.
Because a lot of the time, jq will very gracefully answer with null if you ask it something that doesn't exist.
So our package.json file does not have a top-level key named waffles.
And if you say jq .waffles this-time.me-package.json, it won't give you
[40:47] an error because there are no waffles. It will say, well, waffles is null.
You asked me to go into this dictionary to find this key. There's no value for that, so null.
And it will do something that JavaScript the language will not allow you to do.
So in JavaScript, if you had an object and it didn't have a property named waffles, and you then tried to get a sub-property of waffles, it would throw an error, saying something like:
I can't get .pancakes of undefined.
[41:14] Jq doesn't care. Jq will just say, the pancakes item in the non-existent array also doesn't exist.
Null. No error. Just null. Yeah.
[41:26] Anything inside nothing is nothing. Have some nothing. You can even go into the non-existent waffles as an array and say, give me waffles open square bracket 1, and it will go: the non-existent second entry in the non-existent array is null. And it won't shout at you, which is kind of nice.
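A minimal sketch of that tolerance, assuming a hypothetical cut-down package.json that, like the real one, has no waffles key:

```shell
# Hypothetical stand-in for the package.json file; note there is no
# top-level key named waffles.
pkg='{"name":"example-package","version":"1.0.0"}'

# A missing key gives null, not an error.
missing_key=$(printf '%s\n' "$pkg" | jq '.waffles')

# A key inside the missing key is also just null.
nested=$(printf '%s\n' "$pkg" | jq '.waffles.pancakes')

# Indexing into the missing key as if it were an array is null too.
indexed=$(printf '%s\n' "$pkg" | jq '.waffles[1]')

printf '%s\n' "$missing_key" "$nested" "$indexed"
```

All three commands answer with null and exit without complaint.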
But the moment you ask it to do an implicit loop...
And exploding an array is an implicit loop.
It's basically saying, I want you to give me the first output, and the second output, and the third output, and keep doing that until there are no more outputs.
If you ask it to iterate in its internal lingo over an array that does not exist, it gets cranky.
Because it goes, but what? That doesn't make sense.
So if you say jq .waffles open square bracket close square bracket, no.
It throws an error at you straight away: cannot iterate over null.
[42:18] But what if you just wanted that, well, if there are no waffles, output nothing.
You just stick a question mark on the end and jq stops shouting at you and just gives you no output.
On the end where? Wait, on the end where?
So see the way we have .waffles, open square bracket, close square bracket, question mark.
That basically means if that square bracket fails to explode, I don't care.
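A sketch of the question mark in action, again assuming a hypothetical cut-down package.json with no waffles key. The status variable is only there to capture jq's exit code for illustration:

```shell
# Hypothetical stand-in for the package.json file, with no waffles key.
pkg='{"name":"example-package","version":"1.0.0"}'

# Exploding the missing array is an error: jq exits with a non-zero
# status (the error message itself is discarded here).
status=0
printf '%s\n' "$pkg" | jq '.waffles[]' >/dev/null 2>&1 || status=$?
printf 'exit status without the question mark: %s\n' "$status"

# With the question mark on the end, the error is suppressed and
# there is simply no output.
quiet=$(printf '%s\n' "$pkg" | jq '.waffles[]?')

# Useful in a comma-separated list where some pieces may not exist:
# the missing piece is skipped and the rest still comes out.
mixed=$(printf '%s\n' "$pkg" | jq -r '.name, .waffles[]?, .version')
printf '%s\n' "$mixed"
```

The question mark goes directly after the square brackets that might fail, so only that one piece of the filter is made tolerant.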
Okay, I was doing it for the audience. I could see it, but they might not be able to.
This is why, remember I keep saying this is where you provide value, because without you being here, I say silly things or I don't say things I should.
Well, that's interesting. Now, this doesn't answer the basic question, is why aren't there any waffles in our package.json?
[43:02] I mean, there should be. I know. Why aren't there any waffles?
Why aren't there any waffles in my kitchen?
Dessert is in the oven as we record this, but it's all healthy.
It's like, what is it today? It's raspberries and blackberries roasting together nicely in some maple syrup.
Oh no, trust me, they are yum. Now, they are going over chocolate pancakes, so they're not that healthy. But I don't have any waffles. At least they're pancakes.
I had some protein waffles recently that are quite delicious.
Well, my favourite gluten-free brand, they're called Promise, which is a cool name, and they're made in Ireland.
They've just released a whole new range of pancakes. So this week we're trying their chocolate ones, and next week we have their blueberry ones. So, we shall see.
Speaking of gluten-free, I noticed that my grandchildren's shampoo is gluten-free.
I never assumed there was gluten in my normal shampoo. I think it's just one of those things, right? I think it was sugar-free, too.
It's probably petroleum-free. Actually, it might not be petroleum-free.
No, it might not be. It's alcohol-free.
Talk about a buzzword, right? It's like, oh, look, it's gluten-free.
[44:15] And so is my soap. I don't believe you can ingest gluten. So are my batteries.
Maybe you can ingest gluten through your skin, though. I don't know.
Well, my darling beloved is celiac, and at no point has the handling of bread resulted in prolonged agony.
Whereas the consumption of a crumb most certainly does.
Okay, there we go, good, we know what it means. Well, you'll have to get back to us on this pancake thing.
But now back to JSON.
Well, back to JSON for my final thoughts, really. So I figured these are enough new concepts to bring in today.
And we've arrived at a very natural breaking point here.
[44:54] Because in the first installment, I gave you an overview. And then I showed you how to pretty print with all the options for how we make the pretty printing appropriately pretty.
And now we have done all of the straightforward extraction where we're saying give me exactly this.
Not a search, but a specific path.
So that's the equivalent I guess of typing a file path in the terminal or going into the finder and clicking a series of folders as opposed to typing into the find box and doing a search.
So, obviously, if I keep describing these JSON objects as being databases, and they are used as databases, well, what do you do with a database? You query it, right?
Give me all the Nobel Prizes for chemistry in the last 10 years.
We have a database of all the Nobel Prizes. They all have a year.
I should be able to do that.
What is the average number of winners for chemistry prizes?
I should be able to do that, right?
That involves getting all the prizes for chemistry, breaking them down by year, averaging them for each year, and then printing out those final values.
There's a lot of work going on in there.
But JQ can do each of those pieces, and it allows us to combine those pieces together to get us to where we want to go.
[46:17] So we're not going all the way to that advancedness in one step.
The next step is the ability to search, not extract. Well, that sounds fun.
It is fun. Let me ask you one final question here.
We went over quite explicitly the fact that jq is a command at the terminal level and jq is a language, right? Yeah.
And that you would always put the command inside a code block and the name of the library or whatever, the language would not be inside a code block.
How do you know which one you're saying when you've got it in your sentences here?
Like, if I was to try to figure out how to double check and make sure you did it right, I don't have a clue how I would do that.
Generally speaking, I will also have the word command after the one that is the command. And I will usually use the word filter after the one that is the language.
Ah, okay. Okay. JQ filter chaining, JQ functions we're going to talk about, but we have JQ command, can process, blah, blah, blah. Okay.
All right. I'm going to hopefully not worry my pretty little head about trying to figure out whether you got it right or not, and just assume.
I can't tell the difference yet.
So I am 80% sure I get it right, because I'm pretty good at always backticking every terminal command.
[47:43] Hey, we did the whole of taming the terminal. I got pretty good at that.
[47:48] But I mean, when you've got a paragraph of text, like your opening line says, in the previous installment we got a glimpse of what jq can do. At that point we're talking about the language. And then you said, and we looked at some examples of jq in action, and now that jq is in a code block. I wanted to have both of them in the opening sentence, but I can't tell: isn't jq in action jq the language?
[48:13] Well, the command makes the language go. Okay. I think I shouldn't worry my pretty little head about the distinction, is I think what we should do.
Yeah, when it's important, I'll draw your attention to it, and sometimes it just isn't.
Okay. Okay. We'll go with that. That sounds good. All right.
Well, this is fun, Bart. I enjoy this.
I like, I think I really like processing data. I think that makes me happy.
I enjoy that. I had a feeling this would be up your alley, because of your slight liking of Excel. Just teeny-tiny bit.
Yeah. Yep, yep, yep. You betcha. You will be happy to know, Alison, that in my work life, I am becoming not just an Excel tolerator, which I had become, not just an Excel fan.
I think I'm becoming an Excel evangelist.
Oh, nice, nice. And I'm thinking in formulae, I don't like the syntax, but I do like what it can do.
I think the next step is I end up liking its weird syntax. But for now, I'm just making it do powerful things and liking it.
How about pivot tables? You do any pivot tables yet?
[49:22] The data I have doesn't need pivoting, unfortunately.
What it does need... Basically, conditional formatting seems to be the most important thing I spend my life doing, with a lot of complex rules for it, because it's really hard to get people to see important things without that.
Yeah, absolutely. And I hate people who go around manually coding it all and then getting it wrong. And then when I look at the spreadsheet, my brain hurts.
It's like, no, no, no, no, no. If you're going to put... If the colour has a meaning, it has to be an algorithm, and I have to be able to look at the cell and tell the algorithm.
[49:59] I remember just a hundred years ago in, it wasn't Macworld magazine, I think it was Mac User magazine, which was before that.
They had an article about how to formulaically show your data like a green bar report.
And for the children amongst us here, in the dot matrix printer days, we used to print out our code and things on green striped paper.
[50:29] And it was the only way you could read all the way across a real wide-format piece of paper.
And so, this would let you create it so that your cells would always be green, white, green, white, green, white.
And no matter what you did to the data, they were always green, white, green, white.
And this was before conditional formatting. I mean, this was like hard-coding it by hand.
And I had that dog-eared and, you know, with a Post-it note on it.
Actually, Post-it notes probably weren't even invented when I had this.
And I would pull that thing out, and I would code up my things, so that only my spreadsheets would have green bar.
See, I'm telling you, this love of this goes way back. Well, my first year of programming, we had to hand in our assignments on the dot matrix printer and it came out in fan fold on a three foot wide printer on green and white striped paper.
There you go. That's green bar. That's green bar. And it was three foot wide.
So you could have long lines of code.
And yeah, yeah, that was, that's the way it was done. And good times, good times, good times. And it was fast.
It was really fast and noisy.
Oh, the lab assistant was never thinking, oh, I wonder if someone has finished and is ready to submit an assignment. Wait, wait, wait.
[51:43] Anyway, memories, I can hear it in my head. All right. We should stop going down memory lane.
Indeed we shall. And until we speak again, lots and lots of happy computing.
If you learn as much from Bart each week as I do, I'd like you to go over to lets-talk.ie, and press one of the buttons over there to help support him.
He does 98% of the work here, I'm just the stooge that listens to him and asks the dumb questions.
If you go over to lets-talk.ie, you can support him on Patreon, you can donate via PayPal, or you can use one of his referral links.
I really hope you'll go over and help him out. In the meantime, you can contact me at podfeet or check out all of the shows we do over there over at podfeet.com.
Thanks for listening and stay subscribed.