CCATP_2023_12

Transcript

[0:00] Music.

[0:08] Well, it's that time of the week again, it's time for Chit Chat Across the Pond.
This is episode number 781 for December 9th, 2023.
And I'm your host, Allison Sheridan. This week, our guest is Bart Busschots, back with Programming by Stealth, installment 157.
How are we doing today, Bart?
I am doing fine. 157, isn't that amazing? It really is. You'd think I'd know everything by now.

[0:34] I don't, and I work at this five days a week every week and I'm still nowhere near knowing everything. It's impossible.
I was joking with our boss that I'm trying to document myself out of existence, but I'm not even keeping up, let alone catching up.
I like it. I like it. Because you're learning faster than you can document.
Yeah, things change quicker than I can write them down. I do more stuff than I could ever document ever.
But nonetheless, it's still important to document stuff when you can, because you don't want to make yourself the single point of failure, because then you can't get sick and stuff. Or get promoted.
Or start off for a month of annual leave or get promoted. Right, right. Exactly.
Well, speaking of documentation, we have some good show notes.
I've actually already been through them. I'm like teacher's pet this week.
I know what we're going to talk about.
I'll still get lost, but we'll give it a go.

[1:22] So we are on our third, I think, installment of JQ, which is a language for querying the JSON markup language, I guess.
It's a way of writing data in a structured format. And so we started off by explaining the big picture and learning how to use it to make our JSON look pretty, which is already valuable because a lot of APIs spit out one giant, big, long line of glop, which is very difficult to understand, but JQ makes it pretty.
And then we learned how we can start to, I would say, extract very specific pieces of information last time.
So we sort of made a surgical strike to exactly the point in the data structure we wanted, but it wasn't searching. It was like going to an address instead of finding something.
Because we knew that we had to go in, you know, the key named this, and then the fourth element of the array, and then the key named that.
There was no, find me something like this. It was, go exactly here. — Right, right.
So I guess the difference between SatNav and Googling for the best coffee in the neighborhood.
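As a reminder of the "go to an address" style from last time, here is a minimal sketch. The sample JSON is a made-up stand-in for the real Nobel Prizes file, so its values are assumptions for illustration only.

```shell
# Hand-made sample data (hypothetical values, not the real dataset).
sample='{"prizes":[{"year":"2001","category":"physics"},{"year":"2001","category":"peace"}]}'

# Go straight to one address: the key named prizes, then the second
# element of the array (index 1), then the key named category.
echo "$sample" | jq '.prizes[1].category'   # "peace"
```

Nothing is being searched for here; the filter only works because we already know exactly where the value lives.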

[2:26] Good analogy, you're right. Yeah. So today, we are moving towards, our aim at the end of today is to be able to do that kind of querying, to be able to ask.
So we've been using as an example, a really fun dataset I found, which is information about all the Nobel Prizes as a JSON file.
And so we've been using that in our examples in the previous two installments, and we're going to keep using it this time, and next time, and the time after, and the time after that.
I've been planning. And we want to be able to answer very human-y questions by the end of this installment, like, who won the 2000 Nobel Prize in Medicine?
Which prizes were won by people with the surname Curie? We want to be able to query the data for those kind of things.
But I think what's interesting about the way JQ works is, what my brain just heard was an if-then-else sort of thing.
Like, if year equals 2000, or if surname equals curie, but it doesn't work that way. You're not doing if-then-else statements in this.

[3:25] No, it's basically, it does it as a series of, well, they call them filters, right?
The language of JQ is they are filters, right? So the thing we've been writing is called a filter, and when we used a comma, we could do two filters, one after the other.
And what we're going to learn in about three minutes is that you can make the filters talk to each other.
Like in the terminal, you can chain multiple simple commands.
JQ is about taking filters and connecting them together. And you basically plumb your problem as a sequence of these filters.
And each filter is nice and simple and easy to understand. And the magic comes from how you connect them together.
Which always reminds me of our good friend Tim Verpoorten, who used to say that those little apps in your menu bar did one thing and did it well.
Well, a JQ filter does one thing. And if you try to make a JQ filter do lots of things, you will cry, achieve nothing and get very cranky, because this is what I used to do.
And now I have completely embraced the idea that you just have lots of filters, you chain them together, you connect them together in lots of different ways.
And that's where the power comes from.
This all comes down to your love of LEGO bricks, doesn't it?
Yeah, it does. Yeah, totally. To be fair.

[4:30] So the difficult thing about this installment, and why I've kind of been holding it back until our third installment, is that in order to get anywhere from here, we need to learn three things at once, which means this has to be an episode with three new ideas, no matter what I do.
And that's... I'm always worried when I have to do three things at once.
So we need to learn that we can chain filters together, like we do in the terminal with terminal commands.
We need to learn that there is something called an operator.
So in the JavaScript language, we know we have plus as an operator.
It takes whatever's on the left, whatever's on the right, adds them together and makes a new value.
So that's how an operator does things, right? Whatever's on my left, whatever's on my right, I do something. But JQ also has operators.
We're going to meet some of them in future installments, but the ones we're interested in today are the ones for doing logic-y stuff.
Less than, equal to, because how are we going to query data if we can't express that we want something the same as, or something greater than, or something less than? They're obviously very important operators.
And JQ doesn't have operators for everything, because it doesn't make sense to have an operator for everything.
There's not enough symbols on the keyboard, and you'd forget them, for a start.
So JQ takes the rest of its functionality as functions.

[5:49] We've done enough programming between the shell script and JavaScript not to be surprised that there are functions. And so JQ does indeed have functions.
So we need to learn how to chain lots of filters together, use operators, and use functions, and then we can answer those very simple questions I just gave you.
So let us dive straight in with filter chaining.

[6:11] So I told you at the very, very start, and I said I was going to keep telling you, that whenever you're writing the jq filter part of the jq command in the terminal, always single quote your filters, because otherwise it is not a case of if you will get caught out, but when.
Because jq's syntax borrows from or copies the shell.
So if you put it in single quotes, you're saying to bash or zsh, this isn't for you.
This is an argument you are to pass completely unaltered to the jq terminal command and it shall interpret it.
And the biggest reason for this is because the symbol that JQ uses to connect one filter to another is the pipe, i.e. exactly the same symbol as Bash uses.
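A small sketch of why the single quotes matter, using a throwaway JSON object that is invented for this example rather than taken from the show notes:

```shell
# Inside single quotes the | belongs to jq: it connects the filter .a
# to the filter .b, and the shell passes the whole string through untouched.
echo '{"a":{"b":1}}' | jq '.a | .b'   # 1

# Without the quotes the shell would claim that | for itself and try to
# run a separate command named ".b", so the error would come from the
# shell, not from jq.
```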
Why do I get the feeling that you've been caught out more than once?
Oh yeah. That you have to keep telling us.

[7:09] Oh, while writing these very show notes. It's like, what?
That error literally doesn't make any... Oh, that's not an error from JQ.
That's an error from Bash. OK, why did Bash? Oh, yeah.
All the time. All the time.
So you will. I'm sorry, I keep saying it over and over again.
I'm glad you keep reminding us, though.
Yeah, so on the terminal, we're used to the idea that standard out becomes standard in.
Well, I told you last time that in JQ world, each JQ filter takes one or more inputs, applies itself one after the other after the other to each of its inputs and will produce one or more outputs.
And it doesn't have to be the same number. So when we use the two square brackets to explode an array, we took one input, an array, and that filter spit out lots of outputs.
Or when we use the syntax for slicing an array, we took one array in and got, say, the first three elements or the last two elements or whatever.
So it can be an n to n or an n to m, I guess, because both numbers can be different.
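The point that the number of inputs and outputs need not match can be sketched with a tiny array invented for this example:

```shell
# One input (an array), three outputs: the empty square brackets
# explode the array into its separate elements.
echo '["a","b","c"]' | jq -c '.[]'
# "a"
# "b"
# "c"

# One input, one output: a slice keeps the first two elements
# and returns them as a new, smaller array.
echo '["a","b","c"]' | jq -c '.[:2]'   # ["a","b"]
```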

[8:08] And as we start to use functions and things, the amount of transformation that can happen within a filter becomes ever greater.
But at the end of the day, there is going to be an amount of inputs, they're going to be processed in parallel by this filter to produce an amount of outputs, and they will become the input to the next filter in the chain.
And the data ripples its way through, and each filter transforms the data in some way. Maybe it pulls a small piece out and throws the rest away, as we'll learn much later, not today.
You can do math and stuff with the data. You could take in a giant big array and spit out four, which might be the average number of Nobel laureates per year or something.
That's probably not true, I just made that up, but you can do that kind of thing.
Your filters can do anything to transform the data, but it's an amount of data in, I'll do myself once for every input, and I will produce an amount of output.
And then the next filter does the same, and the next filter does the same, and the next filter does the same.
On and on and on and on you go. And so that pipe is very much your friend.
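Here is a minimal sketch of data rippling through a chain of filters. The sample JSON is a made-up stand-in for the Nobel Prizes file, so its values are assumptions.

```shell
# Hand-made sample data (hypothetical values, not the real dataset).
sample='{"prizes":[{"year":"2001","category":"physics"},{"year":"2002","category":"peace"}]}'

# Three small filters chained with jq's pipe, all inside the single quotes:
#   .prizes  picks out the array       (1 input, 1 output)
#   .[]      explodes the array        (1 input, 2 outputs)
#   .year    pulls one key from each   (2 inputs, 2 outputs)
echo "$sample" | jq '.prizes | .[] | .year'
# "2001"
# "2002"
```

Note that the first pipe on that line belongs to the shell and the two inside the quotes belong to jq.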

[9:17] So we have already learned in the previous installment that we can use the comma symbol.

[9:24] To basically... Don't think of it in your mind as "and". Think of it as "and also".
I've been trying to think of a piece of English to say that doesn't sound like it's a logic thing because when you say .year, .category, you're saying I want the year and I also want the category.
Okay. So basically there are two filters. It's like a list.
It's like a list, exactly. Like I want the year, I want the category, I want the blah.
Now, they can be anything a filter can be.
So, as we start to learn that we can call functions, each one in the comma list could become something really complicated, but the comma just means, and also.
The other thing is that the commas are less... The pipe takes precedence.
So if you have something pipe, something comma something, the pipe happens first.
So, in the example in the show notes, we take the prizes and we explode it, and then we pipe that into .year, .category.
Well, it's not that you get all of the years from all of those explosions and then one .category comes out.
No, no, no. The pipe happens first, so both year and category come out for each and every single prize.
Okay, so it finds the first prize, it gives you the year and the category.
Finds the second prize, year and the category. Exactly.

[10:51] So when you run that filter, when you run that JQ command, what you will see is year category, year category.
So that will basically give you the listing of all the Nobel Prizes that have ever existed, right?
The 2002 Prize for Physics, the 2002 Prize for Medicine, the 2002 Prize for Peace, yada, yada, yada. So they all come in one after the other.
And it is important to understand that the pipe happens first, and then you have the and also, which is what the comma gives you. Okay.
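The year-and-also-category listing can be sketched as follows, again with made-up sample data standing in for the real file:

```shell
# Hand-made sample data (hypothetical values, not the real dataset).
sample='{"prizes":[{"year":"2001","category":"physics"},{"year":"2002","category":"peace"}]}'

# Each exploded prize is handed to both filters in the comma list,
# so the output alternates year, category, year, category.
echo "$sample" | jq '.prizes[] | .year, .category'
# "2001"
# "physics"
# "2002"
# "peace"
```

In other words jq reads this filter as `.prizes[] | (.year, .category)`: the whole comma list sits on the right-hand side of the pipe and runs once per prize.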
The other thing, as you start to build these things up to be more complicated, you're going to end up wanting to group your filters because you may want to do one big thing and then do some little piping around with lots of things to make one final answer that gets piped to somewhere else.
So you may need to decide to group your pipes.

[11:40] Because you may not want it all to go the one way.
So I did my best to find a way to show you that in the show notes, slightly contrived example, but it's an interesting thing that there were some years, actually, okay, let me, I'm getting lost in my own show notes here.
So let's go back to our example. So we take dot prizes, open square bracket, close bracket, that explodes the prizes array into lots of separate things.
That thing gets piped into dot year comma dot category. So out will come those tuples and everything, which is great.
And the roundy brackets are for grouping things together. So if we wanted to make our query above a little bit better, so don't just list the year and the category, we also want the surnames of each recipient of each prize.
Well, now we can't just pipe that straight through because the surnames aren't there at the top level next to category and year.
They're inside an array of laureates. Oh, that's sub to dot, so there's prizes and then laureates are inside prizes.

[12:48] Exactly. So the prizes contains a dictionary, which has a year, which is just a number, a category, which is just a string, and an array called laureates.
And then in that array called laureates, you have a dictionary which has surname and first name, and also why you got the prize, which is called the motivation.
So if we want to get the surnames out as well, then we're going to have to explode the laureates as a third thing.
So we take our existing query, and we put another comma to say, and we're going to do something else.
And the something else has to first explode the laureates and then get the surname.
If we don't use the parentheses, the pipe will say, the way JQ would see it is you want the category, the year, the laureates, and then pipe all of that to .surname, which will fail spectacularly, because 2002 is a year.
You try to get the dot surname of 2002, it will tell you that's poop.

[13:50] You try to get the dot surname of an array, it will say, no, that's not the index in this array.
So that's nonsense. So we need to tell JQ, no, no, no, no, no.
I want you to explode it out, get the surnames, and then take all of that as the third thing.

[14:05] It's not a thing anymore. Can I describe it now, how it's written in text?
So he's got JQ, and of course we're going to be in single quotes because we don't want to be talking to the shell.

[14:16] And so first he explodes the array dot prizes, and then he pipes that, and the three things he wants to know are dot year, comma, otherwise known as and also, dot category, comma, and also, and then the third thing that he wants he's got in parentheses, which is the array dot laureates, exploded, piped to dot surname.
So that thing together says, go into the laureates, pull me all the surnames, and make that be the third thing. So .year, .category, and .surname.
And then the file name. Now that one third thing becomes a list.
So what actually ends up happening is that you see 2002 Physics, three or four names.
2002 Chemistry, three or four names. 2002 Peace, three or four names.
And back, back, back, back. So it's still happening in order, but the third thing becomes many things.
Because there is usually more than one laureate.
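A minimal sketch of the grouped filter just described. The sample data and the surnames in it are invented placeholders; only the key names follow what is described above.

```shell
# Hand-made sample data (hypothetical values, not the real dataset).
sample='{"prizes":[{"year":"2001","category":"physics","laureates":[{"surname":"Alpha"},{"surname":"Beta"}]}]}'

# The parentheses make "explode the laureates, then get the surname"
# a single third item in the comma list.
echo "$sample" | jq '.prizes[] | .year, .category, (.laureates[] | .surname)'
# "2001"
# "physics"
# "Alpha"
# "Beta"

# Without the parentheses the year string itself is piped to .surname,
# and jq stops with an error because a string has no keys.
echo "$sample" | jq '.prizes[] | .year, .category, .laureates[] | .surname' \
  || echo "jq failed, as expected"
```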
Now, I found, while trying to make this example, the perfect excuse to show you something I told you last time.
So last time I said that sometimes something doesn't exist and jq gets very, very cranky and says, well, I can't go into that array, it doesn't exist.
And if you want to make it not give an error, you just put a question mark after it.
And when I tried to run that very innocent looking query without that question mark on dot laureate square bracket question mark, I got an error.

[15:43] Which made my head hurt because that implied that there were years where there were no Nobel prizes.

[15:49] There are years where there were no Nobel prizes. It's called the First World War and the Second World War. They didn't give out prizes. They gave the money to charity.
And so what's in the data set is an entry that has no laureates and a note saying, this year the prize money was donated to blah, blah, blah.
Oh, that's interesting. What's the motivation?
Well, so the motivation is the explanation of what it is they gave it as charity to.
But they called it overall motivation instead of putting it inside the array of laureates. So there is no .laureates in the years where there are no winners.
Which I didn't know until I ran a query that blew up. And then I put the question mark in and it was well again.
So it just proves that the question mark works to stop errors.
It just will ignore a year where there are no laureates.
They're just not included. Interesting.
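The effect of the question mark can be sketched with a made-up sample in which the middle prize has no laureates key. The values, and the exact spelling of the `overallMotivation` key, are assumptions for illustration.

```shell
# Hand-made sample data (hypothetical values, not the real dataset).
sample='{"prizes":[
  {"year":"2001","laureates":[{"surname":"Alpha"}]},
  {"year":"2002","overallMotivation":"No prize awarded"},
  {"year":"2003","laureates":[{"surname":"Beta"}]}
]}'

# With the question mark, the prize without laureates is silently skipped.
echo "$sample" | jq '.prizes[] | .laureates[]? | .surname'
# "Alpha"
# "Beta"

# Without it, jq prints "Alpha" and then stops at the gap, because
# .laureates is null there and null cannot be exploded.
echo "$sample" | jq '.prizes[] | .laureates[] | .surname' \
  || echo "jq failed, as expected"
```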

[16:41] Okay. The reason I'm pausing is because when I read through these notes the other day, I thought I tested it without the question mark just to see how annoyed is it.
And I thought it came back... It will give you some output. I thought it worked, but it just said null for a bunch of them.
I thought that's what I remembered seeing. Okay, that's not what happened to me.
What happened to me is I got a few of them, and then the first time I hit a gap, it stopped going any further in time.
Oh, okay. Oh, JQ error at Nobel Prizes, cannot iterate over null. Okay.
That's what I would have thought, yeah. Okay, that makes more sense.
Yeah, because what you've told it is, I want all of the entries inside null, and it's gone and went, there are no entries inside null. That's a nonsense statement.
How do I get the list of nothingness?
Yeah, it'd be nice if it told you where. It says, at nobelprizes.json colon zero.
Well, I bet it wasn't. Yeah, so the entire... Well, you see, the nobelprizes.json file has the JSON in one giant big line, so it is giving you the line number. The problem is the entire file is one line.
Ah, well, that's not very helpful.
Also they gave it to you in computer-centric line zero. Who the hell thinks of line zero? Right, right.

[17:57] Okay, so let us move on to our second, that's our first piece of information, which I'm hoping wasn't too bad.
We can chain filters together and we can group them with parentheses.
So far so not terrifying, I hope.
So second piece of new information.
Operators inside jq. This is very much in keeping with how other languages work.
The operator goes in the middle, and then you have a value on the left, a value on the right, and the whole lot gets replaced with a new value calculated by the operator. Whatever the operator says on the tin is what will result.

[18:33] Some languages allow you to have things like unary operators that take only the input from the left and don't expect anything from the right, like plus plus, for example.
In JavaScript we can say x++.
That's an operator but only has one side. JQ, way more simplistic.
Operators have two sides. Full stop, end of story. There is no unary stuff.
Which is interesting, but we'll come to that shortly.

[18:59] Now, this is a moment where I need to... I initially wrote these show notes and left this bit out because it was so obvious to me I'd forgotten that...
I didn't even think about it.
It's not that I decided not to include it, it's that I just walked right by it and never saw it. But actually, I made an assumption that I shouldn't have made.
So everything we have done up to now, we have told jq the names of things.
Dot year, dot category, dot laureates, right? They are names within the data structure.
We haven't actually told jq a value. We haven't told it true or waffles, right?
A string, right? We have only given it names of things.
But if we're going to do a comparison, is the year greater than 2000?
Well, we now have to specify a value. What are we being greater than?
So we should look at data types. So when we learned JavaScript, we spent ages learning that there are different types of data and that you write them in different ways and they have different meanings.
And I just forgot to include that minor fact in these show notes because I'm too used to this and that's why I have you.

[20:11] So let us start by taking a slight step back and reminding ourselves of the JSON syntax because JSON is storing purely values. And so its syntax is all about how do I express a value?
So JSON actually has one, two, three, four, five, six data types.
The first data type is very confusing.
It is an explicit piece of data to say, I mean nothing, only to mean nothing explicitly. And that is null.
So null is, there is one value of type null, that is the value null.
And it means I'm not here. So it doesn't mean zero. It is a data type. It doesn't mean NaN.
— It means null. — Exactly. It means null. It means nothingness as a thing.

[20:59] It can go in an array. The first element of an array can be null.
— Well, in 1947, the prize went to null.
— Yeah, the laureates were null, because there is no laureates, therefore null. Yeah.
The other data type JSON understands is booleans, and they have the two keywords true and false.
A boolean is either true or false. And so you write it in JSON as t-r-u-e or f-a-l-s-e.
JSON also understands numbers, and we write those in the way we westerners write numbers.
So digits, perhaps with periods, perhaps with a minus sign.
We just write the number. Not quoted, we just write the number.
That makes it a number in JSON.
Then we have strings, and JSON is very strict on strings. So in JavaScript land, we could use single quotes or double quotes, and it didn't make any difference.
JSON is way stricter. It's double quotes or nothing.
It's just double quotes. No choice.
JSON then has the syntax for an array, which is open a square bracket, one or more values separated by commas, close the square bracket.
And any value can be in an array. So you can have null, false, 11, the string "something". They can all go in an array.
And you can put an array in an array.

[22:11] And infinity ensues. And the last thing you can have in JSON is a dictionary, which is open curly bracket, the name of the key as a string, colon, the value, which could be a string, it could be a number, it could be an array, it could be null, it could be another object, or sorry, another dictionary.
So you can have dictionaries all the way down, and arrays, and you can nest them together ad infinitum, and hey presto, you have a database of Nobel laureates or whatever you'd like, right?
Those six atoms do it all. So they are the 6 data types in JSON.
JQ is a querying language for JSON. Thank goodness the authors of JQ didn't try to reinvent any wheels.
If you would like to represent NULL as a value in JQ it is N-U-L-L.
If you would like a boolean it is T-R-U-E or F-A-L-S-E. If you want a number it's the digits, perhaps with a period, perhaps with a minus sign.
And if you want a string, it is double quotes or go home.
So the rules are the same.
Phew! So that now means that we can tell, oh, that's a number, that's a string, that's null, that's a Boolean.
Because the rules are the same as they are in JSON. I like the strictness of this, and the structure, and it's clean.

[23:29] Yes, yeah. JSON is a subset of JavaScript that is very strict, right? The J in JSON stands for JavaScript.
Okay. It's JavaScript Object Notation, JSON. Oh. Oh, okay. Didn't know that. There you go.
Okay, so the next thing is, in order for me to show you JQ operators in action, I actually am going to take a value, an operator, and a value, which means that the input to the jq command is nothing.
I just want to run a filter that has a value, an operator, and a value, and I want to see what that evaluates to, which is a perfectly valid thing to do.

[24:07] But jq kind of assumes that its job is to process data, so it kind of expects to be handed some.
So its default behavior, if you don't give it any data, is to assume you're going to type it on the keyboard, and it sits there and waits for you to type, which is not actually what we want.
So you can tell it, no, no, no, I really did mean you not to have any input, with the long flag minus minus null minus input, or the much more convenient minus n, which in my head I think of as minus no input, but it's technically null input.
So if we want to see the operators in practice, we can say jq minus n, and then we can just run a filter that has two values, an operator, etc. So from our point of view today, the most important operators to learn about are the comparison operators.
Because we're going to compare things to each other.
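A quick sketch of running jq with no input at all. Each filter here is just a literal value, written exactly as it would be in JSON, and jq echoes it back.

```shell
# -n is the short form of --null-input: jq does not wait for any data.
jq -n '42'                # 42
jq -n '"waffles"'         # "waffles"
jq -n 'null'              # null
jq --null-input 'true'    # true
```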

[25:00] And the comparison operators always return a boolean.
So if you say is this equal to this, it will be true or false.
If you say is this not equal to this, it will be true or false.
Is this less than this, greater than this, it's always going to be a boolean.
The only output that comes out of a comparison operator is true or false.
And we have all of the operators you would come to expect. We have double equals for is this equal to?
We have exclamation mark equal to or bang equal to, depending on which side of the Atlantic you're from, for not equal to.
We have a single Chevron for less than, the opposite Chevron for greater than, and we have Chevron equals and the other Chevron equals for less than or equal to and greater than or equal to.
That's funny, you're calling them Chevrons. I think of them as the less than or greater than symbols.
I know, so do I, but I couldn't say that because the less than symbol for less than sounds a bit silly.

[25:52] Or the angle brackets, as I also sometimes call them. So if you look in the show notes, you will see all of the commands, and I have put in a comment at the end of the terminal command the output it produces.
So if you say jq minus n, and then in single quotes, waffles as a string, double equals waffles as a string, shock and or horror, jq will tell you true.
Waffles do indeed equal waffles.
If you say waffles as a string double equals pancakes as a string, jq will quite correctly tell you that they are not the same. False.
For not equal to, very sensibly, waffles is not equal to waffles returns false.
Waffles is not equal to pancakes returns true, because waffles are indeed not pancakes.
Less than works the same. Greater than, greater than or equal to.
They all do exactly what you think.
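The comparison operators described above can be sketched like this, with the result each command prints shown as a trailing comment. The numeric examples are additions for illustration.

```shell
jq -n '"waffles" == "waffles"'    # true
jq -n '"waffles" == "pancakes"'   # false
jq -n '"waffles" != "waffles"'    # false
jq -n '"waffles" != "pancakes"'   # true
jq -n '1 < 2'                     # true
jq -n '1 > 2'                     # false
jq -n '2 <= 2'                    # true
jq -n '2 >= 3'                    # false
```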
But, but, but, there is a subtlety to double equals.
So if you remember all the way back to JavaScript in installment, I don't know, 14 or something, I actually think it is genuinely that long ago, we learned that in JavaScript we have two forms of checking for equality.
We have double equals and triple equals.
And double equals was like... Triple was like the strict, like is identical to, right?
So the number 42 is equal to the string 42. Is that the one that double equals would be true, but triple equal would be false?

[27:16] Yes, yes, that's it perfectly in JavaScript.
In JQ, you are always in the strict mode. So in JQ, double equals means what triple equals means in JavaScript.
So like you were saying, JQ is a very strict language. Well, its equality is strict. So, if you say jq minus n, the digits 42 double equals the digit 2, quite rightly that's false.
If you say jq minus n, the digits 42 double equals 42, quite true, great.
If you say jq minus n, the digits 42 double equals the string 42, false.
They're not considered the same. It's a subtlety.
And this is important for us because in our data set, the years are strings because the people who... And I actually downloaded it from the Nobel Commission's website, from the Nobel Committee's website.
The data set comes from the actual Nobel people, and they aren't very good data scientists because they have encoded their numbers as strings. But anyway.
Well, they did try... They started a long time ago, to be fair.
I don't think the JSON data set goes back to 1901.
Just a hunch. But maybe that's how it was typed out or something on paper, parchment or something.
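A short sketch of strict equality, and of why it matters when years are stored as strings. The one-key object is a made-up stand-in for a single prize.

```shell
jq -n '42 == 2'      # false
jq -n '42 == 42'     # true
jq -n '42 == "42"'   # false: a number is never equal to a string

# With the year stored as a string, it has to be compared to a string.
echo '{"year":"2000"}' | jq '.year == "2000"'   # true
echo '{"year":"2000"}' | jq '.year == 2000'     # false
```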

[28:35] For the IBM punch card. Actually, they probably existed back then, you know.
International business machines were a thing back then. Anyway, the second type of operator I want to share with you today is the obvious companion to the comparison operators. It is the Boolean operators.
So basically, this is your AND and your OR and your NOT. Only there is no not operator in jq, because not is unary, right? Not doesn't take two inputs.
That doesn't make sense. You not a single thing.
Oh, right. Yeah. So not is not an operator, but it does exist, which is going to be our transition point of functions.
So we have and and or available to us as our Boolean operators, and they will produce true or false.
And this brings us to another thing that every language has to grapple with, and every language gets to make up its own rules.

[29:32] If you compare a Boolean to a Boolean, then Boolean logic is the only thing in play, and it's really obvious what will happen.
Everything just obeys the truth table George... Was he George or Robert?
Mr. Boole in Cork, Ireland, invented about a century ago.
They're very fond of Mr. Boole down in Cork. They have lots of buildings named after him down there. Very pretty campus.
Anyway, Boole gave us the rules, true and true is true, true and false is false, false and true is false.
We have these simple rules. So as long as you're dealing with booleans all the way down, it's all easy.
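With booleans on both sides, the two operators behave as the usual truth tables say. A minimal sketch:

```shell
jq -n 'true and true'     # true
jq -n 'true and false'    # false
jq -n 'false and true'    # false
jq -n 'true or false'     # true
jq -n 'false or false'    # false
```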
But if you say true and 42, or true and an empty string, the language has to decide how to handle this.
Some languages handle this by shouting at you and giving you a giant big error, and telling you that the boolean operators can only handle true and false.
Don't you dare give these boolean operators a string to work on.
Most languages are more forgiving, and they have a set of rules that says if I'm forced to treat something like a Boolean that isn't a Boolean I will apply this algorithm to figure out what that is. Is it true or is it false?
And so in JavaScript we spent a lot of time on this because JavaScript rules are quite... they're complex but useful.
So we use the term truthy and falsy for things that we converted to true or false without ever explicitly writing them as true or false. We said that the number 42 is truthy because it evaluates to true.

[31:02] Well JQ has a much much stricter approach to these things. The rule in JQ is so simple it sounds complicated.
Every single possible value, apart from false and null, is true.

[31:14] Everything that isn't FALSE or NULL is true.
That's very counterintuitive. Because that means an empty array is true.
JavaScript, no. JavaScript, they were false. It was really useful to tell if we had no inputs or something.
We would say if, the name of the array, and it would be false if it was empty. Really useful.
An empty dictionary. JavaScript, it was false, which was very useful.
JQ, no, it's true, because it's not null or false, so it's true.
The empty string, true, JavaScript called it false. The number zero, everywhere in computer science, calls the number zero false. Not JQ.
No, it's not F-A-L-S-E or N-U-L-L. It's true.
So everything that isn't null or false is true. And that is very counterintuitive.
Right, right. And you do lose some utility with that, like you said, right?
I argue you do, yeah. Now, there are functions we will learn about next time for checking all sorts of things that will get us around these…strange choices.

[32:26] Right, it took me a whole episode to explain how JavaScript does it.
It took me like a sentence that's so simple it's deceptive because you just, it can't, what do you mean it's that simple? It can't be. How can zero be true?
Because it's not false or null. I'm going to pull back the curtain.
When I read the show notes about three or four days ago I was writing to Bart going, this doesn't make any sense. What are you talking about?
And he kept saying the same thing over and over and over again.
I'm going, "Yeah, no, but that's not my question." And you're like, "No, that is your question."
False and null. That's it. F-A-L-S-E. But you can't say F-A-L-S-E when you're typing because it's already that way.
And it took us probably six times before we went, I went, oh, oh, you mean exactly that.
Yeah, exactly. It's too simple. It can't be right. You flesh this out quite a bit to make that as clear as possible.
So if anybody's reading this and not having Bart describe it and say it in these words, this is really good now.

[33:25] Yeah, and so we have commands in the show notes to prove what I've just said to you.
So again, we're using our JQ minus N, so we can say true and true is true, true and false is false, so far so good.
True and null, false. Yay! True and 42, true. True and zero, true.
True and waffles, true. True and false, that's the string. So the string, double quote, F-A-L-S-E, that's a string. True.
The empty string. True.
If we want to test an array or something, we need to get a bit cleverer.
So instead of me using JQ minus N, I went back to the old thing we learned a million times when we were doing our various things in shell script.
Echo the string I want, pipe it into the JQ command. So I am echoing the JSON syntax for the array false comma zero comma null.
And I am piping that into JQ and JQ says true. So I give it the jq string true and dot, dot representing the input.
So dot is our array and jq says, yep, that's true.
Because that array is not false or null. False or null and null are true.
Yeah. Also, the empty array, true.
We can pass it a dictionary, breakfast colon pancakes, dessert colon waffles, which I think is a good day.
Pipe that to jq, it says true. The empty dictionary. True.
Everything that's not null or false is true. It's shocking. But it's true.
If you'll excuse the terrible pun.
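A minimal sketch of the truthiness rule described above, assuming jq is installed and on the path. The specific test values are illustrative and may not match the show notes exactly.

```shell
# Only false and null count as false in jq; every other value counts as true.
jq -n 'true and 0'            # true
jq -n 'true and ""'           # true
jq -n 'true and "false"'      # true (a string, not the boolean false)
jq -n 'true and null'         # false
jq -n 'true and false'        # false
# Arrays and dictionaries are piped in so that . holds them.
echo '[]' | jq 'true and .'   # true (even the empty array)
echo '{}' | jq 'true and .'   # true (even the empty dictionary)
```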

[34:54] Right. So our third new thing. Functions in jq.
So in jq, like in any other language, functions can take multiple arguments.
But actually I should step back a sec. Because in most languages we have to worry about passing the function any input at all.
But in jq you get input for free because a jq filter processes something.

[35:21] Dot is the input being processed. So every jq function has access to dot without you doing anything.
So you don't have to explicitly tell it to work on dot. That's a given.
So arguments are only needed...
This is why we use video. The only reason he knew to say go on is he saw me going, wait a minute, wait a minute.
So when you said every function has access to dot, is that even like after a pipe?
Would dot still be the original input, or would it be the input from that before that pipe?

[35:58] Right, so dot has always, dot exists within a specific filter.
So when you pipe one filter to another, each side of the filter has a different dot, right? So a few... So .prizes and then .surname, right? Right, OK. Exactly. OK.
Yeah, so the function just automatically is processing your thing, which kind of makes sense, right? You're taking some data, running it through a filter to get some more data.
So you don't actually have to explicitly say, have the data I want you to process.
It just gets that for free, right? Just by the nature of JQ's raison d'etre.
So a lot of functions don't need arguments. It's like if you want to not something. You just pipe it to the function, not.
And then it will flip it around. So it will say, are you already a boolean?
If you're not a boolean, I'll make you a boolean, and then I'll flip you around.
So if I pipe you null, I'll get true. If I pipe you false, I'll get true.
And everything else in the world, the opposite of true is false. I'll get false.
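To make the point about each side of a pipe having its own dot concrete, here is a small sketch. The input dictionary is made up for illustration and is not from the Nobel data set.

```shell
# On the left of the pipe, . is the whole input; on the right, . is whatever
# the left-hand filter produced (here, the value of the name key).
echo '{"name": {"first": "Marie", "last": "Curie"}}' | jq '.name | .last'
# "Curie"
```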

[36:56] Right? So a function that doesn't need any more information, you literally just give it its name, and that's it.
And it will work on the current input and do what it says.
So not, you can just pipe not. That's the entire filter. The entire filter is just not. The entire function.
Or is it a filter? Well, no. So you use functions within a filter, right? The filter is, like, on the terminal, everything is a command. Okay.
Okay. In a programming language, everything is a statement. In JQ, everything is a filter.
Okay, so this is JQ minus N, quote, single quote, true and true, we're going to pipe it to not, which is a function that is going to be our filter.
Exactly. Yes, so that function is our filter. True and true, pipe to not is false.

[37:44] Exactly, because it's now just been inverted, right? True and true is true, pipe to not, false.
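A short sketch of the not function as described, assuming jq is installed. The extra inputs beyond true and true are illustrative.

```shell
# not takes no arguments; it works on the current input (.).
jq -n 'true and true | not'   # false
jq -n 'null | not'            # true
jq -n 'false | not'           # true
jq -n '42 | not'              # false (42 counts as true, so its inverse is false)
```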
I got a lot of traction on Mastodon when I posted that my favorite thing about programming is when you have something like this that makes complete sense.
JQ minus N, true and true, pipe to not, false.
And I went, I understand, exactly.

[38:07] You got a reply from one of the authors, one of the people who's contributed to the JQ project on Git, which I thought was very nice.
That was really, really cool. Yeah, and luckily I put in my post, I put learning from Bart and tagged Bart on it, and I put a link to pbs.bartificer.net.
And I think he said something about, ooh, something on JQ I didn't know about was out there. So he made note of it. So that was really fun.
Which I guess from his point of view, hey, there's people learning my thing.
That's gotta be fun too. Yeah, yeah. Circle of happiness.
Exactly. So if you do need arguments, if you do need more information, like maybe you have a function to add a string onto another string, well then the other string is not going to be your input, so you're going to need an argument. So sometimes you do need arguments.
And so you can give a function arguments, and thankfully, JQ has inherited most of the syntax from every other language we've met, it uses roundy brackets to say here's my argument list.
Unfortunately, the people who wrote jq painted themselves into a wee bit of a corner.
Because normally the comma symbol represents this is the next argument.

[39:21] But the comma symbol is already in use. It's how you separate multiple filters from each other.
And a filter is a valid argument. A function in jq can take a filter as an argument.
Which is one of those snake-eating-its-own-tail things, but it's the reason jq is stupendously powerful.
You can pass a filter as an argument. It's like a callback in JavaScript, right? That's a function as an argument to a function.
Well in jq, it's a filter as an argument to a filter.
You've completely lost me, by the way, because you said that functions were filters, and then you said... And I'm the snake eating my own tail here. I didn't follow that.
Okay, so a filter is a thing you want JQ to do.
Calling a function is a thing JQ can do. So a call to a function can be your filter.
Your filter could also be .name. That's also a filter, right?
So you call a function within a filter. Like you call a JavaScript function within a JavaScript statement.
But they're not synonymous. A statement and a function are not the same. Sure.
Right? So you could say, one plus call a function. Right.

[40:37] And then the answer would actually be one more than you thought, because the filter is one plus the function call. So the function call doesn't have to be the whole filter.
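As a hedged illustration of a function call being only part of a filter, the sketch below uses length (covered later in the episode) on a made-up array.

```shell
# The filter is "1 + length"; the function call is only one piece of it.
jq -n '[1, 2, 3] | 1 + length'
# 4
```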
Okay. Let's keep going and go through your example. Yeah, let's keep going. Probably makes sense.
Well, also, some of this stuff won't make sense until next week, because I'm saying a lot. OK. But it is important to say that they couldn't use the comma.
The point I'm trying to get to is the comma is taken. They have given the comma a meaning.
So to separate our arguments, we use the semicolon.
That's very different to every other language we've ever met.
Semicolon means next argument. Not end of statement. Next argument.
OK. Very different to anything we've ever met. So that's why I'm making a real point of calling it out. You separate your arguments with the semicolon.

[41:24] So, we basically say name a function all by itself, or name a function, open round bracket, one argument, close round bracket, or name a function, open round bracket, first argument, semicolon, second argument, maybe semicolon, third argument, as long as we like, close the roundy bracket.
And as we've already talked through, our first example is the not function, which is just pipe something to not, and it will just invert it.
That's all there is to it.
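The three calling shapes can be sketched as follows. Note that range and limit are not discussed in the episode; they are assumed here only because they are built-in jq functions that take two arguments, which makes the semicolon visible.

```shell
# No arguments: just the function name.
jq -n 'true | not'                 # false
# Two arguments, separated by a semicolon rather than a comma.
jq -n -c '[range(2; 5)]'           # [2,3,4]
# The comma keeps its usual meaning inside an argument: the second argument
# here is the filter "1, 2, 3", and limit keeps its first two outputs.
jq -n -c '[limit(2; 1, 2, 3)]'     # [1,2]
```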
Now, not is already quite useful, but jq has two really powerful boolean-like functions, any and all.
And you will end up using these a lot.
That seems like an obvious thing. Like if I do, I want to make a smart folder in Apple photos, I'm going to use the little dropdown to change any to all.
Those are, that seems like a classic filter thing to do.
It really is. So the any and the all functions, both of them come actually in three flavors.
I'm going to teach you two of the flavors now and one of the flavors we'll come to later.
So the first flavor is the most simplistic flavor, no arguments.
And the input to any and all has to be an array.
So the whole point is that they work on many values. If you think about it, that's kind of their job. Aren't there many values in a dictionary?
But they can't do key value pairs?

[42:51] Okay, maybe I shouldn't be quite so categoric. For today, let us pretend that the only thing they can handle is arrays.
Yeah, they can. But they do... Yeah. For now, let's just keep it simple. Arrays. Okay.

[43:05] And so in its most basic form, without any arguments, if you give it an array, it's going to convert every value in the array to true or false based on the rule we learned above. If it isn't false or null, it's true.
And if all of them evaluate to true, then the all function returns true.
Otherwise, the all function is always false, right? If even one of them is not true, the all function will return false. The any function is its opposite number.
If any one of them is true, then the any function returns true.
So they do what they say on the tin.
And we can see that in action with some sample calls to echo, and then we pipe it the array false, false, false.
And then we pipe that to the two filters, any, all. So we're saying run this filter and also this filter, because that way I have half as much typing to do.

[43:53] So if we do that, false, false, false, we run it through any, we get false, we run it through all, we get false.
False, true, false. Any says, yeah, that's true. I got one true. I'm good.
But the all function is like, no, one of those, there's two of those are false. No, no, no.
If we send the true, true, true, then any and all will both be true.
That makes sense. You know, they behave like they say on the tin.
So that's the zero argument version. And that is already quite useful.
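A sketch of the zero-argument calls just described, assuming jq is installed. Each command runs both filters, so it prints two lines: the result of any, then the result of all.

```shell
echo '[false, false, false]' | jq 'any, all'
# false
# false
echo '[false, true, false]' | jq 'any, all'
# true
# false
echo '[true, true, true]' | jq 'any, all'
# true
# true
```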
But where we really hit some power, we have a one argument version.
And this is where I get to explain what I mean by you can pass a filter as an argument.
So the one argument version of all expects to be handed a filter.
It will apply the filter to each element in the array, and it will do its final decision based on the result of the filter.

[44:46] So you're applying the filter once for everything in the array, and then the output of that filter is what you then any or all.
So it's an extra level of indirection. So let's look at an example, because this is way harder to say than to just do an example.
So our example is going to use all with the argument dot greater than or equal to zero. So the argument is the filter, dot, greater than or equal to zero.
The input we are going to send is the array 42, 3.1415, 11.
So what will happen is, 42 will be compared to zero to produce a boolean.
Is 42 greater than or equal to zero? True.
3.1415 greater than or equal to zero? True. 11 greater than or equal to zero? True.
So we have true, true, true. All sees true, true, true, and returns true.
Okay, let me talk through the syntax here for the people listening.
It says echo, and then single quotes around our array, 42, 3.1415, 11.
Then he pipes it to jq, and then again in single quotes, all roundy brackets, that tells us these are arguments, dot, which is the input that we just got, which is that array, greater than or equal to zero, close roundy brackets.

[46:11] Now, in this case, because the documentation of all tells you this, you can't tell by looking, you can only tell if you read the manual, dot is not going to be the array.
Dot is going to be the first element, then the second element, then the third element, because what the documentation says is that all will iterate over its input.
Okay, I guess that makes sense. I was kind of looking for you to say the word explode, because we didn't explicitly explode the array, but it's going to do that by iterating.
Yeah, exactly. So the documentation for all says that it will iterate.
So we don't explode it, all explodes it for us.
And then gets a list of booleans, and then does the all thing.
So it's one level of indirection.
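The one-argument example read out above can be sketched like this. The second and third commands are added for contrast and use a made-up array with a negative number.

```shell
# Inside all(...), . is each element in turn, not the whole array.
echo '[42, 3.1415, 11]' | jq 'all(. >= 0)'   # true
echo '[42, -1, 11]' | jq 'all(. >= 0)'       # false
echo '[42, -1, 11]' | jq 'any(. < 0)'        # true
```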

[46:56] Right, right. We told it what we want to do, and what we wanted to do it to.
And if they all turn out to be true, then we're true. Otherwise, we're false.
That's very powerful for validating a bunch of stuff. I need all of these to be strings, or I'm not happy.
Just, you know, we haven't met it yet, but there's a function, isString, so you just give it as the argument, all isString, and it will just tell you true or false, they're all strings or they're not. Okay.
So before you even take another step, okay, you got a problem here. We're done.
Yeah, yeah, we're done. Exactly. So very powerful.
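One way to sketch the "are they all strings" check is shown below. It assumes a comparison against jq's built-in type function rather than the function name mentioned in the conversation, which has not been introduced yet at this point in the series.

```shell
# type returns the name of the input's type as a string, e.g. "string".
echo '["pancakes", "waffles"]' | jq 'all(type == "string")'   # true
echo '["pancakes", 42]' | jq 'all(type == "string")'          # false
```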
Another nice simple function that's darn useful is length.
Length doesn't need any arguments, it will just count. And it's actually quite clever. So, if you give it a string, it will count the characters.
And it does it like a human, which is very pleasing, because a lot of programming languages do it like a computer.
And so, if you pass it the string passé, that is actually six characters, because it's P-A-S-S-E, and then the accent.
Because actually, under the hood, the accent is a separate character that gets rendered on top of the E.
But JQ is, no, no, that's five.

[48:04] You can give it an emoji for a stack of pancakes, which is what that emoji is, and it will tell you that's a length of one.
You know, when I read this, I immediately stopped and said, I got to tell Bart that's not right.
I know that the more complex ones are more than one digit, which is more than one length.
And that's one of the weirdest things. That was a lot of the earliest emojis are only one, right? Like a smiley face is only one digit, one length of one.
Right, but if you do a smiley face with a skin colour that's not yellow, it'll be more than one, because the second code point is the skin colour.
That's because we didn't used to know that people had different coloured skin.

[48:44] Or the middle-aged white men who wrote this stuff didn't know that.
Yeah. Anyway, let's not go there.
If you give it an array, it will count the elements in the array.
If you give it a dictionary, it will count the pairs. So, if you give it a dictionary with two key-value pairs, the length is 2.
Not 4, 2. Which is more sensible, frankly.
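A brief sketch of length on the kinds of input mentioned, using plain ASCII sample values chosen for illustration.

```shell
jq -n '"waffles" | length'                                          # 7 (characters)
jq -n '[1, 2, 3] | length'                                          # 3 (elements)
jq -n '{"breakfast": "pancakes", "dessert": "waffles"} | length'    # 2 (key-value pairs)
jq -n 'null | length'                                               # 0
```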

[49:09] So, all of this, everything I've been doing today, has been leading up to the one thing I wanted to tell you about, which is the function for searching for things.
And we in Programming by Stealth, although we've been going for 156.75 episodes, because we're three quarters of the way at least here, more, we have not done databases.
But an awful, awful, awful, awful, awful lot of programmers have done databases.
And you know that the keyword in the standard querying language is SELECT.
That's how you query a database. Select something from something where something.
Well, the function in jq is called SELECT, which is very pleasing.
I liked when I saw the word SELECT.
Now SELECT is a function.
And it needs an argument, which is basically the piece of logic you would like me to apply.
So SELECT takes as an argument a filter. And it does the simplest thing in the world.
If the filter returns true, select returns the value unchanged.
If the filter returns false, select silently swallows the information.

[50:16] So the effect of piping through select is that anything that matches comes out the other side and anything that doesn't match disappears.
So if you start with your dot prizes array, and you pipe that into select, wherever the condition matches comes out the other side and continues on to your next filter, where you might say dot name or something.
But anything that didn't meet the criteria is annihilated, destroyed, evaporated, wherever you want to think about it, it disappears.
So that would get interesting in the database for the Nobel prizes: for the years during the world war, you would be running into some nulls, and therefore those might not be there; those are false.

[51:02] Right, which means they'll be silently swallowed, so that's okay.
That's what I mean. I mean, it can be used as a way to eliminate those. Yes.
It absolutely could, yes. But that is all it does. That's spectacularly simple.
If my filter returns true, then I will pass the value unchanged.
If my filter returns anything else, i.e. false, then my filter will just silently absorb the value.
And so it is the most filter-like filter. When you describe a filter, you take some input and you make less of it appear on the other side.
Select is the ultimate filter.
Right. I'm a little bit confused by you saying it passes the value.
So in your example, is it okay if I read that now?

[51:45] Absolutely. So it says jq, single quote, dot prizes, square bracket.
So we're going to explode dot prizes.
He's piping it to select, open round bracket, dot year, equals equals, quote 2000, unquote, because they use strings, close single quote.
So to me that says I'm going to explode prizes and I'm going to select only the years that are equal to 2000, but it doesn't tell me what it's going to spit out.
Is it going to just spit out 2000, 2000, 2000, 2000?
No. What is that value? Okay, good. Thank you. Okay, so the value is dot. So all of dot.
So the entire input comes out if the condition is met.
So we're saying the condition is whether or not the year is 2000, but the thing that came in was the whole prize. Okay.
So the thing that comes out is either the whole prize or nothing at all.

[52:41] No, it should send out the prize where the year was 2000.
Okay, so or nothing at all. So all the prizes come in, and so for each individual prize, we do a check, is the year 2000?
If it is not, that prize evaporates.
So it comes in, and then it evaporates.
When the year is 2000, it comes out.
So it's another JSON file structured exactly like dot prizes, except all that's inside dot prizes will be the ones from year 2000?
Exactly, only they're coming out as lots of separate ones. So they went in as one array, we exploded the array into separate values, so what will come out will be a dictionary, another dictionary, another dictionary.
Okay, okay, so it won't be dot prizes when it gets on the other side because we exploded it.

[53:31] Exactly, yeah, exactly. So there's no going backwards in time, right?
If you do something and then you pipe it somewhere else, somewhere else, there's no idea where you came from. I'm getting a little hung up on what you mean by the value gets returned. Like, what is the value of .prizes square bracket?
Okay, so .prizes square bracket means explode .prizes into one different...
So everything in that array becomes a single thing, another single thing, another single thing.
So the SELECT statement happens once for the first dictionary inside .prizes, once for the second dictionary inside .prizes, because that's the act of exploding.
Right. So the second filter happens once for everything in that prizes, because we exploded it.
Right, so it's looking for all the ones that match this select .year == 2000, and those are the only ones it's going to pass through.
So it's going to be a list of all these dictionary items that are just from the year 2000.
And you're calling that the value. That's just... That word sounds funny.

[54:36] Well, that is the right word because that's the whole point, but it's a specific piece of data, right?
So dot is the thing I'm working on now, right?
So select will pass through the entire, you give it a thing, it will either give you back exactly that thing or nothing at all.
It doesn't alter the thing. It's either yes or no.
But it's not really giving you the thing that came in. It's giving you a subset of the same thing.
We have exploded it. And what I sent in was all prizes.
So it's a giant dictionary. No, no, but you sent one by one.
So that's a dictionary. Oh, OK.
OK, so for each one that comes in, it comes out the other side if it matches 2000. If it doesn't have 2000, it gets swallowed.

[55:20] Exactly. Because remember, they're happening in parallel, right?
So we have exploded it, so the middle one happens in parallel.
Yeah. OK. Yeah, exactly. So it's very simple. Penny drop. We either pass it or we pass nothing. Right.
But it's a very important penny, right? And so the argument is just the condition, and if the condition is met, the exact thing passes through unchanged.
Otherwise, we'll lose it.
Nothing comes out. So when you run it, you will see the prizes for 2000, which is kind of cool.
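The select example can be sketched against a tiny hand-made sample instead of the full data set. The sample below is an assumption: it only mimics the shape described in the episode (a prizes array whose entries have year, category and laureates keys), and the real file has more fields and many more prizes.

```shell
# Hypothetical three-prize sample shaped like the Nobel data set.
prizes_json='{"prizes": [
  {"year": "2000", "category": "medicine", "laureates": [
    {"surname": "Carlsson"}, {"surname": "Greengard"}, {"surname": "Kandel"}]},
  {"year": "2000", "category": "peace", "laureates": [{"surname": "Kim"}]},
  {"year": "1911", "category": "chemistry", "laureates": [{"surname": "Curie"}]}
]}'

# Each exploded prize either passes through select unchanged or vanishes.
# The trailing .category only keeps the output short.
printf '%s\n' "$prizes_json" | jq -r '.prizes[] | select(.year == "2000") | .category'
# medicine
# peace
```

Dropping the final `| .category` prints the whole matching prize dictionaries, one after another, rather than a single array.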

[55:48] Now we can break things down. Right, so let's say we want just the prizes for medicine.
Well, we just chain them together, right? The whole point of these, remember I said the filters, you just chain them together. Don't try to do two things at once.
So you could just chain them together. So you can say .prizes[] | select(.year == "2000") | select(.category == "medicine"). And that will tell you the 2000 prize for medicine.
But of course we can use the and operator. So it is equally valid to say .prizes[] | select(.year == "2000" and .category == "medicine").
And that's all in one... Exactly. A giant roundy bracket.
Because that whole big thing is now the argument getting passed to select.
Right? Dot year equals and dot category equals is now the thing being passed to select. Exactly.
Now, what if we just want to see the laureates and not all of that detail?
Right? Because the select is passing the whole prize.
Well, just stick another pipe on the end. Dot laureates.

[56:52] Explode those out. Now, you're just going to get the name of the winners, or you're just going to get the objects representing the winners, in the prize for medicine for the year 2000.
So you can use your select in the middle of a giant big string of pipes.
It's just, it's a filter on that point. And then you can continue to chain and chain and chain.
Because that's, that's what we're doing here is we're just chaining things together, chaining things together, chaining things together.
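Both ways of combining the conditions, plus the extra pipe to the laureates, can be sketched as follows. The sample data is hand-made to mimic the shape described in the episode and is an assumption, not the real file.

```shell
# Hypothetical sample shaped like the Nobel data set.
prizes_json='{"prizes": [
  {"year": "2000", "category": "medicine", "laureates": [
    {"surname": "Carlsson"}, {"surname": "Greengard"}, {"surname": "Kandel"}]},
  {"year": "2000", "category": "peace", "laureates": [{"surname": "Kim"}]},
  {"year": "1911", "category": "chemistry", "laureates": [{"surname": "Curie"}]}
]}'

# Two selects chained with pipes.
printf '%s\n' "$prizes_json" | jq -r \
  '.prizes[] | select(.year == "2000") | select(.category == "medicine") | .laureates[] | .surname'

# One select whose single argument combines both conditions with and.
printf '%s\n' "$prizes_json" | jq -r \
  '.prizes[] | select(.year == "2000" and .category == "medicine") | .laureates[] | .surname'

# Both commands print:
# Carlsson
# Greengard
# Kandel
```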
Now the last place I want to go today is the really fun part.
So I told you that there was a third version of any and all, which takes three arguments.
Wait, no, two arguments. Works in three flavors. Yes, two arguments. Sorry.

[57:33] So let us say that we now want to answer the actual question I asked at the start of the show.
I would like the Nobel Prizes won by someone with the surname Curie.
So, based on what I've just said, a naïve first attempt would be, we say .prizes, square bracket, square bracket, to explode the prizes.

[57:56] We pipe that to .laureates, to say that for each prize, just give me the laureates.
And then we pipe that to... Correct, yes, sorry. Very good correction.
So we now have a list of names.
Sorry, a list of dictionaries that contain surname, first name, and motivation.
And then we run those through a SELECT .surname==curie. And that is absolutely going to find just things with the surname Curie.
But when you run it, well actually the first thing that happens is we get an error. Cannot iterate over null (null).
Ah. Poop. Okay well I realise now that I wrote these show notes slightly out of order and I've already given you the answer to this question.
The reason is because we have that optional, because sometimes there are no winners.
But now I'm going to show you how to find the years with no winners.
Because last time we worked around the problem by just throwing in the question mark, but now I'm curious. Well, when was there no Nobel Prize and why?
So to do that we can use SELECT.

[59:05] And this time our criteria is we want to have the prizes where the length of the laureates is zero.
In other words, where there are no laureates. Because in the years where there are prizes, laureates is an array with a length greater than zero.
When there are no laureates, that is null, and the length of null is zero.

[59:25] Because there's nothing, right? Okay, okay. Yeah.
So we say .prizes, explode it, pipe that to select, and then inside select we are actually using our brackets to say .laureates pipe length, double equals zero.
So we're getting the length of the laureates and then checking for zero.

[59:50] Let me read this again. I'm running into brain freeze on the parentheses.
Okay, jq, open single quote, dot prizes, square bracket, so we're exploding prizes, piping that to a select.
Within the select, we've got a filter dot laureates pipe to length, close roundy bracket, double equals zero. So that's going to give us a number.
So you've taken laureates and cut out the length. In other words, we want to compare the length.
Why has laureates not got square brackets on it?
Well because we want the length of the array. If we explode it, the length will be how many keys are there in each Laureate.
We don't want to explode them, right? We want the length of the unexploded thing.
All right, so we're going to iterate over Laureates, which is just one thing, find out its length.
And then if that is equal to equal zero, it'll pass something out.
What will it pass out? It'll come out, right? So what will come out?
Okay, so .prizes is what gets exploded to the select, so the answer is it will show us the prize.
So what we will then see is the prizes that didn't have winners, and what you will see when you run that command is that there's a new key in those dictionaries.
Overall motivation. And it will tell you why there were no Nobel laureates in those years.
Huh.
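The "which prizes have no laureates" query can be sketched on a two-prize sample. The sample is hand-made; in particular the overallMotivation key spelling and its placeholder text are assumptions based on how the key is described in the conversation.

```shell
# Hypothetical sample: one awarded prize, one prize with no laureates key.
prizes_json='{"prizes": [
  {"year": "2000", "category": "peace", "laureates": [{"surname": "Kim"}]},
  {"year": "1914", "category": "peace",
   "overallMotivation": "(reason text omitted in this sample)"}
]}'

# A missing laureates key evaluates to null, and the length of null is 0.
printf '%s\n' "$prizes_json" | jq -r '.prizes[] | select((.laureates | length) == 0) | .year'
# 1914

# The question mark stops .laureates[] from failing on the prize with no laureates.
printf '%s\n' "$prizes_json" | jq -r '.prizes[] | .laureates[]? | .surname'
# Kim
```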

[1:01:16] It's not just during World War II, 1924, 1923, 1921, 1919, 1918, 1917, 1916, yeah, 1914.
Yeah, 1914 is the start of World War I, so '14 to '18 covers World War I, and then they didn't quite get going again straight away, probably because they had the Spanish flu and lots of stuff.
I don't know what happened in 23, my history isn't good enough.
Well, it's still gone in 25, still gone in 20, then it's there for 26 and 27, but gone again in 28, gone again in 31.
So I'm not sure what was going on in the world then, but 28 is the Wall Street crash, that seems like a strange reason to cancel the Nobel prizes. No, no, yeah, right.
31, 32, 33.

[1:02:01] Yeah, there's a lot more reasons I think that it was turned off.
But they all say the same thing. Yeah.
Yeah. Anyways, so there you go. So I didn't know why my queries were failing, and so I decided to use JQ to answer the question.
Why is this data structure not the shape I think it is? So let me query the data structure to tell me about itself. Okay. So now we know.
Okay, great. All of that was just so that we'd learn to put the question mark at the end, because that's the answer, is to just make it not give an error.
But anyway, So I thought it was worth explaining why sometimes that happens, and it's a good example of how to use length, which is also why I stuck it in the show notes.

[1:02:39] So when we run our Naive query, we get output that tells us each laureate named Curie.
So we see Marie Curie, we see Pierre Curie, and we see Marie Curie, and we can see that in recognition of her services to the advancement of chemistry, in recognition of the extraordinary services they have rendered by their joint researches.

[1:03:04] But they're just the Laureates. Why are they just the Laureates?
Well if we look at what is the current thing being processed as we go through our chain, we start off and we explode the prizes.
And at that point in time, we have the piece of information we really wanted.
But what did we do to it? We exploded just its Laureates key.
So the only thing that exists now going into that last select is the Laureates.
And we found the right Laureates, but we've lost what came before because we exploded it all away.
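The naive query and what it loses can be sketched on a small sample. The sample is hand-made, and the firstname key is an assumption about how the data set names that field.

```shell
# Hypothetical sample holding the two prizes discussed in the episode.
prizes_json='{"prizes": [
  {"year": "1911", "category": "chemistry", "laureates": [
    {"firstname": "Marie", "surname": "Curie"}]},
  {"year": "1903", "category": "physics", "laureates": [
    {"firstname": "Henri", "surname": "Becquerel"},
    {"firstname": "Pierre", "surname": "Curie"},
    {"firstname": "Marie", "surname": "Curie"}]}
]}'

# By the time select runs, . is a single laureate, so the year and category
# of the prize are no longer available.
printf '%s\n' "$prizes_json" | jq -c '.prizes[] | .laureates[]? | select(.surname == "Curie")'
# {"firstname":"Marie","surname":"Curie"}
# {"firstname":"Pierre","surname":"Curie"}
# {"firstname":"Marie","surname":"Curie"}
```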
Okay, right, right. So we have the right condition, but we haven't retained the right piece of information.
How do we square this circle? It's an extremely common problem.
You have a piece of data that contains an array of lots of pieces of data, and you're interested in a condition inside the array.
This is where any and all are your friends. Because what you really want is a summing up of the values inside that array.
I would like the prize where any laureate is a Curie.

[1:04:10] But still be the prize, not just the laureate. But be the prize.
Exactly. So the any function is going to be the key here because the any function we can dive in, it will do its work on all of those sub values and come out with one answer. Oh, so any prizes that also have this filter of curie. Okay.
Yeah. Now, so in order to do that, we need that third variant of any, and there is a matching one for all because those two functions are symmetrical.
Now, I told you that if you give it no arguments, it just treats the input as Booleans.
And if you give it one argument, it applies that argument to the inputs to make booleans, and then it gives us our answer.
The third variant adds one more layer of indirection.
You tell it how to make the array.
Wait, we're making an array?

[1:05:08] We're making a list of inputs. Let me use the word list of inputs because we're not strictly speaking making an array.
We are making some inputs, we are specifying what we want to do to those inputs, and then any or all will get applied to all of those answers.
So what we're going to do is explode the laureates, say we want the surname Curie, and any of those is fine.
That sounds exactly the same as what we did before.

[1:05:34] Right, but look at the difference. There's only one pipe.
So you haven't told people what it says yet? Yeah.
Okay, so let's start. So we say dot prizes, open square bracket, close square bracket. So we've exploded the prizes.
So the inputs to that select statement are the entire prize.
Right. That select statement happens once for every prize in the data set.
But it's the entire prize is the current thing.
And notice there is nothing after that select statement. Which means that what's going to fall out the back of that select statement is the prize.
Okay, so let me go ahead and read it for people. So it's jq. It starts, we explode the prizes.
We pipe it to the select immediately. So we're not going to explode the laureates first. We're going to select, open round brackets, any, open round brackets again, now explode the laureates with our question mark, so we get rid of the null ones, and then dot surname equals equals Curie.
So we've said select any laureates that have the surname Curie.
So that's the two arguments because it's separated by a semicolon.
Exactly. And so what's going to squirt out the other side is still going to be prizes, not laureates, still going to be prizes. That's what's going to squirt out the other side.
Exactly. So it actually answers our question.
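A sketch of the two-argument any inside select, run on a hand-made sample. The sample and its firstname and overallMotivation keys are assumptions about the shape of the real data set; the string interpolation at the end is only there to keep the output short.

```shell
# Hypothetical sample: two Curie prizes, one unrelated prize, one with no laureates.
prizes_json='{"prizes": [
  {"year": "2000", "category": "peace", "laureates": [
    {"firstname": "Kim", "surname": "Dae-jung"}]},
  {"year": "1914", "category": "peace",
   "overallMotivation": "(reason text omitted in this sample)"},
  {"year": "1911", "category": "chemistry", "laureates": [
    {"firstname": "Marie", "surname": "Curie"}]},
  {"year": "1903", "category": "physics", "laureates": [
    {"firstname": "Henri", "surname": "Becquerel"},
    {"firstname": "Pierre", "surname": "Curie"},
    {"firstname": "Marie", "surname": "Curie"}]}
]}'

# First argument of any: how to generate the values (.laureates[]?).
# Second argument: the test applied to each generated value.
# select still receives, and passes on, the whole prize.
printf '%s\n' "$prizes_json" | jq -r \
  '.prizes[] | select(any(.laureates[]?; .surname == "Curie")) | "\(.year) \(.category)"'
# 1911 chemistry
# 1903 physics
```

Three laureates match, but only two prizes come out, because select passes each prize through at most once.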

[1:06:53] Exactly, exactly, exactly. So this is why any with the two argument form is so powerful, because it's let us dive into that laureates array without having to explode it.
We didn't explode it. We just went in, did our query, and it's all still intact. Nothing was blown up.
Which, for the Nobel prizes, given their dynamite relationship, is kind of funny.
Well, we actually do explode it, but we explode it just within looking at any.
We explode it, filter it, but then we take the real prizes.

[1:07:23] Exactly. Exactly. Because it's the full input to select is what select spits out. All or nothing.
Very powerful. It's a subtlety, but it's darn important because this way, what we learn is that there were two prizes with Curie winners.
There were three laureates with the surname Curie, but two prizes.
The first one in the list, because it's in reverse chronological order, is from 1911, where Marie Curie won by herself.
Not only the first woman to win a Nobel Prize, but she won the whole thing.
She was the only winner, and it was a prize in chemistry.
But back in time, in 1903, she actually…sorry, she was the first woman in 1903, but she only got a quarter of the prize in 1903, because it was shared between herself and her husband, and Henri Becquerel.
And Becquerel got half, and the Curies split the other half.

[1:08:22] So, and they got it in physics. So not only did she win two prizes, one of them entirely by herself, but she got one in chemistry and one in physics.
You know, I know it's annoying that we know so little about so many female scientists, but there's a reason we know Madame Curie.
There's definitely a reason why we know her.
I mean, it is a running joke, name a female scientist and everyone on planet Earth says Marie Curie, but she didn't get there by default.
She darn well earned that position. Physics and chemistry. And she only lived to, how old was it? In her 40s or something?
Well, yeah, because she was working with radioactivity, right?
Like her tomb is still a darn dangerous place. Her notebook is behind God knows how much lead.
And sometime, you know, in a few thousand years, we might have a look at her notes. And they might become as valuable as da Vinci's notebooks, but for now, they're a health hazard.
You couldn't even photograph them because the radiation would ruin your exposure. Jesus.

[1:09:17] That's kind of crazy. It is kind of crazy. Right.
I'm going to finish this out today with something we have not done in our entire JQ series because we haven't quite had enough meat.
But now I can give you a challenge. We have a data set full of Nobel laureates.
I have some questions and I would like you to answer them with your JQ skills, not by brute force.
Because you could just Google this, but don't just Google this.
Use your JQ skills. So I have three questions.
What prize did friend of the show Dr. Andrea Ghez win? I would like to know the year, the category, and the motivation.
How many laureates were there for each prize? I would like you to list the year. Win.

[1:10:01] All of them. Okay. Not in the context of the first question. No, no.
So for every Nobel Prize ever, so year, category, how many? Year, category, how many?
1901, chemistry, I don't know if it's two. Let's pretend it's two.
1902, physics, three, whatever. I don't know the answer. That's why I'm asking the question. But year, category, how many people?
And then which prizes were won outright? In other words, which prizes have exactly one winner?
I would like the year, the category, first name, last name and why?
What were they given the prize for? Justification, do you mean motivation?
I do mean motivation, because I typed it wrong all the way through the show notes and corrected all of them except for one. Okay, I will fix that now.
Yes. So there are three perfectly good English questions, and the great thing about JQ is that we can answer those kind of questions out of structured data. It is a query language.
I love this part. I love this so much.

[1:11:05] Excellent. I can give people a slight look under the sheets here.
I have just planned, I have storyboarded the rest of this series.
So, as much fun as this is, and as powerful, if we stopped now, this would already be very powerful for dealing with web service APIs, right?
The number of APIs that return JSON is already huge.
So this lets us get a JSON for the current weather and figure out information about it, get a JSON for someone's IP address and get information about it.
This is already spectacularly powerful.
But we get to go further. So I have shown you about a twentieth of the functions that exist in JQ.

[1:11:44] So we have seen that we can do things like double equals. Great.
What about more complex searches like regular expressions? Oh yeah, JQ can do those. So we need to go there next time.
And there's lots of other very useful functions. So next time I'm going to ease off on the new concepts, but I'm going to fill in lots and lots of those functions.
So you now know what a function is and what they do, but there's loads of them built in that I'm going to tell you about.
So most of next time is going to be learning about all the other functions, and they're cool and there's many of them.
And so that's going to be the next episode. Then the episode after that, we're going to move on from pulling information out to transforming the information.
You could argue getting the length is already a transformation, but you can do a lot more than just get the length, right?
It's a full programming language, so we can do math, we can manipulate things Excel-like. So instead of taking one cell and making another cell, we can take one piece of JSON, calculate a whole bunch of things, and transform it into a different piece of JSON. Can we give me a Nobel Prize?
Can we change the deal that way? We could, absolutely. Yes, that's quite an easy one.
Find all the Nobel Prizes, make the surname be Sheridan, make the first name be Allison, and spit them out the other side. That's a filter, perfectly valid filter.
We can do that. I want that to be one of the challenges at some point.
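
To make the idea of transforming rather than extracting a little more concrete, here is a minimal Python sketch (not jq) of the joke filter described above: take a prize in, rename its laureates, and emit a new prize. The helper name `rename_laureates` and the `firstname`, `year` and `category` key names are assumptions for illustration.

```python
import copy

# One toy prize; key names other than "laureates" and "surname" are assumed.
prize = {
    "year": "1921",
    "category": "physics",
    "laureates": [{"firstname": "Albert", "surname": "Einstein"}],
}


def rename_laureates(prize, firstname, surname):
    """Return a new prize with every laureate renamed; the input is untouched."""
    renamed = copy.deepcopy(prize)
    for laureate in renamed.get("laureates", []):
        laureate["firstname"] = firstname
        laureate["surname"] = surname
    return renamed


joke_prize = rename_laureates(prize, "Allison", "Sheridan")

print(joke_prize["laureates"][0]["surname"])  # prints: Sheridan
print(prize["laureates"][0]["surname"])       # prints: Einstein
```

The sketch returns a copy rather than editing in place, which matches the way a filter takes one piece of JSON in and produces a different piece of JSON out.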

[1:13:04] You can have one of them. Oh, good. One random one.
Not in like a real science though. Oh, I want physics, given my background.
So that's going to be the second future episode.
And then our third and final future episode, I'm going to really pull back the covers, because not only can we manipulate data, JQ is a full programming language.
It has variables, conditionals and loops. Oh, really?
I sort of feel like it's already iterating. I would have thought we didn't need loops.
We don't need them for some things. We don't need them to iterate over the things we have been given.
But maybe we want to loop over something that we didn't get from the data.
Maybe we need to do a loop based on something else.
Explode something by 10 times or something.
Well, 10 copies of that array that we got from over here. I don't know why I'm making this up as I go along here, but there are times you want to do more than just iterate over what you've been given, you may want to loop over something of your making.
You may want variables. You may want to capture things into variables and start accumulating a variable based on the data and then spitting out some sort of fancy variable.
It might let you do fun things like give me all prizes with above average number of winners. Oh yeah.
That's actually quite complicated because you've got to go through all the prizes to figure out the average and then go back and find all of the ones where you're greater than the average.
Right, right. That involves a variable.

[1:14:33] Get the average, remember the average, and now go do it again.
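
The two-pass idea just described can be sketched in Python (not jq, whose variables are only promised for a later episode): one pass to compute the average, a variable to remember it, and a second pass to compare against it. The toy data and the `year` and `category` key names are assumptions for illustration, and the toy list deliberately contains only prizes that were awarded.

```python
# Toy list of prizes; "year" and "category" key names are assumptions.
prizes = [
    {"year": "1921", "category": "physics",
     "laureates": [{"surname": "Einstein"}]},
    {"year": "1911", "category": "chemistry",
     "laureates": [{"surname": "Curie"}]},
    {"year": "1903", "category": "physics",
     "laureates": [{"surname": "Becquerel"},
                   {"surname": "Curie"},
                   {"surname": "Curie"}]},
]

# First pass: count the winners of every prize and remember the average.
counts = [len(p.get("laureates", [])) for p in prizes]
average = sum(counts) / len(counts)

# Second pass: keep only the prizes with more winners than that average.
above_average = [p for p in prizes if len(p.get("laureates", [])) > average]

for p in above_average:
    print(p["year"], p["category"], len(p["laureates"]))  # prints: 1903 physics 3
```

Whether prizes that were never awarded should count as zero winners when computing the average is a design choice this sketch sidesteps.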

[1:14:37] So anyway, there's lots of stuff to learn here. So that is going to run us up until...
Well, it depends actually. I have some time off coming up, so maybe we'll go more than once every two weeks. No.

[1:14:50] Yeah, we'll see. Sorry, I just discovered my coffee cup is squeaky.
Apologies, folks. I'm also out of caffeine. This is bad. But anyway, I didn't mean to squeak at you.
So anyway, and there are three episodes left.
And that is what we will be learning in those three episodes.
And I am... It's very rare that I'm this organised, and the reason I'm this organised is because I'm this interested.
I am having as much fun as Allison is in this series, which is great, because when I started it, I knew just enough JQ to be really frustrated, and now I'm just in love. It's such a powerful language.
You can tell Bart was excited about this, because I think you sent me the notes on Tuesday or Wednesday, and I don't think that's ever happened. It was Tuesday.
I think it's usually last key typed right before you take off on your bike, and then when you come back, we record.

[1:15:37] That is the usual thing on a Saturday. Frantically type, hit commit, go cycle, go eat, record with Allison.
No, I was done on Tuesday. Which means I've been working ahead.
Anyway, I'm going to call it there because my dessert is in the oven at this point in time and we have had what I hope was a fun and interesting show.
I had a blast. Good. Well, until next time, happy computing.
If you learn as much from Bart each week as I do, I'd like you to go over to lets-talk.ie and press one of the buttons over there to help support him.
He does 98% of the work here, I'm just the stooge that listens to him and asks the dumb questions.
If you go over to lets-talk.ie, you can support him on Patreon, you can donate via PayPal, or you can use one of his referral links.
I really hope you'll go over and help him out. In the meantime, you can contact me at Podfeet, or check out all of the shows we do over there over at podfeet.com.

[1:16:31] Music.
