Transcript
[0:00] Music.
[0:08] Well, it's that time of the week again. It's time for Chitchat Across the Pond. This is episode number 775 for September 2nd, 2023. And I'm your host, Allison Sheridan. This week, our guest is Bart Busschots, back with Programming By Stealth, installment 154 of X. We're going to finish something off here, right, Bart? We are. We are ending our journey into Bash by mostly relearning the same thing, but this time, the way the docs do it. We're going to learn that there's actually official words for things we've just sort of been doing. And I am hoping that in my mind, I've given you all of the pieces and haven't told you it's a jigsaw. And today we're going to put them all together and I have one or two pieces I forgot to give you as we were going along. So I think almost everything between Taming the Terminal and here is not new, but we are going to put it together. And although I'm going to start immediately with something that I think is.
[1:08] Mostly new. But anyway, the first thing of course is we have a challenge solution.
So in the previous installment, I gave you the very simplistic challenge of looking at our previous solution to printing out the n times tables. And I asked you to check your code for duplication, because that's obviously a bad software engineering smell. And now that we know how to do functions in bash, replace your duplication with a function.
And I scoured my code and I found one fairly small piece of duplication, I was continuously checking for whether or not something is an integer, because there was a lot of integers needed in that challenge.
[1:47] And I was doing it by echoing whatever it is, piping it to egrep minus q and then running it over the same regular expression over and over again.
And it's not the world's longest code, but it also is distinctly not legible.
If echo $OPTARG pipe egrep minus q $int or e, it's not... oh yeah, is it an int?
So actually, it does kinda make sense to replace it with a function.
So I wrote a very simple function called isint, which just takes $1 and shoves it through the same grep.
But it meant that I can replace my if statements with just if isint $OPTARG, which is shorter, and I think more importantly, when I look at that in six months' time, I go, oh, that's what I was doing. And there's value in that.
Right, right.
So yeah, as I say, not the world's most exciting function, but there we go. An example and full details in the show notes, pbs153-challengeSolution.sh.
Perfect.
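A minimal sketch of what such a helper might look like. The function name isint comes from the discussion above, but the regular expression and the sample values are assumptions for illustration, not the code from the show notes.

```bash
#!/usr/bin/env bash
# Hypothetical reconstruction of the isint helper described above.
# The regular expression is an assumption: an optional minus sign
# followed by one or more digits.
isint() {
    # grep -E is the current spelling of egrep; -q prints nothing,
    # so only the exit code is used by the calling if statement.
    echo "$1" | grep -Eq '^-?[0-9]+$'
}

# Made-up sample values to show the function in an if statement.
for candidate in 42 -7 4.2 waffles; do
    if isint "$candidate"; then
        echo "$candidate is an integer"
    else
        echo "$candidate is not an integer"
    fi
done
```

The if statement reads the function's exit code directly, which is what makes if isint "$value" read almost like English.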
[2:46] So my guiding light for putting this show together was your, not just your, everyone's confusion with brackets in Bash, because there are only so many brackets on the keyboard.
And Bash wants to do more things than there are brackets on the keyboard. And therefore it tends to use them in many different ways. And they have very different meanings depending on how they are used. And so I started off with a table, and then I filled all the rest of the show notes in around that table.
Okay.
And one of the things that we're going to start with is a concept that I think we have probably seen but not particularly focused on, which is the concept of command grouping.
So there are times when you might want to run multiple bash commands as if they were but one.
[3:38] And you can actually do it in two different ways, depending on how you feel about variable scope.
So, the two most common use cases for wanting to group commands is because you need to take all the output from five or six commands and pipe all of it to the one file, when you could put a pipe on every line.
That's a pain in the backside. What you can do instead is group them, and put the pipe at the end of the group, and then the pipe will affect...
Or the redirect, like arrow to file, will affect the entire group.
So you don't have to do it over and over and over again. Do they execute in order?
Yeah. Okay. So basically, inside the group, standard out is simply the other thing.
[4:25] Okay, so standard out is the one from before? I mean, standard in, standard out.
The first one's standard. All of the streams for everything in the group are the same.
Are the same, so if you do an arrow of any kind to the group, you're affecting them all.
Okay. But my question is, if you've got them grouped, does it know that you take the output from the first standard out from the first one, become standard into the second one, become standard out to the next one?
No. So it's...
Are they sequential or in parallel?
They are sequential, and all of their output is handled as if it all came from one command.
Still missing my point, my point is, is A go to B, go to C, go to D, and then it arrows into...
Is that where the pipes are, or are they... Does it apply a pipe between them?
No, it doesn't apply a pipe between them. So for all of the commands in the group, standard out is whatever you do at the end of the group.
[5:35] Definitely not answering my question. Not even close yet. Is it function A gets piped, and function B gets piped, and function C gets piped, they all go piped to the same place?
Then they must be sequential. One feeding into the next. Absolutely. Sorry, I definitely said sequential. I thought sequential. Did I not say sequential?
But then you said the opposite afterwards. I didn't follow whether you answered the question.
Maybe we should go through the example, and I'll know what you mean.
Okay, so your two use cases are one of them is for just piping it all together, right?
Doing the plumbing as one instead of lots of separate pieces of plumbing.
And the other reason is because you might want to temporarily have a variable, mess with it, and not affect the rest of your code. Maybe $IFS.
Maybe you want to have a different $IFS for just these few statements.
And so you can do that by creating a sub-shell for your group of commands.
You muck around with $IFS inside your sub-shell and you read from files with a weird delimiter and do whatever you want and there's no spooky action at a distance, because the moment the group finishes, your funny $IFS evaporates.
[6:48] So you basically have a little scope. I'm gonna have you repeat that, it'll probably be fine on your recording, but you said as soon as it finishes and then you froze for a second.
Say that again. So, basically...
Yeah, so you're running it in a subshell, so you get a copy of all the variables, but you're not...
It's a copy of the variables. So any changes you make only exist within the group.
And so when the group stops running, all of your changes evaporate into nothingness.
So you will get a copy of whatever $IFS was before your group started, and then you can change $IFS to the tilde character to be a weird delimiter.
And then once the group finishes, your weird $IFS evaporates, because it only existed in the little subshell. Okay.
[7:40] So it's really convenient to quickly use one of the standard variables without spooky action at a distance. Which is, you know, I'm terrified of spooky action at a distance, because it makes bugs that are oh so evil to try to debug.
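A small sketch of the $IFS use case, assuming a made-up tilde-delimited record. The variable names record, first, second, third and original_ifs are hypothetical and only there for illustration.

```bash
#!/usr/bin/env bash
# Made-up sample data using the tilde as a delimiter.
record='pancakes~waffles~crepes'

# Remember the starting value so we can compare afterwards.
original_ifs="$IFS"

(
    # Inside the round brackets we are in a subshell with a copy of $IFS.
    IFS='~'
    read -r first second third <<< "$record"
    echo "Second field inside the group: $second"
)

# Back outside, the change to $IFS is gone, and so are the
# variables that were created inside the subshell.
if [ "$IFS" = "$original_ifs" ]; then
    echo "IFS outside the group is unchanged"
fi
```

The read only sees the tilde as a separator inside the round brackets, and nothing after the group has to undo the change.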
So basically, if you want to share the variables, you use curly brackets.
And if you want your own subshell where you have your own copy of the variables, you use round brackets.
So curly brackets means I'm staying in the shell, I'm just grouping them together, but we still share our variables, no funny stuff.
And round brackets means give me a full subshell.
And it's very, very important you do not cuddle these brackets.
It must be open bracket, space, stuff, space, close bracket.
And since you're doing multiple commands, open bracket, new line, your commands, new line, close your bracket. And then you're always good.
The round brackets are extra pernickety. And either you have to have them on a line by themselves, or you have to put a semicolon before them, because they need to be their own statement.
So I would say, you're doing multiple commands, put them on multiple lines, tab them in so your brain sees, oh, these are together, and then you will be fine.
[8:54] Okay, I'm going to need an example real soon, because I'm still back stuck in my question.
Oh, right now. So we have ourselves... Imagine we're going to start off with the script with no grouping. We're just going to do it without grouping, and then you're going to see why it's messy. So this is a very silly script that takes a variable named dessert, spelled properly for the first time in this entire series, because it has two S's, because you want two desserts. So we start off and we say that our dessert is equal to pancakes. We echo out the dessert is $dessert, so there's no redirect there, right? That's just going to standard out as normal. Then we echo initial dessert and then the value of the variable dessert and we redirect that to log.txt with one angle bracket. So we're saying empty the file and start fresh.
Then we change the value of dessert to waffles.
We then echo updated dessert.
We print the variable again. We use two arrows to send it to log.txt.
So we're now appending. Then we just append another line. So dessert is tasty.
Two arrows into log.txt.
And then one final echo without a redirect to say dessert is now $dessert.
There's nothing weird going on here. So shock and or horror it prints.
Dessert is pancakes. Dessert is now waffles, and that's all we see on screen.
And if we cat our log.txt, it says initial dessert pancakes, updated dessert waffles, which is tasty.
[10:23] Not exciting. Everything is perfect except the misspelled dessert one time. Ah!
Poop. Oh, wow. So close. So close.
So close. So close. So if we use command grouping, we don't have to do... So we had to do one arrow the first time to make sure that we created a file fresh, and then we had to remember to append the second two times, otherwise the only thing in the file would be the last line. I know this because I did it wrong the first time.
So it's messy to be remembering all of those different redirects, and I messed it up, proving it's messy. So we should group them, and then we'll have a much easier time of things.
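For reference, the ungrouped version described above might look something like this. It is a sketch reconstructed from the spoken description, so the exact wording of the messages is an assumption, and it logs to a temporary file (the hypothetical log_file variable) rather than log.txt in the current folder.

```bash
#!/usr/bin/env bash
# Assumption: a temporary file stands in for log.txt.
log_file="$(mktemp)"

dessert='pancakes'
echo "Dessert is $dessert"                        # no redirect, goes to the screen

echo "Initial dessert: $dessert" > "$log_file"    # one arrow: empty the file and start fresh
dessert='waffles'
echo "Updated dessert: $dessert" >> "$log_file"   # two arrows: append
echo "Dessert is tasty" >> "$log_file"            # two arrows again, easy to forget

echo "Dessert is now $dessert"                    # no redirect, goes to the screen

cat "$log_file"
```

Three separate redirects, and getting any one of them wrong silently changes what ends up in the file.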
So at the bit in our code where we start writing to the log file, I start a new group with curly bracket, and then inside there, I just echo as normal, right? No redirects there.
I just say echo initial dessert, dessert equals waffles, echo updated dessert, echo dessert is tasty, spelled wrong again. I close my curly bracket, and now at the end of the group, I have a single arrow to log.txt.
[11:35] So you start a new file, but it's going to put all this stuff at once?
Correct. Because it's grouped. Okay.
Yes, exactly. That is the effect of the group. It is as if all of these things were one new special magic terminal command that does my work, and that magic terminal command's output is going to log.txt. So you're grouping multiple commands to behave like one. So yeah, you're grouping them to behave like they're one.
Now we used curly brackets, so our variables behave just like before. Pancake becomes waffles, and at the very end, even outside of our group, it's still waffles. So when we edited the dessert inside the group, that affected our whole script, not just the group.
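A sketch of the curly-bracket version as described. The message wording and the temporary log_file are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Assumption: a temporary file stands in for log.txt.
log_file="$(mktemp)"

dessert='pancakes'
echo "Dessert is $dessert"

# Curly brackets group the commands in the current shell,
# so one redirect at the end covers everything inside.
{
    echo "Initial dessert: $dessert"
    dessert='waffles'
    echo "Updated dessert: $dessert"
    echo "Dessert is tasty"
} > "$log_file"

# The assignment inside the group changed the real variable.
echo "Dessert is now $dessert"
```

On screen this prints pancakes first and waffles last, because the group shares its variables with the rest of the script.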
Oh, hang on. So this one still says dessert is pancakes, dessert is now waffles. What was the subtlety you were saying there?
Well, the subtlety is there is no subtlety. So when I edited dessert in the middle of the group, I could see its effect on the very last line of the script which is outside the group.
[12:45] Dessert is now $dessert printed waffles. So my code in the group affected the variable outside the group.
Okay.
Right, no weirdness with scope. I am affecting the real variables in the script inside my group command.
So if I tried to do this to avoid spooky action at a distance, I would have failed.
Right? I'm sharing the variables. If I now want to do it so that I do not affect the rest of my script, I intentionally want to not be able to destroy a variable somewhere else. I intentionally want a local effect. I intentionally want to just temporarily mess with the dessert. Then I would use a round bracket instead of a curly one.
[13:37] So our script starts the same. We set dessert equal to pancakes and we echo out that dessert is pancakes. Then we start our group, and we echo out the initial dessert is $dessert, which will be pancakes. We change the value of dessert to waffles. We echo out that fact.
And then we still echo out dessert spelled wrong as tasty. We redirect it all into the file. Outside the group, we print the dessert again.
So now when we run it, we do not see dessert is pancakes, dessert is now waffles. We see dessert is pancakes, dessert is still pancakes. The inside the group did not affect outside the group.
Inside the group did not affect outside the group. But everything is inside the group.
There's nothing written... No. No. The first two lines are outside. The last line is outside. Then we open a round bracket, we have four commands inside, we close the round bracket.
[14:38] Okay, okay. So you say dessert equals pancakes, echo dessert equals dollar dessert. All right.
So that's why it's going to say dessert is pancakes. Gotcha.
Correct. And then there's the group commands where it's changing dessert into waffles, and yet dessert never changes to waffles.
It does in the log file because all of that content is being piped to log.txt. So the next thing you see is cat log.txt, which says initial dessert pancakes.
Updated dessert waffles. Dessert is tasty.
[15:11] Okay, so you've written a script that doesn't change things.
Precisely. So imagine that was $IFS, and I wanted to use $IFS to read a file with a really weird separator, and I didn't want to have any spooky action at a distance. I changed dessert only inside the group. I mean, a local...
Only inside the log file. The log file is the only thing that knows that dessert has changed.
But outside of the log file, it doesn't know that, because the log file was created by the group.
Okay, but all commands between the opening round bracket and the closing round bracket share that variable. That variable exists between...
[15:56] It's the commands that know. The log file is just a place it ends up written to.
Which is what I just said. I said, the log file knows that the dollar dessert has changed to waffles.
It has that knowledge. Written to it, I would say. It's a receptacle. Okay.
It's a distinction without a difference. I don't think that really matters, does it?
You're making a face that that distinctly matters. If I had been the student, I would have been completely lost by that statement.
I would have thought, what do you mean the log file knows? No, no, the log file is just receiving information.
In the log file, dessert is waffles.
Correct. Can I say that? Am I allowed to say that? Okay. Absolutely, yes.
Okay. I don't know why that doesn't sound different to me. But okay, so inside the log file, we have changed it because the grouped commands wrote to the log file.
But at the terminal level or whatever, somewhere else, the dessert is still pancakes. Peace.
Okay, let's back up. Where is dessert pancakes? Right. Start of the script, dessert becomes equal to pancakes.
So dessert is pancakes when we start our script.
We print that out to standard out so that we can see that dessert is pancakes, right?
So the first line of the output is dessert is pancakes.
[17:17] We start our group. So inside the group, we have a copy of every variable, but it is a copy.
So when we first, inside the group, echo initial dessert is $dessert, it is seeing our copy of the outside variable.
So it's still pancakes, because a copy of pancakes is pancakes.
We then, inside the group, change the variable dessert to become equal to waffles.
[17:47] So now, inside the group, its copy of dessert has been updated.
The group, it now echoes updated dessert $dessert. So inside the group, it's echoing waffles.
Then it echoes dessert is tasty. Then the group ends. And we're saying that everything in that group, when it was echoing, all went into log.txt.
The group is now finished. The copy of dessert has evaporated. We have lost our waffles.
They don't exist anymore. They're in... Wherever the waffles go to die, right? They're gone.
I know, I wanted waffles. And then the next line of our script, we check our variable, and it is exactly where we left it, at pancakes, because the group never saw our variable. They got a copy. I got it.
[18:34] I got you now. Okay. Which is very different behavior to with the curly brackets, where it was the same. There was one dessert, it wasn't copied, so when the commands inside the group messed with it, they were messing with our dessert.
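The round-bracket version, sketched under the same assumptions as before (made-up message wording and a temporary log_file instead of log.txt):

```bash
#!/usr/bin/env bash
# Assumption: a temporary file stands in for log.txt.
log_file="$(mktemp)"

dessert='pancakes'
echo "Dessert is $dessert"

# Round brackets run the group in a subshell,
# which gets its own copy of every variable.
(
    echo "Initial dessert: $dessert"
    dessert='waffles'              # changes only the subshell's copy
    echo "Updated dessert: $dessert"
    echo "Dessert is tasty"
) > "$log_file"

# The copy has evaporated, the original was never touched.
echo "Dessert is still $dessert"

cat "$log_file"
```

The log file records waffles because the echo ran inside the group, while the last line on screen still says pancakes.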
So in both cases, with the curly brackets and with the parentheses, you put the parentheses on their own lines so that you could see them. For some reason, you cuddled the greater than symbol to send the output to the log.txt file.
You could have put that on another line, couldn't you? I couldn't put it on another line.
I think I would have got away with the space there, because if you think about it, it's exactly...
The curly brackets are as if I had typed one command. So if you were doing an echo, you would put the arrow to where you want to echo to on the same line. Yeah, yeah.
Okay, I just expected to see it uncuddled on that side too, but cuddling you're talking about on the inside of the brackets.
I am talking about on the inside, yes, I guess, yes. And whenever I say cuddling with brackets today, I do generally mean inside.
Yeah, yeah, I don't know, it's a weird phrase. Cuddling is outside too, right? OK.
[19:43] Yeah, I'll have to be more specific is what I'll have to be, because yes, it is different.
So in this case, it's on the inside, they can't be cuddled.
So where would the semicolon go that you talked about?
If you needed to do it all on one line, if you had one line with the roundy brackets, wanted to do all of that on one long, long, long line, you would need to put a semicolon just before the closing roundy bracket to make it be by itself.
[20:08] Huh, that sounds ugly, I don't wanna do it. I wanna make it nice and spaced out the way you drew it.
Yeah, I mean, if you're doing it on the command line, as in actually typing into a terminal, I can see myself doing it on one line, but in an actual script file, hecka no.
Like, indentation is my friend. I want to see the indentation, it is very important to me. I'm a fan.
So, yes, so that is command grouping.
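For the one-line case, here is a sketch of both kinds of group with semicolons throughout. As far as the Bash manual describes it, the curly-bracket form needs the spaces and a semicolon (or newline) before the closing bracket, while a trailing semicolon inside round brackets is optional and harmless, so writing both the same way is a safe habit. The log_file variable is a hypothetical temporary file.

```bash
#!/usr/bin/env bash
# Assumption: a temporary file to collect the output.
log_file="$(mktemp)"

# Curly brackets on one line: space after {, semicolon before }.
{ echo 'one'; echo 'two'; } > "$log_file"

# Round brackets on one line: the final semicolon is optional here,
# but including it keeps both forms looking the same.
( echo 'three'; echo 'four'; ) >> "$log_file"

cat "$log_file"
```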
Now, the next thing I want to talk about is a really important, it's something we know but we haven't gone into enough detail on.
So, we all know that if you type...
rm star.txt, it will look for every text file in the current folder and delete every text file in the current folder.
[20:59] That is expanding star.txt into myfile.txt, myotherfile.txt, myotherfile.txt.
So that is an example of what bash calls an expansion.
There are things we type on the terminal that transform into something else before the command executes. If I say echo $USER, echo doesn't print $USER, it prints the value of the variable.
So it's gone and fetched the value, and expanded the name of the variable into its value. If I ask it to echo a bunch of math, it goes and works it out, and then echoes out the result of working it out. So these are what are called expansions in the bash manual.
When you read the bash manual, they're all called expansions. And depending on how you you count, there are seven or eight of them. I would argue there are seven of them, and then they have a friend. Now, the bit that I didn't fully grok until I started writing these show notes is that the order is A, not random, and B, has effects. And those effects explain why I'm… Can I pause you before you start into that?
Yeah. I'm sorry, I hate to do this, but I don't know when to tell you. You said, if you say echo $user? Echo $user, spelled USR or USER, doesn't do anything.
Did you mean echo tilde? Which is in... Ah, because in your show notes that's not the example used.
[22:28] I do use $USER, all caps, at some points in the show notes, which is why I know it works.
Okay. Okay. Got you. Sorry.
No, that's fine. It's a good check. Because it wouldn't be the first time I got it wrong, let's be honest.
You actually used two different examples, neither of which was that one.
So I'm following the show notes.
Cool. So there are actually, like I said, seven expansions and then their friend.
And they happen in this order.
The first thing that happens is something really cool I've never told you about called brace expansion. So brace yourself for that. Sorry. Then we get tilde expansion, which we've talked about. So the tilde symbol becomes the path to a home directory. So tilde without anything becomes your home directory. Tilde with a username becomes that username's home directory. So that's tilde expansion.
We have shell parameter expansion, which is the really horrible way Bash manual describes make a variable be its value. So $user is shell parameter expansion.
And the official syntax is actually $open curly bracket name of variable close curly bracket.
But if the variable is not an array, and has no special characters, you can leave out the brackets.
Hence $user works. $user is the exception. It's the shortcut.
The real syntax is $open curly bracket name of variable close curly bracket.
[23:58] And it's friends with the array, dollar name of array, open square bracket, pound symbol, close square bracket to get the number of elements in the array, square bracket zero to get the first element in the array, square bracket at to get all the elements in the array. All of those, they are all part of shell parameter expansion. You take a variable and you expand it into its value or values.
I think you have a squirrelly bracket missing when you give the example of $...
Do... Okay, I'll fix it.
Sorry, I did... No, I'll fix it too, but it doesn't matter, they'll merge together.
Okay.
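The parameter expansion forms just described, written out as a short sketch. The variable names name and desserts and their values are made up for illustration.

```bash
#!/usr/bin/env bash
# Made-up sample data.
name='allison'
desserts=(pancakes waffles crepes)

# The shortcut and the full syntax give the same result for a plain variable.
echo "Shortcut: $name"
echo "Full syntax: ${name}"

# Arrays always need the curly brackets.
count="${#desserts[@]}"    # number of elements
first="${desserts[0]}"     # first element
echo "Count: $count"
echo "First: $first"
echo "All:" "${desserts[@]}"
```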
After it's done the parameter expansion, the next thing it does is command substitution.
So you know that if you type $, open round bracket, and some other command, the output of that command gets put into the string you're making.
[24:54] That is called command substitution. The next thing it does is look for dollar, two roundy brackets, so $((, and that's for a mathematical or an arithmetic expansion.
So if you want to do some math in the middle of your string, it's $((, your math, close round bracket, close round bracket, phew.
Then... I like that one, though. That was my favorite one, when you showed us the hard way and then went, yeah, just use double round brackets.
Oh, man, come on. Come on, why didn't you tell us that before?
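Command substitution and arithmetic expansion side by side, as a small sketch. The variable names shout and total are hypothetical.

```bash
#!/usr/bin/env bash
# Command substitution: dollar, one round bracket, a command.
# The command's output becomes part of the string.
shout="$(echo 'waffles' | tr '[:lower:]' '[:upper:]')"

# Arithmetic expansion: dollar, two round brackets, some math.
total=$(( 2 + 3 * 4 ))

echo "Shouted dessert: $shout"
echo "Total: $total"
echo "Both inside one string: $(( total / 2 )) plates of $(echo 'pancakes')"
```

The only visible difference between the two is one round bracket versus two, which is why it is worth seeing them next to each other.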
The next, okay, so a lot of stuff happening here.
The next thing that happens is word splitting. So it goes and looks for all of the expansions it's done, and if it finds inside the expansions any spaces you didn't quote, it breaks them into separate arguments.
Right.
[25:45] Which is why when you use the ${nameOfArray[@]} and it prints them out with spaces, they break into separate arguments by word splitting.
Now, word splitting by default is done on any blank space.
[26:02] If you set the value of $IFS, word splitting will obey $IFS, so you can make your terminal split on something that's not a space if you mess with $IFS.
I don't recommend doing this but I'll just point out it's a thing.
Now here's the interesting thing.
Word splitting happens BEFORE file expansion. So if you type star.txt and you're the kind of person like me who puts spaces in their filenames, they don't break into separate arguments.
Even though spaces elsewhere do break into separate arguments.
The reason is because file expansion is AFTER word splitting.
This confused the bejeebus out of me for so long.
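A sketch that makes the difference countable. It works in a temporary folder, and the helper count_args and the file names are made up for illustration.

```bash
#!/usr/bin/env bash
# Work in a scratch folder so no real files are touched.
work_dir="$(mktemp -d)"
cd "$work_dir" || exit 1
touch 'my file.txt' 'my other file.txt'

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# An unquoted variable is expanded BEFORE word splitting,
# so the spaces in its value break it into three arguments.
name='my other file.txt'
from_variable="$(count_args $name)"

# The star is expanded AFTER word splitting,
# so each matching file name stays whole: two files, two arguments.
from_glob="$(count_args *.txt)"

echo "Unquoted variable became $from_variable arguments"
echo "The star became $from_glob arguments"
```

The variable is deliberately left unquoted here to show the splitting; in a real script it would normally be quoted.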
So I know you don't want me to stop you again in trying to get through all seven of these, but I don't understand what you mean by happens before.
What do you mean by, it seems to me these are all separate things.
How are they sequential?
So you can type a terminal command that does all of these things in the one massive big terminal command.
You can have a terminal command that takes seven arguments and does each of these things at least once.
When bash is trying to figure out what arguments to pass to your command, it takes your text and it applies the rules in this order to figure out the arguments for your command.
So if you say echo... It's not gonna do it in the order I tell it to do it.
[27:25] Okay, so you have typed one... We've got a lag, everyone.
That's why we keep talking over each other. We're not being rude today.
So imagine you're typing a complicated terminal command, right?
It's, you know, echo and then 10 arguments, right? As far as bash is concerned, that's one command.
And it has to figure out the meaning of that command before it executes it.
And the way it figures out how to turn your text into the actual arguments for the command is by reading what you've written, and then applying these rules.
First braces are expanded, then tildes are expanded, then variable names are expanded, then commands are run, then math is done, then we split the words, then we go look for the files.
[28:15] The last thing we do is any of the quotes that you put there get taken away. Any quotes that are in the result of the expansions are not taken away, but any quote you put there is taken away.
Let me see if I can come up with an analogy that might make sense to me and maybe to nobody who cares about high school algebra, but is this like addition gets done before multiplication?
No, multiplication gets done before addition. It's exactly like that, whichever one of those ways is correct.
Right, right. Okay. But I mean, I'm picturing that I have a command that says I want to expand a filename, star.txt, And then after I do that, I want it to do word splitting.
[29:02] That I can tell it to, by my commands, I should be able to?
No, that's not how the terminal works. So if you type doc open curly bracket one dot dot three close curly bracket star dot txt, the doc one dot dot three will get exploded first. That was a terrible example to pick because we didn't do brace expansion yet. Let me do a new example on the fly.
Okay, you type $user.star. Before it can run the command, it needs to figure out both $user and star.
The order it does it in will have a really big difference.
If it doesn't do the $user first, it's going to be looking for a file called $user, which won't exist.
If it does the $user first, it's looking for a file called allison.star.
If you take one plus two multiplied by three, depending on when you do which, you get a different answer. Depending on which of these things it does first, you will get a different answer. And so the manual basically just tells you.
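A sketch of the variable-then-star example, assuming a scratch folder with made-up file names and a hypothetical variable owner standing in for the user name.

```bash
#!/usr/bin/env bash
# Work in a scratch folder with made-up files.
work_dir="$(mktemp -d)"
cd "$work_dir" || exit 1

owner='allison'
touch 'allison.txt' 'allison.md' 'bart.txt'

# The variable is expanded first, giving allison.*,
# and only then does Bash go looking for matching files.
matches=( "$owner".* )

echo "Found ${#matches[@]} files for $owner:"
printf '%s\n' "${matches[@]}"
```

If the star were expanded first, there would be no file whose name starts with a literal dollar sign and the word owner, and nothing would match.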
[30:08] This is our equivalent of BOMDAS, right? Brackets, addition, BO. It's the acronym for the brackets, for the math, BOMDAS. I just don't remember what the letters stand for. So this is great.
Like I know the colors of the rainbow, BOMDAS is, brackets is the first thing that gets done.
Okay, I'll look it up.
One nice thing about what Bart is describing, this is difficult for one because I keep stopping him, but as he's marched through these, he's got a table and each one of these has the link directly to the documentation at GNU.org that tells you more about it, and he's got a description and then the result, whether it's a string or parameter, whatever it is.
So it's much clearer when you see the show notes than I have allowed him to be.
[30:54] And the key point is, my intention is that this installment is going to be the quick reference.
Which is why I'm being very careful to list the order, and to say that brace expansion produces a string, tilde expansion produces file paths, shell parameter expansion produces a string or a list of strings, command substitution produces a string, or a…anyway, they're all in the table, because I want to be able to see them quickly.
So. You wanted your own table.
Yes. Oh yes. I was completely honest in my setup last time, this is my quick reference, you're just here for the ride.
Which has kind of been the case of this whole Bash journey really, hasn't it?
I needed to learn Bash and now yous all are, whether you like it or not, you can just not listen.
But anyway, so we have seen tilde expansion, we've seen all of these apart from brace expansion.
So let's go pay a little visit to brace expansion. So brace expansion is designed for expanding out a sequence.
So you use brace expansion to turn one argument that describes a sequence into many arguments.
[32:03] So, if you wanted to make 3 files called doc1, doc2, doc3.txt, so each of them are .txt, you could write touch space doc1.txt space doc2.txt space doc3.txt and that would make 3 files for you.
With the brace expansion, you can do that, you can explode that out with the syntaxes open a curly bracket, the start of the range, period, period, the end of the range, close the curly bracket. And the ranges can be numeric or alphabetic. So you could make doc a to c with a dot dot c, or whatever. So basically I say touch space doc open curly, 1 dot dot 3, close curly, dot txt. And that explodes into three separate arguments. Doc 1 dot txt, doc2.txt and doc3.txt. And that is the very, very first thing that Bash does, is any of those brace expansions.
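Brace expansion in a short sketch, run in a scratch folder. The note files with letters are an extra made-up example of an alphabetic range.

```bash
#!/usr/bin/env bash
# Work in a scratch folder so no real files are touched.
work_dir="$(mktemp -d)"
cd "$work_dir" || exit 1

# Echo shows what the single argument explodes into.
expanded="$(echo doc{1..3}.txt)"
echo "$expanded"

# One argument typed, three files created.
touch doc{1..3}.txt

# Ranges can be alphabetic too.
touch note{a..c}.md

ls
```

Note the curly brackets are fully cuddled, with no spaces inside or around them.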
[33:05] So I could, I could have, yeah. Absolutely vital, you completely cuddle these curly brackets.
Like they're just cuddled on all sides. Otherwise they will get confused for command groupings.
Oh, that's dangerous. But yeah, okay. That's why the cuddling matters. If you do or don't cuddle, Bash goes, oh, you mean that completely different thing. I'm going to use the same symbol for it.
That explains why it's in dark, bold, giant letters when you're talking about the braces cannot be cuddled versus when they need to be, right?
Yeah, absolutely. So that's brace expansion. You can tell someone who's really good at Bash by whether or not they use brace expansion. The people who bash all the time will be brace expanding all over the place. And the really good examples on Stack Overflow will use it.
They're from people who know what they're doing. So I look at it as a sign of, oh, this person's advice is worth noting. Or they've just copied the right person. So either way, there is someone worth noting.
Why is it such an unusual thing? It seems like, hey, that's cool, I would use that.
[34:21] Because almost no one knows it exists. Oh, okay. Okay. So we're in on the secret then.
Precisely. That's why it's a good... Do you know the secret handshake? Can you do curly bracket something dot dot something else close curly bracket?
So our next friend then is the tilde expansion, which we've already mentioned. Tilde on its own means my home directory. Tilde followed by the name of a user means that user's home directory.
Now, that expansion will only work if you have a home directory.
[34:51] Or if the user has a home directory, or if the user exists at all.
So if you try to expand something that doesn't exist, the expansion doesn't evaporate, it just stays as it is. So if you echo tilde, you'll see your own home directory.
But if you echo tilde poop, unless you named a user on your computer.
[35:12] Poop, which I don't think most of us have, eh, kind of thing I might do, but I didn't, it will come out as tilde poop. So if an expansion can't happen because it doesn't go anywhere, it doesn't evaporate, it just stays as it is.
[35:28] Oh, that's interesting. It is interesting, yeah. I'm noting it, I'm not sure I like it or dislike it, but it's a thing. So now you know the thing.
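A quick sketch of that behavior. The user name no_such_user_pbs154 is made up and assumed not to exist on the machine running it.

```bash
#!/usr/bin/env bash
# Tilde on its own expands to the current user's home directory.
my_home="$(echo ~)"
echo "My home: $my_home"

# Assumption: no account with this made-up name exists,
# so the expansion cannot happen and the text is left as typed.
unknown_home="$(echo ~no_such_user_pbs154)"
echo "Unknown user: $unknown_home"
```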
So the next thing then is parameter expansion. And this we have met. So we know that we can take a variable name and we can expand it out. And we know that we can take an array and we can explode it out using the square bracket at and stuff. But let's remind ourselves of that now that we know the order things happen in. Because it actually does have an effect.
So I'm going to ask you to just create a little array called arr, which is going to contain three elements. The letters O-N-E, the digit 2, and then the roman numeral 3, I-I-I. Very boring but anyway, three elements, 1-2-3.
And if we use a little for loop to just print those out one per line so that we can really see what Bash thinks is and isn't a separate argument. We can see how Bash handles things.
So if we print them out inside quotation marks, but we also stick some text before the expansion and some text after the expansion.
[36:49] How does that get handled? When it's breaking the array apart into its three separate arguments, how does it deal with text before and text after the array?
[37:00] The answer is, it only breaks into an argument on the gaps. So we have pre, as in the letters P-R-E, then the dollar symbol, then we start our expansion, we say the name of our array, we say square bracket, at, close square bracket, and we end our expansion, and then we have post, P-O-S-T.
So you might think that will give us five arguments: pre, one, two, three, post.
But it doesn't. It gives us three: preone, 2, and IIIpost.
So the variable splitting happens on the gaps inside the array.
[37:38] So I didn't expect it to be five things, but if it's the gap, why does the pre end up before the very first thing, if that's before the gap?
Okay, but inside our string, it's pre, then the $expansion. So the pre is definitely going to be the very first thing.
The question is, is it a thing by itself? So does starting an array expansion start a new argument straight away?
No. It doesn't start a new argument until it meets the first gap, you know, the first junction, the first boundary.
Let's call it a boundary within the array. So the boundaries within the array become the boundaries between the arguments. And anything you put before or after tags along with its friend.
So pre is before the dollar expansion of the array, and post is after the dollar expansion of the array. So why isn't it obvious that if pre should obviously be before the first argument, then it's obvious that post would be after the last one, by the same argument.
I don't get what this has to do with the breaks in spacing.
Because the breaks are between 1 and 2, and between 2 and 3.
[38:58] Well, I would have thought that maybe it would have broken it at the start of the array as well and given me pre as the first argument, 1 as the second argument, 2 as the third argument, 3 as the fourth argument, post as the fifth argument. I didn't know.
We hadn't explicitly said anything about it in the show notes, so I figured it out and wrote it in the show notes. So there, we know. Just cleaning house. I mean, I don't know if it's a good thing or a bad thing. It's just, this is how it works. It's just a thing.
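The pre and post discussion above can be sketched like this. The array `arr` and the words pre and post come from the conversation; the `args` array is only added here so the arguments can be counted.

```bash
#!/usr/bin/env bash
arr=(one 2 III)

# Collect whatever arguments the expansion produces, printing one per line.
args=()
for a in "pre${arr[@]}post"; do
  args+=("$a")
  echo "$a"
done
echo "${#args[@]} arguments"
```

The text before the expansion sticks to the first element and the text after it sticks to the last one, so the loop runs three times rather than five.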
It's just a thing. In terms of word splitting, then, word splitting by default is done on the IFS... Sorry, by default it's done on an empty space, but you can muck with it with the IFS character. Or sorry, the IFS variable. We already said that. Word splitting is not done on anything you double quote.
[39:46] So, if you double quote a string, it will not get word split.
Which is why we have double quotes, right?
Which is why quoting things makes a difference, let me put it that way.
So I'm going to illustrate the point by making a string called s and giving it the value a string with spaces, which contains spaces in all the places you'd expect a sentence to contain spaces.
And it's in a single quote. Yeah, so I'm defining it in single quotes.
So basically I want to make sure that the value of S is A space STRING space WITH space blah blah blah. So I'm dead sure that's the value in there. So that's what's definitely inside S. And so now the question becomes, if I loop over $S, how does it get broken into separate arguments?
So if I say for i in $S, and I don't put any quotation marks anywhere, I just say $S, Bash
[40:41] will do word splitting. And what will actually come out is a string with spaces as four separate arguments. It gets broken apart. If I do exactly the same for loop and I quote $s, it's one argument, a string with spaces. So that just proves that quoting works like we think.
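A minimal sketch of the two loops just described, assuming the default IFS. The counters are added only so the number of arguments is easy to see.

```bash
#!/usr/bin/env bash
s='a string with spaces'

unquoted=0
for i in $s; do        # not quoted, so Bash word splits the value
  echo "$i"
  unquoted=$((unquoted + 1))
done

quoted=0
for i in "$s"; do      # quoted, so the value stays in one piece
  echo "$i"
  quoted=$((quoted + 1))
done

echo "unquoted: $unquoted, quoted: $quoted"
```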
How do you keep in your head when to double quote and single quote, Bart?
I do it one way and when it doesn't do what I want it to do, I change it to something else.
Just like, I don't know why it's not working and I just switch it.
I have no idea in all the different ways. My mnemonic is that a single quote is simpler than a double quote, so a single quote means just do exactly this, and a double quote means go work it out, and that's more complicated.
[41:31] It seems more complicated to split a string up than it does to have it just be exactly what it was. That would give me the exact opposite.
Oh, you're going to find your own mnemonic. But I mean, this isn't the only place...
Double quote means do the hard work. This isn't the only place we've talked about double quotes versus single quotes, right?
We've talked about it all throughout this series.
Well, the annoying thing is in JavaScript it makes no difference.
No, in JavaScript it makes no difference, which is really annoying to someone like me who was a Perl programmer, where it makes a huge difference, just like it does in Bash.
So in JavaScript, the only difference is convenience.
If I need to put a double quote in my string, I single quote the whole string, and then I don't have to use backticks.
Or, sorry, backslashes.
In Bash, they mean different things.
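To make that difference concrete, here is a small sketch. The variable names are made up for the example.

```bash
#!/usr/bin/env bash
name='Bart'

single='Hello $name'   # single quotes: the dollar sign is just a dollar sign
double="Hello $name"   # double quotes: Bash is allowed in to expand $name

echo "$single"
echo "$double"
```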
Okay.
So maybe all along it has been this rule that you're teaching us.
And I thought it was all different examples of single versus double quotes going on, but maybe it was always this.
[42:30] So we have used the term interpolated quotes for double quotes because it means there is calculation being done. If it's a double quote, it means Bash is allowed in to take the dollars and expand them out and Bash is allowed to do its things. If it's a single quote, you're basically saying to Bash, no, it doesn't matter if the dollar sign is in here. I don't mean do a parameter expansion, I just mean the symbol dollar. Okay, so that is word splitting.
[43:02] Then the next thing that gets executed is file expansion. Actually, we have a little bit more detail to do on word splitting. So we've dealt with strings. So we know that if we don't put the string inside quotes, it will word split it on the spaces. If we do quote it, it will not. So that's why we want to quote things, if we want them to stay together.
Arrays then are another kettle of fish. So let us observe what arrays do.
We're going to make an array. It contains two strings. The first one is string 1 with space. And then the second string is just string2, all mushed together, no space. So we have a two element array. One of the elements has spaces, one of them doesn't. So if we loop over that array by not quoting and just saying $open curly A open square bracket at close square bracket close curly. That means give me all the values in the array and I did not quote it. And the result is we get five separate arguments: string, 1, with, space, string2.
That makes sense. If I do exactly the same thing and I quote my expansion, I get two arguments: string 1 with space, and string2. That makes sense.
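The same experiment as a sketch, again assuming the default IFS. The array `a` follows the description above; the counters are added for illustration.

```bash
#!/usr/bin/env bash
a=('string 1 with space' 'string2')

unquoted=0
for x in ${a[@]}; do       # not quoted: every element is word split as well
  echo "$x"
  unquoted=$((unquoted + 1))
done

quoted=0
for x in "${a[@]}"; do     # quoted: one argument per array element
  echo "$x"
  quoted=$((quoted + 1))
done

echo "unquoted: $unquoted, quoted: $quoted"
```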
[44:26] Great, perfect, good. We've finally got to what I understand.
Okay, so now, very, very, very late, second last thing that happens is file name expansion.
And we all know the ubiquitous star.txt means all text files.
Entirely implicit in that is that it will only return things that actually exist on your file system. So it is actually searching your file system. So a star whatever will never expand into something that doesn't actually exist as a file or folder. And like the tilde expansion, if the thing you're looking for doesn't match anything, it just comes through as itself.
So if you do a for f in star dot waffles, and I do not have any files with the file extension waffles, what comes out is the single argument star dot waffles. But if I actually find something then what will come out are the things I find.
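A sketch of the star dot waffles case. It runs in a fresh temporary directory so nothing can match by accident, and it assumes Bash's default settings (the `nullglob` and `failglob` options switched off).

```bash
#!/usr/bin/env bash
# Work in an empty scratch directory.
cd "$(mktemp -d)" || exit 1

found=()
for f in *.waffles; do
  found+=("$f")
  echo "$f"        # prints the pattern itself because nothing matched
done
echo "${#found[@]} argument"
```

Because the pattern comes through as itself, loops over files often start with a check such as `[[ -e "$f" ]] || continue`.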
Now what is very fun is that if you scroll up to the table with the order, the only expansion that happens after word splitting is file expansion.
Which means if our file names contain spaces, they won't get broken up.
[45:52] Which in modern operating systems, we're putting spaces in file...
File name expansion happens after word splitting. So say again, why would the file names not get a word split?
Because it's already happened. The filenames get looked up after the splitting.
Okay. So if you find a filename with spaces. Okay.
Yeah, let's see the example. So if we find a filename with spaces, we don't have to worry about them.
The bash takes care of it. So let's make three filenames which do definitely have spaces.
So we're going to say touch, spaced, space, file.
And we're going to use our brace expansion, 1 dot dot 3, then dot txt.
That's going to make spaced file1.txt, spaced file2.txt, and spaced file3.txt, which are three filenames with a space in them. And because the word splitting happens before spaced star gets expanded, the word splitting will not break our filenames apart. So we can say for f in spaced star, and it will print out spaced file1.txt, spaced file2.txt and spaced file3.txt.
And the file names do not get destroyed.
[47:11] Okay, okay, I think I see what you're saying. Really convenient that it happens in that order. If it happened in any other order, we'd have to be quoting our stars and stuff. It would be a mess.
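The spaced file example might look like this as a script. It works in a temporary directory so it leaves nothing behind in your own folders; the `names` array is added only to count the results.

```bash
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1

# Brace expansion makes three names, each containing a space.
touch 'spaced file'{1..3}.txt

names=()
for f in spaced*; do       # the star is not quoted, yet the names stay whole
  names+=("$f")
  echo "$f"
done
echo "${#names[@]} files"
```

The variable `$f` still needs its quotes when it is used, because that is an ordinary parameter expansion and word splitting applies to it.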
Yeah, interesting. You don't have to. Very clever. Now, we know star means zero or more occurrences of any character. We can do more than a star. Again, this is a good test for whether someone's a bash guru or not.
We also have at our disposal the question mark, which is any one occurrence of any character.
We also have character... Oh, I like that. Isn't that like it is in regex?
Isn't question mark one? It's like it is in the basic regex, not in the extended regex.
So it's like it is in grep, not like in egrep.
Okay.
[48:03] Well, that's just me. Yes.
It's good. We also have character classes. So square bracket ABC is A or B or C. We can also use ranges. So square bracket A dash C is also A or B or C. And very powerful, and I forgot to clean up my table. Delete those two empty rows. Very powerful. We can also use what are called POSIX character classes. There's a handful of them defined, and they're in the Wikipedia page linked in the show notes. And they allow you to search for really useful things, like colon, punct, colon is any punctuation character.
Colon, alpha, colon is any alphabetic character, including your letters with funny little squiggles.
You mean accented characters in other languages?
That's the one. That's the one, yes. Well, so I can tell what that one was written for.
Find me the file that's breaking everything because it's got punctuation in the middle.
Exactly. Exactly. Is dash considered punctuation?
[49:21] You need to go check the... Yeah, I think it probably is, because I think underscore is considered a word character, but I don't think dash is.
Going by memory here. I think dash is punctuation, underscore is not.
It's considered alpha.
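Rather than going by memory, the patterns can be tried out directly. This is a sketch with made-up file names in a temporary directory. The second half uses pattern matching inside double square brackets, which understands the same character classes, to ask which class the dash and the underscore fall into. In the POSIX definition both count as punctuation, and the underscore is not in the alpha class.

```bash
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
touch a.txt b.txt c.txt d.txt ab.txt

one_char=$(echo ?.txt)        # ? is exactly one character
set_abc=$(echo [abc].txt)     # a or b or c
range_ac=$(echo [a-c].txt)    # the same set written as a range
echo "$one_char"
echo "$set_abc"
echo "$range_ac"

# Ask Bash which class a character belongs to.
dash_is_punct=no
[[ '-' == [[:punct:]] ]] && dash_is_punct=yes
underscore_is_punct=no
[[ '_' == [[:punct:]] ]] && underscore_is_punct=yes
underscore_is_alpha=no
if [[ '_' == [[:alpha:]] ]]; then underscore_is_alpha=yes; fi

echo "dash punct: $dash_is_punct"
echo "underscore punct: $underscore_is_punct, alpha: $underscore_is_alpha"
```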
Oh, interesting. I think. So these are regular expressions, is what it's using there?
They're a very, very basic type of regular expression. If you think back to taming the terminal, it's the B-R-E regular expression syntax, which stands for basic regular expression.
Okay, just your link in the show notes just takes us to regular expression, so that's why I was asking.
It should jump you down to the section on character classes, which has a table.
[50:10] It does. But it doesn't say that, it just says it's a link to the Wikipedia page on regular expressions, Which is why I thought they were regular expressions.
Which they are. B-R-E-S. And then the last thing that happens is any quotation marks you put there to tell Bash what to do, they get removed.
So when you say echo open quote myspace string close quote, what gets echoed to the screen is just myspace string.
The quotes you typed vanish.
That's called quote removal. Bash cleans things up after itself. Exactly.
So, all of this has been leading into the table. So this to me is the thing I wanted more than anything else on planet Earth. It's a table where I can quickly check what all of the different brackets mean. Because, like I say, there's a lot of reuse. So, if you meet square brackets and they are cuddled, then it means it is a character class.
[51:16] If we meet uncuddled single square brackets, they are legacy tests from the days of sh.
We said when we first met them, never ever use them in Bash, because they're terrible and they break things.
But you want us to recognize them, so if we go to accidentally get an old Stack Overflow answer to say, okay, I'm going to use double square brackets on that.
[51:42] Precisely. So they are legacy tests, do not use. If we meet double square brackets, they are modern tests. And it means we use them for saying, you know, like $n -eq 4 to check if n is equal to 4.
The nice thing is that inside these modern tests, we don't have to worry about quoting our variable names, because in here, in double square bracket land, quoting is just taken care of for you. So you don't have to worry about spaces and stuff in here. It's all fine. You never have to worry about quoting, which is completely different to single square brackets where quoting is very, very important, and you're always thinking about it, and it's horrible. And that's why we don't use it anymore.
The end result of a square bracket test is an exit code. So a double square bracket will evaluate to success or failure, which is why they are almost always used in while loops, or for loops, or if statements.
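A short sketch of a modern test producing an exit code. The variable names are invented for the example.

```bash
#!/usr/bin/env bash
n=4

# The test succeeds, so the if takes the first branch.
if [[ $n -eq 4 ]]; then
  verdict='n is 4'
else
  verdict='n is not 4'
fi
echo "$verdict"

# A test that fails simply returns a non-zero exit code.
status=0
[[ $n -eq 5 ]] || status=$?
echo "exit code of the failing test: $status"

# The variable on the left needs no quotes, even though its value has spaces.
s='a string with spaces'
spaces_ok=no
if [[ $s == 'a string with spaces' ]]; then spaces_ok=yes; fi
echo "spaces handled: $spaces_ok"
```

One caveat worth knowing: on the right-hand side of `==` an unquoted value is treated as a pattern, so quotes still matter there when a literal comparison is wanted.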
Okay. I like them. Because we're trying to get to a true-false. I love them.
There's something about a truth table that makes me happy. Yeah, I'm with you. Boole, Boole. Good Irish. He worked at University of Cork.
Good Irish person. Big fan.
[52:56] Variable expansion, then, is the dollar sign followed by curly brackets.
Means expand whatever's inside me to the value of that variable.
And that inside me could be a regular old variable name, or it could be an array with all its funniness, and the end result is going to be a string.
$ and a single roundy bracket means run the commands inside of me in a subshell and whatever those commands wrote to standard out, that replaces this placeholder.
So if you say echo inside double quotes, it's now $(date), the date command will get executed, and the result of executing the date command will go into this, you know, that part of the string, so it will say it's now Saturday 2 September whatever.
Okay.
$ double roundy brackets means do the math inside of me, and then whatever the answer to that math is, that replaces the expansion.
So $ round round bracket 1 plus 1 close round brackets becomes 2.
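Both expansions side by side, as a minimal sketch. The `msg` string is made up to show that they can be mixed inside one pair of double quotes.

```bash
#!/usr/bin/env bash
# Command substitution: the output of date replaces the placeholder.
echo "It's now $(date)"

# Arithmetic expansion: the answer replaces the placeholder.
sum=$((1 + 1))
echo "1 + 1 is $sum"

# The two can be combined in a single string.
msg="This year is $(date +%Y) and next year is $(( $(date +%Y) + 1 ))"
echo "$msg"
```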
Does cuddling matter?
[54:05] It does not in these. So basically, unless I say so, there's no ambiguity.
So basically dollar, round bracket, round bracket only has one meaning, so that we don't have to worry about cuddling.
Okay, same thing with dollar, double, round bracket, I'm sorry, dollar, single, round bracket for subshell expansion: you also don't care about cuddling. It's not until you don't have the dollar symbol in front of it when you're doing command grouping or subshell command grouping that you care. Okay.
Exactly. So the dollar basically means bash doesn't have to worry about ambiguity.
So, now the dollar does have to be touching the bracket, right? You can't have dollar space curly. That's not the same thing. That's just anarchy.
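A quick sketch of both points: spaces inside the brackets make no difference, but a space after the dollar sign means there is no expansion at all. The variable names are made up.

```bash
#!/usr/bin/env bash
tight=$((1+1))
loose=$(( 1 + 1 ))          # same result with spaces inside the brackets
year_tight=$(date +%Y)
year_loose=$( date +%Y )    # same again for command substitution
echo "$tight $loose $year_tight $year_loose"

# Dollar, space, curly: Bash sees a lonely dollar sign and a separate word.
s='hello'
broken=$(echo $ {s})
echo "$broken"
```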
But this table's handy. Precisely.
Yeah, I know. So, if we then meet cuddled curly brackets, they are range expansion, right? So, f{1..3}.zip is f1.zip, f2.zip, f3.zip.
Absolutely must be cuddled.
Or as you called it in the show notes, rage expansion. I have fixed that. Rage expansion. Oh good. Good, good.
Okay. Oh, I see what I did there. Excellent. I got it. Okay, so I talked over you a little bit. So with range expansion, they must be cuddled.
[55:19] Range expansion absolutely positively must be cuddled. Otherwise, it will assume it's command grouping. So command grouping must not be cuddled. And I've just realized that my whole big spiel about the importance of the semicolon, if you shove it on one line, that was for the curly ones, not the roundy ones. But my table tells me that.
[55:40] Anyway, if you always put them on a new line, you're never wrong.
So hang on. Always put them on a new line. You're saying if you use...
[55:50] Now I lost track of which was cuddled and uncuddled, but you just said something about if you're using the curly brackets, it'll assume... Oh, okay, sorry.
Curly brackets, cuddled is range expansion.
Curly brackets, not cuddled is command grouping.
And that's the one that has the semicolon, not the roundy brackets, like we said before. Yeah. Okay.
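The two meanings of curly brackets, sketched with made-up names. The command group runs in the current shell, which is why the change to `count` is still visible afterwards.

```bash
#!/usr/bin/env bash
# Cuddled curly brackets: range expansion.
zips=$(echo f{1..3}.zip)
echo "$zips"

# Curly brackets that are not cuddled: command grouping.
# On one line this needs a space after { and a semicolon before }.
count=0
{ count=$((count + 1)); echo "inside the group"; }
echo "count is now $count"
```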
Yeah. Nobody's going to remember what we said. They're going to always use the table, so I think you'll be fine.
I also think the show notes are right, it's just my English wasn't. Yeah, yeah. Yeah.
Then if you have roundy brackets, they are subshell grouping.
Subshell command grouping. And if you have an equal sign cuddled with the bracket, so equals bracket, that is array declaration.
Yes. Sorry. Yes. I keep forgetting people can't see or some people that can't because they're running or weird things like that, or cycling like me.
So equals round brackets is how we declare an array and the equals absolutely positively must be cuddled to the brackets for it to be array declaration. Otherwise, Bash will assume it's a subshell and weird stuff will happen.
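And the two meanings of roundy brackets, as a sketch with invented names. The subshell gets its own copy of `count`, so the change made inside it does not survive.

```bash
#!/usr/bin/env bash
# Roundy brackets on their own: subshell command grouping.
count=0
( count=99; echo "inside the subshell count is $count" )
echo "outside it is still $count"

# Equals cuddled to the opening bracket: array declaration.
letters=(a b c)
echo "${#letters[@]} elements, the first is ${letters[0]}"
```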
[57:01] I like this table. This will definitely come in handy. There we go. Now, I have been very careful in this installment to use all the words that Bash uses in its manual.
So for the first time, I'm going to recommend the Bash manual.
Because up until now, I have been minimizing my links to it.
I haven't not linked to it because at the end of the day, I'm the big believer in documentation.
But I haven't been stressing it for the reason that, until you know Bash, the Bash manual is close to useless.
It's organized in a way that makes sense to Bash programmers, and that frustrates and infuriates Bash newbies, because Bash is weird.
[57:48] When you think like Bash, Bash is sensible, Bash is consistent.
But if you're a programmer who's written in a traditional programming language, Bash is weird.
And the bash manual, well frankly, I struggled so much to find the right chapter to find the answer I needed because the words are weird, the way it's structured is weird.
But it's not weird, it's just different.
And now that we know what expansions are and what command groupings are, now when you read the index of the documentation, it actually makes sense.
So the documentation is now useful. So now, Bash is so dense, unless you're the author of Bash, or unless you're writing Bash every single day, you are going to need the manual, because you're going to forget the subtle detail.
What I'm hoping is that now we've learned enough that the manual will do what you need.
You'll be able to quickly and efficiently get the answer from the manual.
And that is, to me, that is being a skilled Bash programmer, is being able to find the answer quickly, not remembering it, because you won't.
[59:01] That's good, that's good. That's an interesting distinction that the manual doesn't help you until you understand what's in the manual. Until you already know.
Man pages are a bit like that too, right? A man page for a command you know but don't remember the detail of is easy. A man page you're reading for the first time from scratch can be difficult. Can be difficult. Right, so that brings us to the end of our series within a series. And it was very much my intention that we would now be diving headlong into the rewrite of XKPasswd. That was what we were supposed to do after Bash.
Now, we were supposed to do that over the summer, because Bash was supposed to be a little three-week diversion. Bash was not a three-week diversion. I got sucked into it.
You got sucked into it. The community got sucked into it. We were all having a great time. I was like, well, we're doing it. You know, let's just, in for a penny, in for a pound, let's do it.
So what should have been the summer hiatus when I should have been working on XKPasswd has become Bash fun. Cool. I liked it. But that changes my plans.
[1:00:07] I don't quite have all of the paperwork signed, sealed and delivered, but the chances are very high. They're not 100 percent, but they're close that I'm going to be booking some extended leave at the start of 2024 because I live in Europe where we have this thing called work-life balance and we have these weird schemes available to us. I can choose to make my work year short, but get paid over 12 months. It's a lower salary, but it's spread over the full 12 months. So I, we're still in discussion about exactly how many weeks, but I will be taking many weeks of holidays. Well, they're not really holidays, they're unpaid, but I don't lose my paycheck, but it won't feel like you're getting paid.
Exactly. It'll be a bit less, but because of the way tax plans work, not as much less as you might think. So anyway, I will have a large chunk of time where I do not have a day job. Well, I will have a day job. It will be rewriting XKPasswd.
[1:01:05] Okay. You really mean it this time. I really mean it this time.
Okay. No getting sick. For many reasons. No having family crises.
Well, I'll do my darndest. I'll do my darndest. Actually, if I get sick, I get to take sick leave.
But you don't feel well enough to do XKPasswd work. You know, but it means I get to take my time and use it later.
So it's actually, I don't lose it. I don't lose it. You can't have two kinds of leave at once in Europe.
If you're on holiday and you get sick, you get your holiday time back because you can't have sick leave on holiday at the same time.
You can't. You can't serve double time. Anyway. Anyway.
[1:01:42] The point is, that leaves a gap in our schedule, right? I'm going to have Christmas and then I'm going to do lots of coding.
But it's September, not Christmas. What do we do in the meantime?
Well, the universe answered that question about Wednesday.
We're recording this on Saturday. I have a little project I'm not ready to share yet, but I will talk about it on this series at some stage.
A little project I've been working on, and I've run headlong into the need to process JSON data on the command line.
I have the data I need. I need to loop over it. But how do you loop over a giant big JSON string from the terminal? You need some sort of terminal command that speaks JSON, that you can tell how to process that JSON. And simultaneously, with my work head on, I need to, on the command line, process the output of a web services API that spits out, a lot of JSON. Megabytes of JSON. And I need about 1% of it.
So I need something to actually genuinely process that JSON from a bash script.
[1:02:54] Turns out there's an entire language, a querying language for JSON called jq.
[1:03:01] So it's a full language. It's actually very rich. It lets you query and reassemble.
So basically, you give it some JSON, and it understands the JSON.
So you can actually say, go into the top-level dictionary, go to the sub-element named waffles.
There's an array in there called pancakes. For each element in the array, make a new object for me that takes only the key value pairs A, B, and C,
put all of those together into a new array, then combine that new array into a dictionary.
And it gives you back exactly the data you want as the output. Very powerful.
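To give a flavour of that description, here is a rough sketch. The sample JSON is invented for the example, the script assumes the `jq` command is installed, and the filter is one possible way to write the query rather than anything taken from the show notes.

```bash
#!/usr/bin/env bash
# Made-up sample data: a top-level dictionary with a waffles entry
# that holds an array called pancakes.
json='{"waffles":{"pancakes":[{"a":1,"b":2,"c":3,"d":4},{"a":5,"b":6,"c":7,"d":8}]}}'

if command -v jq >/dev/null 2>&1; then
  # Keep only a, b and c from each pancake, then wrap the new array in a dictionary.
  result=$(jq -c '{pancakes: (.waffles.pancakes | map({a, b, c}))}' <<<"$json")
  echo "$result"
else
  echo "jq is not installed, so there is nothing to show" >&2
fi
```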
And you need to learn it. I know just enough. I need to fumble my way through and guess.
Okay. I sort of type some things. I hit enter. And it's like, oh, that's nearly what I want.
What if I move the square bracket? And sometimes it explodes. And sometimes I get the answer I want. And I'm just like, this language can do everything I need. I don't speak this language.
I need to speak this language. And a very wise person did a fantastic talk at Macstock explaining that the best way to learn something is to teach it. So here we go. JQ.
It's upsetting that it's called JQ though, because that just sounds like jQuery to me.
But what does it stand for?
[1:04:25] JSON Query. So SQL is the structured querying language for querying databases.
So JQ is the JSON querying language for querying JSON.
Okay.
[1:04:36] It's short. Well, that sounds like fun. There is also a terminal command.
Yeah, there's a terminal command to execute JQ queries, and the terminal command is called JQ.
Nice and short. That's handy. But so much in there. So much in there.
And that should keep us entertained with vacations and other such.
And we are not going to be back for four weeks, because I'm going to be gone.
So that'll give you time to learn enough to get ahead of the class.
Yeah, because I do need to be one week or two ahead of the class.
So I have some time to sprint ahead of you, but just a little bit ahead of you, and then I'll bring you all along. And the other thing between now and Christmas, I have a handful of Chit Chat Across the Pond Lites I'm very keen to share with you.
OK, good. A couple of cool things I want to share.
That'll be fun. So we have plenty to keep us going. We salute Bash, we say goodbye to it, and we'll see you on the other side.
Yeah, we say goodbye to learning it. We'll be doing plenty of using it. Good.
Which is fun. Anyway, until next time, as very important, lots and lots of happy computing.
If you learn as much from Bart each week as I do, I'd like you to go over to lets-talk.ie and press one of the buttons over there to help support him.
He does 98% of the work here, I'm just the stooge that listens to him and asks the dumb questions.
If you go over to lets-talk.ie, you can support him on Patreon, you can donate via PayPal, Or you can use one of his referral links.
I really hope you'll go over and help him out.
[1:06:03] Music.