CCATP_2023_07

Transcript

[0:00] Music.

[0:08] Well, it's that time of the week again. It's time for Chit Chat Across the Pond. This is episode number 773 for July 8th, 2023. And I'm your host, Allison Sheridan. This week, our guest is Bart Busschots with Programming By Stealth 152 Part B. We're back.
We are indeed back to finish where we left off. Only we left off in the middle of it.
Yeah, we did it weird last time. We did the hard bit at the start, the easy bit at the end, and then we left out all the meat in the middle. So let's fix that.
But I've got to tell you, I think it was a fantastic decision, because I understood the episode two weeks ago. Had we done it, had we tried to cram this in, I totally would not have understood the meat. And the way you've rewritten the meat, now this is going to be so much better. I've pre-read the show notes, and I'm telling you, I think I understand it. So we're on good footing today, I think.
Well, let's try to capture the momentum then. So I sort of asked you to put a pin in it last time, describing my sample solution, because I made two uses of a new terminal command called xargs. And I told you, put a pin in it, we'll come back to it later.
Now I hadn't realized later would be two weeks later. But anyway, put a pin in it, we'll come back to it later. So we are going to come back to it. And we're going to use the bit we skipped over in the sample solution last time as a worked example for xargs.
We're going to spend a little bit of time making friends with xargs first.

[1:34] And then we're going to see this particular use of xargs. It's a real Swiss army knife of a command, and once you know that it exists, you'll start to see it on Stack Overflow answers, and you'll start to see it everywhere. It's like until you buy an EV, you don't think there's many of them, and then you see them everywhere. Until you know about xargs, you think it's obscure, and then you see it everywhere. xargs is all over the place.
Interesting. Yeah, I don't remember having ever seen it, but now I'll keep my eyes peeled for it. Yeah, yeah. Yeah. So.

[2:04] The xargs command's job is about processing arguments. And so to help us actually see what xargs is doing, we need a script that's going to very, very obviously show us the arguments.
So we could send the stuff to echo, and echo will echo each argument one line at a time.
But if there's trailing spaces and stuff, that's not going to be obvious.
So I wanted to echo my arguments in such a way that it was spectacularly obvious what was going on. So I've written a little script, very imaginatively named PBS 152A arg printer, and it just loops through the arguments the script gets and it prints them out. And it starts each line with the name of the variable, so $1, then a little arrow symbol made up of a minus sign and then one of the chevrons, and then a pipe. It's like, this is an edge, then the content of the argument, then another pipe, and then the reverse of that. So you can see $1 is between these two pipes. And so it's really obvious.
Yeah, you've formatted it to say, this is $1. Look here!
Yeah. And the same for $2, $3, $4, $5, $6, etc. And so the idea is that it will show us leading and trailing spaces. And to prove that point to you, we can run the script.

[3:18] Blah, blah, blah, print args.sh with the string first arg, which has a space between first and arg, and then a string starting with a space, then second space arg space, and then ending the string. And then you will see that first arg gets printed with the vertical lines touching it, and second arg very obviously has this leading and trailing space. And that's important to us for what we're going to do later.
If you're curious how the script works, it's quite short. What I do in the script is...
Can I pause you really quickly? You can.
I'm actually having trouble figuring out, since we've got the meat in the middle sandwich we're talking about here, you don't appear to be starting it where it says introducing xargs. You appear to have started somewhere else.
No. The only thing in introducing xargs is the note saying we jumped to here, then I'm on "But first...".
Boy, did you do a pull? Yeah, I did. Let me pull again just in case, but I did and I'm on.
Are you on the branch WIP or are you on the branch main? Oh, hang on. Are there? Hmm.

[4:24] OK, I'm a little confused now. This is going to make for great audio.
I appear to see two spots where PBS152-WIP exists in my little GitKraken tree thing.
Well, you should see one on the local and one on the remote.
Yeah, but I'm saying in GitKraken, it has this nice visual tree that shows you everything.
And there's PBS 152 WIP at the top, where it says fixed typos, which is the one I did on a different computer.

[4:57] But I appear to have checked out a different one. So there's probably push or pull arrows.
Bart, they have the same name. They would do.
Okay, so it says, a local PBS152-WIP already exists, so let me reset local to here and see how this works.
Yes!
That was the problem. Okay, that's kind of interesting. Because it was PBS152-WIP two weeks ago and it's PBS152-WIP this time, it confused things a little bit in GitKraken.

[5:32] Okay, a push or pull probably would have solved it. You may have lost your typo fixes there. Not sure.
No, no, no. I got them because I pushed them from the other computer. No, I'm right. I'm all caught up. I don't know whether that will be useful to the audience, but here we are.
So you are now talking about your script.
Okay, good. So it's not a long script. Most of it is comments, really. But it does one mildly clever thing. So, you know, arrays are indexed from zero, and arguments are indexed from one.
The first argument is $1, the second argument is $2.
So when I shove the arguments into an array, the sequence number is wrong.
So what I did was I put an empty string followed by the arguments into the array.
So now the array has a placeholder of nothing, and then the arguments.
And this is an array we're going to loop through to build the argument list.
Exactly. We're going to loop, not over the array, we're going to loop over the sequence from 1 to the length of the array. We need to print our dollar and we need to print the actual value.

[6:38] Oh, that's right. Okay. So for i in $(seq 1 $#), that's dollar, octothorpe, pound, whatever we're calling it this week.
So that's going to send me the numbers from 1 to the length of the array. So if we have two arguments, it'll send us 1, 2. If we have three, it'll send us 1, 2, 3.
And then we just echo out backslash $, which is print me the actual dollar symbol, $i.
So print me 1, 2, 3, whatever. Then print our arrow, our pipey bit.
Then we print the element of the array, that is $ curly brackets, args array, open square bracket, $i, close square bracket, close curly bracket. That lovely syntax for getting out a particular element in an array. And then we have the reverse of that to close it off.
So the only real magic sauce here is that empty string at the start of args array.
Just, you know, so it's easier.
I was kind of proud of myself when I looked at that. I went, oh, he needs it to be one.
Aha, perfect. I can see it. Okay, good.
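The script described above might look something like the following minimal sketch. It is written to a temporary file and run with bash only so that the example is self-contained. The array name follows the description, while the exact spacing of the arrow markers and the guard for zero arguments are assumptions.

```shell
# Write a stand-in for the arg printer script to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env bash
# Nothing to print without arguments (some versions of seq count down from 1 to 0).
[ "$#" -gt 0 ] || exit 0
# An empty string goes first so that element 1 of the array holds $1.
args_array=("" "$@")
# Loop over the positions 1..$#, not over the array itself.
for i in $(seq 1 "$#"); do
  echo "\$$i ->|${args_array[$i]}|<-"
done
EOF

# Run it with one plain argument and one with leading and trailing spaces.
output=$(bash "$script" 'first arg' ' second arg ')
echo "$output"
rm -f "$script"
```

The second line of output keeps the spaces between the pipes, which is the whole point of the markers.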
So we can see that it works. So that's kind of all there is to it. So now let's move on to why...
So just to say it more explicitly, he's got, he has a command to run this little shell script and he's got first arg and then quote space second arg space unquote.
Yes. And you did that for a reason, right? Which then shows us in the output how well we can see any leading and trailing space.

[8:06] Okay, so those leading trailing spaces inside the quotes are visible, and we learned the fact that this is going to allow us to visualize. Oh, never mind, I'm skipping ahead.
Yeah, we're basically, we're going to use this to visualize how the arguments look, because xargs' job is to manipulate arguments.
So to see what xargs is doing, we're going to use xargs to send things into this script.
And then this script is going to be like a little X-ray machine to show us what's going on.
So, what is the problem to be solved, right? The terminal is this collection of little Lego bricks, and each little Lego brick does one thing and does it well.
So what is xargs' one thing? What is its trick?
What is its one-hit wonder? Its job is to bridge a gap, right?
So we know that we can manipulate standard input and make it go, you know, we can send stuff to standard input using terminal plumbing, as we've colloquially called it.
We can take it from a file and use the arrows, the chevron, to put it into standard in, or we can use a pipe to take the output of one command and make it standard in to another command. So any command that takes its input from the form of standard in, we know how to send stuff to it in a pipeline.

[9:24] But what if the command that needs to come second or third or fourth, basically not first?
You have a pipeline, and there's a command that's not first, and you want the input from before, but the command does not accept standard in.
The command wants arguments.
How do you bridge the gap between, I need arguments and I have standard in?
Well, xargs crosses standard in to arguments. So it crosses the streams, is how I like to think of it.
That's how I remember this job.
A little bit slower for me on that. So, excuse me, standard in can't be arguments?

[10:06] And if not, that's the leap I don't quite catch.
So, when you're running a command, when you write a script, you have standard in. It is a thing provided by the operating system.
And you have $1, $2, $3. They are things provided by the operating system.
They are not the same.
So if you are using your own script, you can write your own script to take stuff from standard in if you needed to.
But what if you're using a command like printf?
printf does not accept standard in. printf needs you...
So you have printf, the first argument is your template, and the second argument is your substitution value, and the third argument is your next value, and your next value.
So... Those can't be standard in?
No, because if you look at the man page for printf, it does not read standard in.

[10:56] Okay, okay, I got you now. So basically, you have a round plug and a square hole.
It's basically the problem. Or no hole at all in that case. Or no hole at all, right?
Or the other thing, of course, is in the case of printf, if you need five values, like let's say that your template has five placeholders, then you need to give it five values.
How do you get from standard in to five different values? Right, right, right.
So xargs' job is all about taking standard in and making it be arguments. And that allows you to bridge that gap so that you can get things that are in standard in over into arguments.
And so it really is just a swap-arounder sort of a command. So the syntax is very straightforward.
The first argument to xargs is the command you want to run. So if we're trying to use printf, then the first argument will be printf. And then any other arguments that you want to hard-code into printf, or whatever, you put those next, and then you stop. And then whatever was in standard in is going to get added after those initial arguments, and then xargs is going to call your command.
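As a small illustration of the five-placeholder case mentioned above, here is a sketch in which the template is the hard-coded initial argument and the five values arrive on standard in. The words and the template are made up for the example.

```shell
# printf cannot read standard in, so xargs turns the five words arriving on
# standard in into five arguments that follow the hard-coded template.
sentence=$(echo "one two three four five" | xargs printf '%s, %s, %s, %s and %s\n')
echo "$sentence"
```

This should print "one, two, three, four and five", exactly as if the five words had been typed after the template by hand.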

[12:15] Okay, so xargs' first argument is the command you're going to run, like printf. It might have some initial arguments that are part of that command itself, and then it's going to reach back and get whatever xargs created and shove it on the other side to be the rest of the arguments.
Yes, and when you say reach back, it's from standard in.

[12:38] So the rest of the args are whatever was in standard-in. Because xargs does take standard-in.
Yes, yes. That's his job in life, right? So it's like a square-to-round adapter, so its input is standard-in and its output is a command with args that it has run. Xargs is a dongle!
It's a dongle. Genuinely, it's a terminal dongle. It really is a terminal dongle.
So, to show you that in action, let's run a command. So we're going to make standard in be a very simple string.
So we're going to say echo, and then inside double quotes, so that we can capture it all nicely, we're going to have stdin underscore arg1 space, single quote, stdin space arg2 single quote, and then end our double quote.
So the string that's going to be in standard in is stdin arg1, space, and then single-quoted stdin space arg2.
So we have two arguments. Okay, and these are just names you created so that we can tell them apart when they're in the output.
That this is the one from standard in argument one and standard in argument two, but you've used an underscore in one case and a space in the other just so we can see that they both work. Yes, and you'll notice that the one with the space I have quoted.

[13:51] Right. And then we pipe that, so echo writes to standard out, the pipe turns it into standard in for xargs.
So xargs' standard in is that string with the two arguments.
Then we have xargs space the name of our argprinter script, space initial underscore arg1, space open a single quote, initial space arg space two, close a single quote.

[14:15] Oops, sorry, that second space should not be after arg2, I changed my mind on that.
So initial space arg2.
Now. Um, so hang on, uh, why did we have to explicitly put those names in at the end?
Well, I'm showing you where the initials are, what's coming.
They're the same name as the strings that you passed. No, initial arg versus stdin arg.

[14:46] Oh, I'm sorry. Okay. So, okay. All right.
Oh, so I'm sorry. I'm sorry. I'm being dense. I just finished making you restate it.
We've got xargs, then a command, the initial args, which in this case you called initial arg1 and initial arg2, and then what's going to get shoved in afterwards is the stuff from standard-in that xargs grabbed, which is the standard-in arg1 and standard-in arg2.
Correct. And so when we run that through our script, what we see is that we get back initial arg1, initial arg2, std in arg1, std in arg2.
So that is the order they come in.
Great, great. Okay. Okay. And that is entirely equivalent to running our script with the arguments initial arg1, initial arg2, std in arg1, std in arg2, right? If you run that command without the xargs, it is a swap-arounder. You can see it is exactly the same as running the script with the four arguments. xargs has just assembled it for us.
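The equivalence can be sketched as follows. To keep the example self-contained, printf with a one-placeholder template stands in for the arg printer script (printf repeats its template once per argument), so the pipes around each value are this sketch's markers, not the script's real output.

```shell
# Arguments typed after the command come first; arguments read from
# standard in are appended after them.
via_xargs=$(echo "stdin_arg1 'stdin arg2'" | xargs printf '|%s|\n' initial_arg1 'initial arg2')
# The same call written out by hand, without xargs.
by_hand=$(printf '|%s|\n' initial_arg1 'initial arg2' stdin_arg1 'stdin arg2')
echo "$via_xargs"
```

Both variables end up holding the same four lines, with the quoted values kept together as single arguments.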

[15:46] Yeah, yeah. Okay. So we can see that xargs basically treats standard in as if you had typed it on the command line. So it breaks the arguments apart on spaces, and it respects quotes to group things together.
So if I didn't quote InitialArg2, then I would have lost the...
That it would have come in as an extra argument, right? Because it would have broken on the space.
But by quoting it, it does just like it does in the terminal.
Basically, I'm saying it's like the terminal, right?
Your quotes stop the spaces breaking things.

[16:24] Right, right. OK. Overcomplicating it by saying it too often.
But it's slightly cleverer than the terminal because a lot of terminal commands break stuff apart with tabs and/or with new line characters.
So as well as obeying the normal terminal rules of one or more spaces separates my argument, it also says, yeah, a tab, sure, that's a separator. New line character? Yeah, fine, I'll separate on that too. And that means that commands like ls, right, if you do an ls slash by default, ls slash just show me, list me the content of my root directory, you'll see that it gives you a multi-column view, you know, application, space, users, actually, it's applications, tab, users, tab, core, home, sbin, var, right? They're tabbed across on one line, and then there's a new line character, and then there's another set of tab, and then another new line character, depending on how many you have. So if you run that through xargs and pipe it into our arg printer, you will see that it has successfully separated all of those. So applications, library, system, users, volumes, bin, they're all properly separated because the tabs and the new line characters were like, sure, no problem to me, I'm Mr. xargs, I'm all smart. And so it breaks them all apart nicely.
And the command again, for those listening, is ls space slash, which is what he did before, and then he pipes that into xargs and then sends that to the shell script.
Yeah.
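Because the contents of a root directory differ from machine to machine, this sketch simulates tab-and-newline-separated output instead of running ls on slash, and again uses printf as a stand-in for the arg printer script.

```shell
# Two lines of input, each holding two tab-separated names.
separated=$(printf 'Applications\tUsers\nbin\tvar\n' | xargs printf '|%s|\n')
# Tabs and new lines are both treated as separators, so four arguments come out.
echo "$separated"
```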

[17:53] And so xargs has transformed...
So that's good, it respects tabs and spaces and new lines.
That's repeatable, at least, to know that.
And useful, because a lot of terminal commands spit stuff out, separated by those characters. So it's generally very useful behavior.
These things have been around since the 70s.
They've had time to become useful, and that is certainly an example.

[18:15] But, but, but, but, but. When xargs was new, no one put spaces in file names, because it broke things all over the place.
And so, as long as you don't have any spaces in your filenames, xargs behaves really well when combined with ls.
That's not true here in the 2000s where we have spaces all over the place, because we mostly make our files in the GUI.
And in the GUI it's so much nicer to call it my space document than it is to start camel-casing and dashing and underscoring.
So a lot of our files have spaces, which xargs will promptly see as a separator.
So I have made a dummy file inside the zip for this installment called name with spaces.txt.
And so if we do the ls command on the folder for the installment, and then we run that through xargs into our printer script, we will see that we get back five arguments.
Name with spaces.txt, then the script, then the sample solution, then the script we're using to print our args.
There should only be three.

[19:27] Name with spaces should be one, not split over three. Sugar.
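The problem can be reproduced in a scratch folder. The two script names below are hypothetical stand-ins for the files in the installment's zip, and printf again plays the part of the arg printer.

```shell
# Hypothetical file names; only the first one matters for the demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir/name with spaces.txt" "$demo_dir/printer.sh" "$demo_dir/solution.sh"
# Three files go in, but xargs splits the first name on its spaces.
ls "$demo_dir" | xargs printf '|%s|\n'
# tr strips the padding that some versions of wc put around the number.
arg_count=$(ls "$demo_dir" | xargs printf '|%s|\n' | wc -l | tr -d ' ')
echo "arguments seen: $arg_count"
rm -rf "$demo_dir"
```

Three files produce five arguments, because the one name with spaces is broken into three pieces.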
How do we deal with this? Now you might think that logically...
The xargs command would have a flag to say, I would like you to use tab as a separator, or I would like you to use new line as a separator. And that's half true. That's true if you're on a modern operating system that's not the Mac. Because Apple...
Wait, wait, wait. You can't call Apple not a...
Oh no, Apple is modern. The problem is that it works on every other modern operating system, except for the Mac. If you're on something old, all bets are off. But if you're on something new, all bets are on, unless you're on a Mac, in which case you are on a modern operating system, but no, you still can't play. So if you want to be cross-platform, I'm just not going to tell you about the stuff that isn't cross-platform. And so we're going to have to figure out a different solution that is genuinely cross-platform. And the good news is there's not just a way to do this. There's in fact many ways to do this, which is very common on the terminal. I'm going to share two ways that work. I know for a fact there are more, because when I googled it there were all sorts of other answers coming up on Stack Overflow. It's just they looked more confusing and I didn't fancy explaining them. So I went with what I think are the two nicest solutions.

[20:48] So the first thing we should know is that we do get a tiny bit of control over the separator on every version of xargs. So by default it's spaces, tabs, and new lines. So basically blank things. That's what the default is. If it looks blank, it's a separator. There is a minus zero, and it's not a minus capital O or lowercase o, it's a minus the digit zero. And that says the separator shall be the null character. This is a character that exists in ASCII and everywhere. That is the character that means no character. It's a bit like zero in the number system, right? Zero is a number that means no number. The null character is a character that means no character. If you're programming in C, that's how you end a string. You put the null character in and that tells C to stop reading memory at the end of the string. So it is used in a lot of places. It's used at the end of files and some file systems as well.
So the null character, we can't type it because our keyboard doesn't have a key for null character.
But we do have backslash zero, which is the escape code for the null character. So if you want to talk about the null character, it's backslash zero inside interpolated, i.e. double, quotes.
So, like, backslash n is a newline character, backslash t is a tab, backslash zero is the null character.

[22:08] So we do have access to it. So if we say minus zero, then xargs will treat everything that's not the null character as part of one argument until it meets a null character, and then it will start the next argument, until it meets a null character, and then it will start the next argument.
So if you have things with spaces, no problem!
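A quick sketch of the minus zero behaviour, using printf's backslash zero escape to write the null characters and printf again as a stand-in for the arg printer. The two items are made up for the example.

```shell
# Each item ends with a null character; the spaces are part of the items.
null_separated=$(printf 'first item\0 second item \0' | xargs -0 printf '|%s|\n')
echo "$null_separated"
```

With minus zero, the spaces inside and around the items survive untouched, so exactly two arguments come out.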

[22:26] The question is, how do you make ls spit out null characters?
The answer is, I'm afraid to say you don't, because ls doesn't do that.
But if you read the man page for xargs, it actually says that xargs is designed to work with find.
And it actually, it literally, there's a paragraph in the man page that says that the minus zero flag in xargs is designed to work with the minus print zero flag in find. So they're like a little matching pair. They're friends with each other.
The find command is something we've covered in Taming the Terminal.
It is a command for finding files, and by default, it finds every file.
So if you say find dot, which is the current directory, it will find you every file in the current directory.
That's kind of like ls, really, isn't it? Right. So we've sort of gotten there.
So the easiest thing to do is to say find space dot minus print zero, which means find is now going to output the answers to what's in the current directory with these null characters between every file it finds.
We pipe that to xargs minus zero.
So xargs is now looking for that null character.
And then we send that off to our pretty printing script.

[23:38] And it successfully prints out our files in the format find puts them in.
So it tells us that our folder contains dot, which is the current directory, because we told it to search dot.
And it prefixes everything with the path, so it's dot slash whatever.
But I mean, that's not the end of the world, right? That they are valid file paths.
That's pretty good. Yeah. Sure. So I'm going to count that as...
And maybe even more information.
Depending on what you're doing, it might actually be better.
So I'm going to count that as a viable solution.
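The find and xargs pairing can be tried in a scratch folder. The file names other than the one with spaces are hypothetical, and printf stands in for the arg printer script.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/name with spaces.txt" "$demo_dir/printer.sh" "$demo_dir/solution.sh"
# find ends every path with a null character; xargs -0 splits only on those.
found=$(cd "$demo_dir" && find . -print0 | xargs -0 printf '|%s|\n')
echo "$found"
rm -rf "$demo_dir"
```

The output should contain four entries, the dot for the folder itself plus one dot-slash path per file, in whatever order find happens to return them.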
There is another... By the way, we have a link in the show notes to Taming the Terminal installment 20, where Bart does start explaining all about the find command.
Oh, I'm so happy you noticed I forgot to replace my triple X with the actual value.
No, I figure whoever finds it first should just do it. Thank you. I thought I'd done them all, but obviously I missed one.
OK, so that is a solution.
Use the find command and the find command is actually really powerful.
So if you wanted to do something like count how many lines of JavaScript I've written, you could use the find command to find every file name that ends in .js and point it at your home directory or maybe point it at your git directory where you check out your git repos, and then shove that into xargs, and then into wc -l, and then it will count only the lines in your JavaScript files.

[25:00] Interesting. That's as long as you're not using Git to look at other people's repos. Sure.
So I will now take credit for everything anyone has written in the PBS challenges for the students of Programming by Stealth.
I guess a more realistic thing would be, you know, you're working on a project and your boss says, so how big is this project? And I've had managers ask me these kind of questions.
I mean, if you were to just have to rewrite this in a month, how much work would that be?
Quick, quick, quick, you know, find dot minus name, and in that case it was Perl files, you know, dot pl, wc minus l. That's 20,000 lines of code, boss. Maybe two months.
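A line count along those lines might be sketched like this, using a scratch folder with made-up files. Passing the files through cat before wc is this sketch's choice, so that a single total comes out; handing the files straight to wc would print one count per file plus a total.

```shell
demo_dir=$(mktemp -d)
printf 'let a = 1;\nlet b = 2;\n' > "$demo_dir/first script.js"
printf 'console.log("hi");\n' > "$demo_dir/second.js"
printf 'not code\n' > "$demo_dir/notes.txt"
# Only names ending in .js are found; the null separators keep the name with a space intact.
js_lines=$(find "$demo_dir" -name '*.js' -print0 | xargs -0 cat | wc -l | tr -d ' ')
echo "JavaScript lines: $js_lines"
rm -rf "$demo_dir"
```

Here the two JavaScript files hold three lines between them, and the text file is ignored.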

[25:42] Anyway, I've been there, done that. So the find command, you can also use find, find can do things like everything I've modified in the last hour, all these kind of things.
So the find command is actually a really good way to gather file paths.
So the fact that you could use the find command with the minus print zero and then shove it into xargs minus zero is genuinely very, very useful.
So that it's really nice that they work together. So that's one solution.
I'm going to count that one as tick.
But there is another more general approach you can take. So a lot of terminal commands, even if they don't support this null character nonsense, will quite happily print things out one line at a time.
So you're basically printing it out on a line per item. Which then transforms our problem.
What we need to do then is just change every new line character to be a null character.
Well, if the command itself can't do that transformation...

[26:38] The terminal is made of Lego bricks that do one thing and do it well.
Is there a brick that translates one character to another? Well there is.
We've mentioned it in passing a few times. The tr command, for transliterate.
The tr command takes two arguments, a character, and another character, and it takes standard in, finds every occurrence of the first character, replaces it with the second character, and pops it out to standard out.
So it's basically, it takes input, so we're going to say tr backslash n backslash 0.
So it's going to take standard in, and every new line character becomes a null character, and then it shoves it out to standard out.
So you can just stick that into the middle of your pipeline, just between the ls and the xargs -0.

[27:22] And now xargs -0 is getting null characters.
So wait a minute, so we're replacing newline characters with backslash zero, which is our way of telling it we mean a null character, but we had spaces, those weren't newlines.
Ah, yes, we have one more little trick to perform here.
So we're almost there.
So if we do that by default, remember that ls was showing us a few columns, and then a newline character, and a few columns, and then a newline character.
So if we did this trick now, we would get the first argument would be 3 or 4 files, and then the second argument would be 3 or 4 files, and the third argument would be 3 or 4 files. So we need to tell ls to be a little bit cleverer.
Now you know that if you do an ls -l, it'll give you a long listing, and it'll show you one line per file, but a whole bunch of glop that you don't want.
Does ls allow you to go one item per line without the glop?
Yes, it does. ls -1 will print you out the listing one item per line, with no glop after it.
So if you take ls -1, put that into tr, and that into xargs, then you get the perfect outcome, which is basically name with spaces.txt as the first argument, the PBS 151 challenge solution as the second argument and our arg printer as the third argument.

[28:44] Oh, because we've told it xargs minus zero, which is to only look for new line characters, don't look for spaces.
No, so the minus zero means look for the null characters, right? But remember that... Okay, let's start at the start of the pipe. Let's not start in the middle of the pipe.
Hang on, minus zero I thought meant look for the null characters.
That is exactly correct, but you said new line a second ago, so I corrected you.
Oh, okay. So xargs minus zero is saying only look for the null characters, don't look for spaces.
Correct. Only look for new line. I'm sorry, I did it again. Only look for the null character. But in this case, we've transliterated, we've replaced the new line characters with the null character.
Yes, we have. So when it looks for the null character, the new lines are the only thing it separates on, nothing else.
Precisely. So ls -1 dot into tr backslash n backslash 0 into xargs minus 0 and then over to our printer and we get the perfect output.
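The whole pipeline can be tried in a scratch folder. As before, the two script names are hypothetical and printf stands in for the arg printer script.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/name with spaces.txt" "$demo_dir/printer.sh" "$demo_dir/solution.sh"
# One name per line from ls, new lines turned into null characters by tr,
# and xargs -0 splitting only on those null characters.
listed=$(ls -1 "$demo_dir" | tr '\n' '\0' | xargs -0 printf '|%s|\n')
echo "$listed"
rm -rf "$demo_dir"
```

This time three files give three arguments, with the spaces in the first name left alone.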

[29:49] So I actually had to look up the word transliterate, even though you said you've already taught it to us before. And it's such a fascinating word that needs to exist. Write or print a letter or word using the closest corresponding letters of a different alphabet or script.
So names of one language are often transliterated into another. I thought that was fascinating that, boy, we have a complex world that we had to come up with that word.
We do, and actually the tr command is slightly cleverer than what I've just told you. So I've given you one character to one character. If you put ABC as the first argument and 123 as the second argument, then every A will become a 1, every B will become a 2, every C will become a 3.
Oh, so we could make leetspeak. We could make leetspeak out of regular language.
We could also very easily do a Caesar shift.

[30:42] Which is an early coding, or yeah. Yeah, so basically you say that I'm going to slide the alphabet by X. And then so an A moves five forward, a B moves five forward. So with tr you could just do that as a simple command. So A, B, C, D, E becomes, you know, F, G, H, whatever, and loop it around at the end. Yeah, so the tr command is actually quite powerful. Yeah, very useful. So we have two perfectly good ways of getting our stuff to xargs when it has spaces in it. And I promise you there are more. A quick Google will give you many more. But these are simple to understand and they work cross-platform, which is my big thing.
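Both tr tricks mentioned above can be sketched in a few lines. The sample words are made up, and the shift of five uses lowercase letters only.

```shell
# Several characters at once: every a becomes 1, every b becomes 2, every c becomes 3.
digits=$(echo "a cab" | tr 'abc' '123')
# A Caesar shift of five: a..z maps onto f..z followed by a..e.
shifted=$(echo "hello" | tr 'a-z' 'f-za-e')
# Swapping the two sets undoes the shift.
restored=$(echo "$shifted" | tr 'f-za-e' 'a-z')
echo "$digits / $shifted / $restored"
```

This should print "1 312 / mjqqt / hello".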
So now let's loop back to that poor pin that's been sitting in our sample code for two weeks at this stage. So we were working on printing out our very pretty multiplication table and we needed to get the lengths of things so that we could make our tables be nicely padded, so that no line was longer than the other. So we needed to work out lengths.

[31:47] Particularly, I needed to get the length... there was all sorts of length ones I needed.
But the one I told you to put a pin in was PLEN, which is the length of the product, is equal to...

[31:58] And then we open up a dollar sign and we wrap our entire command in the dollar sign, because we want the answer from this giant big command to go into the variable PLEN.
Right, so the question really for us is, what's in the giant big command?
Well we start off with an echo, where we say echo and then we say $n which is a number, star symbol, $m which is a number, and we pipe that to the basic calculator bc.
So that's going to be some mathematics, right, so 4, star, 11, or whatever.
That's going to go to bc. And bc gets piped to xargs, printf, and then the string, percent, single quote, d.
So percent is start a placeholder. The single quote is, I would like you to use commas to separate thousands.
Remember, the anti-gravity comma is how I remember it.
It's like a comma that's stuck to the ceiling. And d is for I want it as digits.
So no decimal places, right? Just a whole number.
So whatever the output of the math is, is now going to printf where the first argument is this pattern we want, and the second argument printf needs is what is going to be in %d.

[33:12] Which of course has to be the output of the calculation. So therefore xargs is taking its standard in, which is going to be a number like 42 or something, and it's putting it as the second argument to printf. So printf is going to print out 42, nicely formatted.

[33:30] So without doing this, since printf can't take standard in, you would have to have stored everything we just did into a variable and then made that the argument to printf, the second argument to printf.
Correct. That's it, exactly. And then we want to say, well, how long is this?
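The two routes can be put side by side in a short sketch. A fixed 42 stands in for the output of the calculation, and the thousands-separator flag is left out of the template here so that the output does not depend on the locale.

```shell
# With xargs: the number arrives on standard in and becomes printf's second argument.
with_xargs=$(echo 42 | xargs printf '%d')
# Without xargs: capture the number in a variable first, then pass it as an argument.
number=$(echo 42)
without_xargs=$(printf '%d' "$number")
echo "$with_xargs $without_xargs"
```

Both routes print 42. The xargs version simply avoids the intermediate variable, which is what lets it sit in the middle of a pipeline.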
So then we need to pipe this to wc minus c, which I'm telling you is a character count, and that is true if you stick to old-fashioned-y stuff. I think we had a little discussion on the PBS Slack, which is an amazing place, podfee.com forward slash slack.
Yes, I noticed, I asked the question, why does Bart use wc -c? Because dash c is not the character count.
It's the character count in the old-fashioned sense where a character was eight bits.
So if you stick with... Well, in this case, we're counting digits and things, so it's definitely right here.
But if you put an emoji in there, it will break horribly. If you put some sort of a strange Cyrillic character or some Hebrew character, it will break terribly because they're actually two characters.
Diacritics will break. But for the case of our math, it's fine.
And in my brain, I don't understand why character count being eight bits, why is that not eight?

[34:52] I mean, you have to know that a character is eight bits and care...
Like why do we care that it's eight bits?
Why not just use dash m, which is what word count was actually the...
Well, it's morpheme count is where dash m comes from. Because minus c for character works in my brain and minus m does not.
Falls out.

[35:13] Okay, so if you would like to be correct and precise, and have stuff not break, use "-m", instead of "-c".
Yes, which stands for a morpheme. So if you have UTF-8 or UTF-16, and you have like an A with that funny squiggle the French put under things, that's technically two code points. Well, no, the cedilla goes underneath, and that's a C. Oops. Okay, a C with the funny squiggle, the C cedilla. That's actually two different characters in UTF. It's the C and the diacritic. And together they form a single morpheme, which is a thing you as a human care about. And so the minus M counts the morphemes, which I just can't remember.
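A small sketch of the difference, assuming a UTF-8 terminal. The character used is a precomposed é, written as its two UTF-8 bytes so the example does not depend on how this text is encoded. The variable names are only for illustration.

```shell
# -c counts bytes: é is two bytes in UTF-8
bytes=$(printf '\303\251' | wc -c | xargs)

# -m counts characters according to the current locale,
# so this result depends on your LANG / LC_ALL settings
chars=$(printf '\303\251' | wc -m | xargs)

echo "bytes: $bytes"
echo "chars: $chars"
```

For plain digits like the ones in the multiplication table, both flags give the same answer.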
Are you okay if I put a little aside right in there or something? Sure.
If you want to... Okay. Yeah, absolutely. Yeah, great.
I was kind of proud of myself for finding that, by the way. I read the man pages. Did you notice that? Ooh. Look at me go, girl.
So at this stage, we've counted the characters in our pretty printed number.
And then we pipeline it to XARGs one more time, and we don't give it any more values than that.
We just pipe it to XARGs and let it dangle.

[36:23] What's going on there? Well, let's break this down piece by piece.
So in order to play along here, let's first just declare that n is going to be 3 and m is going to be 14.
So if you shove those two into your bash, just n equals 3 semicolon m equals 14, then you now have the numbers.
So if you then just echo out our string and pipe it to bc, you're going to see that it outputs 42.
So when I said that after we do the bc, the value of standard out is 42, you can see that's perfect.
So you could then save that to a variable called prod, and say prod becomes equal to, dollar sign open round bracket echo our string to bc and then printf our format string percent upside down comma d $prod and it will correctly print out 42.
Okay.

[37:22] No, it won't. Yes, it will, because... Sorry. Yes, it will, because we're not counting them yet.
We're just pretty printing it, and the pretty print of 42 is indeed 42.
Okay.
So, and then we should... Okay, so, yeah. Right. Everything works absolutely fine as we build up this pipeline piece by piece by piece, until we do the wc minus c.
And what the wc minus c outputs is space, space, space, space, space...
I didn't count the spaces. I think it might be eight of them. It outputs a whole line of spaces.

[37:55] And then it outputs the count which is two. So there are only two characters. And it does that for all sorts of weird reasons that I don't agree with but I'm not going to go into. And if you try to do math with space space space space space space space two, if you try to do a plus or something with that in Bash, Bash goes, oh that's a string, so I'll concatenate it.
So when you try to do math with space space space space 2, say, you know, add 1 or something, you're going to end up with the string space space space space 2 1, not 3. So those leading spaces are really causing us trouble. And so this is where XARGs comes into play for two reasons.
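To see the padding for yourself, a quick sketch is to wrap the output in square brackets. The variable name raw is only for illustration. The padding is what macOS prints; GNU wc on Linux typically prints the bare number when you ask for a single count, so your output may differ.

```shell
# count the characters in 42 and keep whatever wc prints, padding included
raw=$(printf '%s' 42 | wc -c)

# the brackets make any leading spaces visible
echo "[$raw]"
```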
So, I haven't been massively explicit about it here, but I have intentionally said it.
The separator on the terminal is not a space, it is one or more spaces.
So if you type into our little arg printer 1 2 3 4 5, and then you look at the output, you will see that all of those leading and trailing spaces are gone.

[39:17] Okay. Why is that? Because that is how the terminal works. The rule is not break on space, it's break on one or more spaces.
So the actual separator is one or more.
So 10 spaces counts as one separator.
So by just its existence, processing args trims things, right?
It trims leading and trailing space.
So that's a very useful side effect if you're going to run stuff through XARGs.
It will trim your various arguments.
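A sketch of the one-or-more-spaces rule. The function count_args is a hypothetical stand-in for the arg printer script, not the script itself.

```shell
# hypothetical stand-in for the arg printer: report how many arguments arrived
count_args() {
  echo "$# arguments"
  for arg in "$@"; do
    echo "[$arg]"
  done
}

# runs of spaces count as a single separator, so this is three arguments
count_args    1     2  3
```

The brackets in the output show that none of the extra spaces survive into the arguments.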

[39:53] Now why is the XARGs... Let me try saying this again. Okay.
Hang on, let me try saying this again. So XARGs is going to work on one or more spaces.
So no matter how many spaces you put in there, it's going to break as though that was...
It just says that whole chunk, it's just gone.
The whole chunk is gone. So like when you separate on commas, the comma's gone from your output, when you separate on one or more spaces, all the spaces are gone.
Okay. That's kinda cool. It is kinda cool. And that's why you'll see xargs used just to trim things. Because it just does it out of the box.
But now, hang on a sec Bart. You've done something else here. You've used xargs with zero arguments.
There's no command. It's just xargs.
Argument should be the command. Well, when you read the man page, if you don't give xargs a command, it assumes you meant echo. So in other words, it'll just print it out to standard out.
So it'll take whatever you gave it, remove all the spaces you didn't want, and then put it to standard out.
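A minimal sketch showing that a bare xargs behaves like xargs echo. The variable names implicit and explicit are only for illustration.

```shell
# no command given, so xargs falls back to echo
implicit=$(echo "   hello    world   " | xargs)

# the same thing with the command spelled out
explicit=$(echo "   hello    world   " | xargs echo)

echo "[$implicit]"
echo "[$explicit]"
```

Both lines come out as [hello world], with the leading, trailing and doubled-up spaces gone.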

[40:59] That's handy. So we have a whole bunch of leading spaces we don't want.
So if you just echo space space space 4 space space space and pipe it to xargs, out will come a perfect 4 with no leading or trailing spaces.
So our wc -c just pipe it to xargs and all of that glop is gone. And at the end of it all $PLEN has exactly the value we want of 2.
Because 3 times 14 is 42 which when counted is 2 characters and it just comes out 2.
That's it. Interesting.
So the two most common reasons to use XARGs are to take standard-in and make it be arguments. Or to just trim standard-in. Just get the leading and trailing glop off it and just give me the nice simple bare value.

[41:51] Even if you never use XARGs for the other cool stuff it can do, that's a pretty useful trick right there.
It really is. And a lot of the times when I see XARGs in Stack Overflow, it's at the very end of a pipeline, just to clean up.
It's just sitting there at the end. And whatever else you got left, just clean it up.
Yeah, just clean it up. So pipe to XARGs is very often the last thing in a pipeline.
So that is XARGs.
So let me ask you a question. How much of this did you know before you taught it?
I knew that it did what it does, but not enough to explain it.
That was a softball question because Bart sent me a note. He said, I was going to hand wavy over part of this and just really hope you didn't ask the wrong question. I would have suggested that I didn't ask the right question.
Agreed. You know the way we talk about bad smells in software engineering? There's a bad smell in show note writing. When I find myself going, I hope Alison doesn't say, Bart, what if the file name has spaces? Because the first version of the show notes, I didn't answer that question because I wasn't sure about the, I knew that it was messy and I knew that the easy answer didn't work on a Mac. What I didn't know was what the best answer for the Mac was. I just knew that the internet was full of people saying things, and people saying things is sometimes right.
So I spent this morning with a large cup of coffee, and I now understand.

[43:21] So did it tickle your brain going, I really want to know how that works?
Yes, it did. Of course it did. And so now I'm going to be better at writing scripts that work everywhere. Because at the moment, I'd sort of been getting by with the fact that with my work hat on, I'm a Red Hat Linux person. So the fact that it only works in GNU xargs didn't bother me very much, but I'm just going to start doing things the cross-platform way now so that my scripts are much more universal.

[43:46] I like it, I like it. Now, the one trick I've learned from Tom Merritt when he does Know A Little More, he's teaching something pretty complicated usually, and he's trying to boil it down, and he always starts with, so if you really understand this topic, you will know that I am oversimplifying this, and I don't want to hear, well, actually. So I used it this week when I did a, I will be publishing a thing about how to use voiceover to test applications on the Mac.
And the first thing I say is, so if you really know how to use voiceover, I understand that I am oversimplifying this and I don't want to hear, well actually, you should tell me if I was wrong, but don't tell me that I forgot something, that I skipped something, there's something not in here because I realize that.
And I also have a section where I go, so I don't really know whether you're supposed to do A or B And I just bang away at both of them until one of them works.
And I suspect that there's a better way to do that.
That was my hand wavy part. That sounds familiar.
But no, there is a difference between incomplete and wrong.
You can be correct, but incomplete.
And if you're explaining things, that's actually often the right thing to do.
I go out of my way not to be wrong, but I also go out of my way to be incomplete because otherwise, everyone's head will explode and we will achieve nothing.
I will just have recreated the man page with all of the simplicity that goes with a man page.

[45:12] Right. Well, like you could have told us all the different ways to deal with these spaces, but giving us two really good ways is a lot more useful than telling us everything.
Yeah. Yeah, exactly. And I just have this image in my head when I'm writing show notes of how cranky will this make Allison if I go into this detail.
Right as you were saying that, I was thinking, oh my gosh, I can't imagine how cranky I would have been last time if you had tried to explain this in the middle of that.
Yeah. To be honest, if you had said, how do spaces work, I would have said, I'll get back to you next week. I wouldn't have tried to hand wave, but I would have just said, I'll get back to you on that because I have learned. And that's something I learned with my work hat on, right? If you don't know something, the right thing to do is to say you don't know it yet. And there's no shame in saying, I don't know that yet. It's different to say to your boss, I don't know and I don't know how to find out. That's not a good answer.
I don't know, I'll get back to you is a really good answer. And it doesn't lose your respect with people. It gains your respect with people because then when you do say something, they'll believe you.

[46:13] Because they know that if you don't know, you won't spoof them.
Yeah, you won't spoof them. Right, right.
I used to say the same thing about salespeople in the technical field, where my favorite salespeople for selling me a software product or a hardware product would be somebody who knew everything, knew the answer to all of my technical questions.
But my second favorite were the ones who were, I don't know, I'm going to ask somebody smarter than me and I will get back to you.
Yeah.
That was always better than the ones, because there was the third category where they were just making things up.
And it's not really what you want.
Yeah, I don't like being on sales calls with salespeople. I want to be on a sales call with the technical people.

[46:47] And you know you are when they say things like... I'll go as far as technical marketing.
Yeah, and sometimes that means two humans. They're sometimes OK.
Yeah, sometimes it means two humans, because they at least keep each other in check.
But it's always good when they say things like, yeah, this is not as good as it could be.
Then my brain goes, OK, I can believe the rest of this presentation because they are aware of the failures of this product.
And then it's much easier to make a case.
Anyway, I feel we may be going slightly off topic, but anyway, still interesting, I suppose.
But I still enjoyed it. And you already gave a plug for podfeet.com slash slack, the PBS channel, lots of fun over there.
And I want to give a shout out to Ed Howland, who is going through and taking all of Bart's examples and making sure those are published in the student group that we have created on GitHub.
If you don't know what that is, send me an email at alison at podfeet.com and we'll loop you in where you can play along with everybody else, seeing everybody's homework on GitHub.
And speaking of homework, just a reminder that you do have a challenge from last time, and I can now expand the challenge a little. So the challenge is, last time I had said to rewrite your solution for the multiplication table to use bash arithmetic instead of the bc command.

[47:55] Well, now I'm going to tell you, are there places where xargs could simplify your code?
Maybe take two or three lines of code and collapse them into one where you might have had to use variables and now you can just do it all as one pipeline. So try and make your code nicer with both arithmetic and xargs.

[48:11] I do like that I got to a certain point in my homework and I stopped because I got to the point where the example I had to follow had xargs in it and I said, I don't know what that is. So I waited. So this will be fun. It's a fun little challenge. I like it.
Excellent. Okay. Well, until next time, whenever that shall be, remember folks, happy computing.
If you learn as much from Bart each week as I do, I'd like you to go over to lets-talk.ie and press one of the buttons over there to help support him.
He does 98% of the work here, I'm just the stooge that listens to him and asks the dumb questions.
If you go over to lets-talk.ie, you can support him on Patreon, you can donate via PayPal, or you can use one of his referral links.
I really hope you'll go over and help him out.
In the meantime, you can contact me at Podfeet, or check out all of the shows we do over there over at podfeet.com. Thanks for listening, and stay subscribed.

[49:07] Music.

CCATP_2023_07_08

An audio podcast where Bart Busschots is teaching the audience to program. Associated tutorial shownotes are available at https://pbs.bartificer.net.

Transcript