Update: when I first posted this, I got a lot of feedback pointing out that the xargs command does not work the way I outlined here. I apologize for spreading misinformation. I was passing on how the xargs command had been explained to me without double-checking the documentation. It shan't happen again.
Also, my descriptions of some of the xargs options were not clear enough to convey the examples I was trying to illustrate. So this post has been edited a good deal and re-posted. If you are interested in seeing the diff between the original and this post, you can grab the diff on GitHub.
Okay, on to the post.
Familiarity with the xargs command is what separates basic shell users from those with serious "fu". xargs is a very useful utility, even if you have only the most basic knowledge. In fact, knowing just a tiny bit of grep, awk, and xargs can really save a ton of typing, especially of the "oh what the heck, I'll just do it by hand" variety.
Like many of the most powerful shell utilities, xargs has a ton of different options, but if you know just the basics, that is usually good enough for most of your day-to-day needs.
In a nutshell, xargs is an alternative to fully expanded shell for loops. Given the right options, xargs can be made to exhibit for-loop-like behavior. But out of the box, it is a way to pipe a bunch of arguments to a single command in one fell swoop.
This can save a great deal of compute time if you are piping a long list of arguments to a command rather than iterating over them one at a time.
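To make the difference concrete, here is a minimal sketch that compares a loop with xargs. The three sample items and the variable names are my own placeholders, and echo stands in for a real command so you can see how many times it gets invoked.

```shell
# Three sample items, one per line (stand-ins for real arguments).
items=$(printf '%s\n' one two three)

# A loop: the command runs once per item.
loop_out=$(printf '%s\n' "$items" | while read -r item; do echo "cmd $item"; done)

# Plain xargs: the command runs once, with all items as arguments.
batch_out=$(printf '%s\n' "$items" | xargs echo cmd)

# xargs -n 1: loop-like behavior, one invocation per item.
single_out=$(printf '%s\n' "$items" | xargs -n 1 echo cmd)

printf 'loop:\n%s\nbatch:\n%s\n-n 1:\n%s\n' "$loop_out" "$batch_out" "$single_out"
# loop and -n 1 each print three lines ("cmd one", "cmd two", "cmd three");
# batch prints a single line: "cmd one two three"
```

The single invocation in the middle case is where the savings come from: one process is started instead of three.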
But first a warning. The example I am about to walk through is not a real-world use case. Rather, it is intended to help someone unfamiliar with xargs visualize where and how the command can be useful. There are several "dangerous" things about the example I will address after I first walk through it.
So please bear in mind as you walk through the example that it is just that - an example. My rationale for doing this is that xargs is a hard command to understand initially, so an overly contrived example is preferable to a "correct" one. Your mileage may vary.
Feel free to follow along with the example below. It is a bit silly but will illustrate common use.
mkdir -p /tmp/example
cd /tmp/example
echo Apple > apple.txt
echo Nutella > nutella.txt
echo Banana > banana.txt
touch junk{1,2,3,4,5}.txt
Here we have a directory with some text files in it and a few junk files. Imagine that we need to get rid of just the junk files. (This is a contrived example, but it is easy to imagine a more complicated scenario.)
Note: using xargs here is, of course, way overkill. You could run a simple rm and be done with it.
Looking at the files in our example directory, it is clear that we need to remove all the "junk" files. Here is how you could use xargs to get rid of just the junk files:
ls | grep junk | xargs rm
If you are following along and run ls again, you should see no more junk files. Let us walk through each step of the pipeline.
The command we ran was:
ls | grep junk | xargs rm
Let us understand each part.
First, we run ls, which gives us output like this:
apple.txt junk1.txt junk3.txt junk5.txt
banana.txt junk2.txt junk4.txt nutella.txt
Easy enough: it is just a list of files, laid out in columns when printed to a terminal. When the output goes into a pipe instead, ls prints one name per line.
The next piece of the pipeline is grep junk. If we run just ls | grep junk, we get output like this:
junk1.txt
junk2.txt
junk3.txt
junk4.txt
junk5.txt
We have narrowed the list of files down to just what we want to get rid of, the junk files.
Lastly, we pipe this list of files to "xargs rm", which removes the files.
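What xargs is doing is reading the whitespace-separated items from its input and appending them, as a list of arguments, to the command we supplied. It does not run the command once per line; it packs as many arguments as it can into a single invocation. So our output above would expand to: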
rm junk1.txt junk2.txt junk3.txt junk4.txt junk5.txt
That is really all there is to it - but it is very powerful when you need it.
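A non-destructive way to see the command xargs builds is to put echo in front of it. The sketch below assumes a throwaway directory created with mktemp (the variable names are mine, not part of the original example) so that nothing real gets touched.

```shell
# Work in a scratch directory so nothing real gets deleted.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch apple.txt junk1.txt junk2.txt junk3.txt junk4.txt junk5.txt

# Prefix the command with echo: xargs builds the same argument list,
# but prints the rm command instead of running it.
preview=$(ls | grep junk | xargs echo rm)
echo "$preview"
# prints: rm junk1.txt junk2.txt junk3.txt junk4.txt junk5.txt
```

Once the printed command looks right, drop the echo to run it for real.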
If you would like to verify for yourself, try running the full pipeline, but passing the -p option to xargs. This will put you in "prompt" mode and xargs will ask you to verify that you really want to run each command before it runs it. (Hit "y" to run the command).
While xargs is powerful even when used only as shown above, it does have some limitations (some common to most text processing tools).
For example, xargs can exhibit strange behavior if you have junk characters in your data or if you have strange line endings. This is particularly relevant to our example.
In particular, the "ls | grep junk" line is problematic. xargs splits its input on whitespace by default, and filenames may contain spaces, quotes, or newlines, which will throw off the final command that xargs builds.
For more discussion of this issue, I recommend you look at the separator problem.
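Here is a small sketch of the problem in action. The filename is a made-up example, and echo is used in place of rm so you can see exactly which arguments xargs hands over.

```shell
# A single filename that contains a space.
name='junk file.txt'

# xargs splits on the space, so the one name becomes two arguments.
split_out=$(printf '%s\n' "$name" | xargs -n 1 echo arg:)
echo "$split_out"
# prints:
# arg: junk
# arg: file.txt
```

With rm in place of echo, this would try to delete "junk" and "file.txt", neither of which is the file we meant.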
Another thing to be aware of comes up when using xargs in combination with the find command. To handle awkward filenames safely, find allows you to specify the -print0 option, which puts a NUL separator between file names. xargs can take advantage of this with the -0 option, which causes it to split on NUL rather than on the default whitespace.
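A minimal sketch of the find and xargs pairing follows. It assumes a find and xargs that support -print0 and -0 (the GNU and BSD versions do), and the scratch directory and file names are my own placeholders.

```shell
# Scratch directory with one awkward filename and one file to keep.
demo_dir=$(mktemp -d)
touch "$demo_dir/junk file.txt" "$demo_dir/apple.txt"

# NUL-separated names keep the embedded space intact,
# so rm receives "junk file.txt" as one argument.
find "$demo_dir" -name 'junk*' -print0 | xargs -0 rm

remaining=$(ls "$demo_dir")
echo "$remaining"
# prints: apple.txt
```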
There is a limit on how many arguments can be passed on a command line at one time. By default, xargs will try to take all the arguments fed to it at once (up to 5000 on BSD and macOS; GNU xargs is bounded by the maximum command-line length instead) and feed them to the specified utility.
To circumvent this behavior and specify a different number, use the -n option. If we modified our example above, it would look like this:
ls | grep junk | xargs -n 2 rm
This would make xargs call rm once for every 2 parameters fed to it from the previous part of the pipeline. This means that rm would get called 3 times instead of 1 for our example: twice with two files each, and once with the remaining file.
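To see the batching without deleting anything, here is a sketch that feeds the five junk names in with printf and puts echo in front of rm. The variable name is my own.

```shell
# Feed the five names to xargs two at a time; echo shows each rm call.
batches=$(printf '%s\n' junk1.txt junk2.txt junk3.txt junk4.txt junk5.txt | xargs -n 2 echo rm)
echo "$batches"
# prints:
# rm junk1.txt junk2.txt
# rm junk3.txt junk4.txt
# rm junk5.txt
```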
User samvittighed on Twitter pointed me to GNU parallel, an alternative to xargs. I have not taken a deep look at this utility, but it looks very useful.
Thank you for reading! xargs is one of those Unixy tools that, once you learn it, you find uses for it all the time. And it can really save time when you need to do some tedious task.
Also, I am grateful to the people who contacted me and gave constructive feedback. It is embarrassing to make a public mistake, but I believe that if I have made an effort to fix it, nice people will recognize that I admitted my mistake and tried to make good on it. Hopefully you, dear reader, are one of these.
If you have questions or comments, please feel free to drop me a line on Twitter.