I wrote an article about running scripts in parallel using GNU Parallel, and then realized that GNU Parallel isn’t in the standard CentOS repositories. Since the code I’m writing has to rely only on packages from the standard repos, I needed a different solution.

To perform the same action as in the referenced article with xargs instead of GNU Parallel, we’d run the following command. The -P5 option tells xargs to keep up to five processes running at once.

$ echo {1..20} | xargs -n1 -P5 ./echo_sleep
1426008382 -- starting -- 2
1426008382 -- starting -- 5
1426008382 -- starting -- 1
1426008382 -- starting -- 3
1426008382 -- starting -- 4
1426008382 -- finishing -- 4
1426008382 -- starting -- 6
1426008383 -- finishing -- 1
1426008383 -- starting -- 7
1426008385 -- finishing -- 3
1426008385 -- starting -- 8
1426008386 -- finishing -- 7
1426008386 -- starting -- 9
1426008389 -- finishing -- 9
1426008389 -- starting -- 10
1426008390 -- finishing -- 2
1426008390 -- finishing -- 5
1426008390 -- starting -- 11
1426008390 -- starting -- 12
1426008391 -- finishing -- 6
1426008391 -- starting -- 13
1426008392 -- finishing -- 10
1426008392 -- starting -- 14
1426008394 -- finishing -- 8
1426008394 -- starting -- 15
1426008396 -- finishing -- 15
1426008396 -- starting -- 16
1426008397 -- finishing -- 16
1426008397 -- starting -- 17
1426008398 -- finishing -- 11
1426008398 -- starting -- 18
1426008399 -- finishing -- 12
1426008399 -- starting -- 19
1426008399 -- finishing -- 13
1426008399 -- starting -- 20
1426008399 -- finishing -- 20
1426008399 -- finishing -- 17
1426008400 -- finishing -- 14
1426008402 -- finishing -- 18
1426008408 -- finishing -- 19
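
To see what the -n option does in the command above, it can help to swap the script for plain echo and vary the group size. This is only an illustrative sketch: echo stands in for echo_sleep, and -P is left out so the output order stays predictable.

```shell
#!/bin/sh
# Illustration only: plain echo stands in for ./echo_sleep.
# The arguments are written out instead of using bash's {1..6} expansion
# so that the snippet also runs under a plain POSIX sh.

# -n1: one argument per invocation, so echo runs six times (six lines).
echo 1 2 3 4 5 6 | xargs -n1 echo

# -n2: two arguments per invocation, so echo runs three times:
#   1 2
#   3 4
#   5 6
echo 1 2 3 4 5 6 | xargs -n2 echo

# No -n: xargs packs everything into a single invocation (one line),
# which would leave -P with nothing to run in parallel.
echo 1 2 3 4 5 6 | xargs echo
```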

Some things to note here. First, the “-n1” (or “-n 1”) option is critical: it tells xargs how many arguments from the echo output to pass to each invocation of the echo_sleep script. Without it, xargs packs as many arguments as it can into a single invocation, and nothing runs in parallel. Second, the output controls in xargs aren’t as well developed as those in GNU Parallel. In fact, it’s entirely possible for stdout from concurrently running scripts to collide and interleave. For this reason, you may want the invoked scripts to manage their own output (in Python with better file handling, for instance) instead of simply redirecting bash output.
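
One way to sidestep colliding output without leaving the shell is to give every job its own log file and combine the files afterwards. The following is a minimal sketch under a few assumptions: the inline sh -c body is a stand-in for echo_sleep (whose real contents aren't shown here), the logdir variable and the job_N.log naming are made up for illustration, and seq replaces the {1..20} expansion so the snippet doesn't depend on bash.

```shell
#!/bin/sh
# Sketch: each job writes to its own log file, so concurrent jobs never
# share a stdout stream. The sh -c body is a stand-in for ./echo_sleep.
set -eu

logdir=$(mktemp -d)   # hypothetical scratch directory for per-job logs

# seq emits one number per line; -I{} runs one command per input line
# and substitutes that line wherever {} appears as an argument.
seq 1 20 |
  xargs -P5 -I{} sh -c '
    {
      echo "$(date +%s) -- starting -- $1"
      echo "$(date +%s) -- finishing -- $1"
    } > "$2/job_$1.log"
  ' _ {} "$logdir"

# xargs returns only after every job has exited, so the logs are complete.
# Concatenate them in job order to get output that is never interleaved.
for i in $(seq 1 20); do
  cat "$logdir/job_$i.log"
done > "$logdir/combined.log"

echo "combined log: $logdir/combined.log"
```

The trade-off is that nothing appears on the terminal until the whole batch has finished, so this suits batch runs better than jobs you want to watch live.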


3 Comments

  1. Posted March 25, 2015 at 06:31 | Permalink

    Is your reason for not using GNU Parallel covered by:
    http://oletange.blogspot.dk/2013/04/why-not-install-gnu-parallel.html

  2. jason
    Posted May 27, 2015 at 12:19 | Permalink

    I prefer to use GNU parallel myself, but my work’s production environment won’t allow it as it can’t be loaded and updated from the standard CentOS repos (at least not that I can find).

  3. paul
    Posted February 21, 2016 at 16:13 | Permalink

    I’ve always used gnu parallel, or written my own orchestrators in python when I need something fancy. I didn’t even know about xargs’s -P option to use multiple processes. Thanks for the info! It’s good to know.
