• Re: sh.txt; Shell+AWK+Sed

    From xwindows@xwindows@tilde.club to tilde.projects on Sat Jan 30 11:03:55 2021
    On Fri, 29 Jan 2021, yeti wrote:

    I think AWK is massively underrated these days.

    My qualm with AWK is its inability to select a field via regular
    expression groups (it can select a _record_ with a regular expression,
    but not a _field_). This is the reason that I just `sed` things much
    more often [1] than "Just AWK It!" in my daily command-line use; but I
    do agree that, apart from this limitation, AWK is very powerful. So I
    will rephrase:

    "I think Shell+AWK+Sed combo is massively underrated these days"
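
    To illustrate the limitation mentioned above: pulling one piece out of
    a line with a regular expression group is a single substitution in Sed.
    A minimal sketch, assuming a made-up log line with a `user=` key (the
    line format and the function name are hypothetical):

```shell
# Print only what the \(...\) group captured; -n plus the p flag
# suppresses lines that do not match at all.
extract_user() {
    sed -n 's/.*user=\([^ ]*\).*/\1/p'
}

printf '%s\n' '2021-01-30 11:03:55 user=xwindows action=login' | extract_user
```

    Only POSIX basic regular expression syntax is used here, so this should
    not depend on GNU Sed.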

    If I really had to use a more powerful tool for this kind of usage,
    or when I had to come up with things on the go as a one-liner,
    then I would use Perl. [2]

    But I'm aware that while Perl can be installed nearly anywhere [3],
    it is not _automatically_ installed everywhere; especially
    in embedded Unix-like systems [4]; while AWK and Sed are often available
    there out of the box.

    So recently, I have started making an effort to learn these "lost arts"
    by spending extra time trying to use AWK in my workflow for dealing with
    specific line sequences in the input, at least when the task requires
    either no field splitting or just simple delimiter-based field splitting.

    To this day, most of my AWK usage has been isolated; the next thing
    I'm looking forward to trying is making AWK conspire with Sed in various
    ways, and maybe taking advantage of its floating-point arithmetic ability too.
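
    One way such a Sed-and-AWK conspiracy could look, as a minimal sketch:
    Sed cuts a number out of each line with a group, and AWK does the
    floating-point arithmetic. The `took ...ms` log format and the function
    name are invented for the illustration.

```shell
# Sed extracts the timing with a regex group; AWK averages the values.
average_ms() {
    sed -n 's/.*took \([0-9][0-9.]*\)ms.*/\1/p' |
        awk '{ sum += $1 } END { if (NR > 0) printf "%.2f\n", sum / NR }'
}

printf '%s\n' \
    'GET /a took 1.5ms' \
    'GET /b took 2.5ms' \
    'healthcheck ok' \
    'GET /c took 3.5ms' | average_ms
```

    Lines without a timing never reach AWK, so NR counts only the values
    that were actually extracted.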

    Cheers for minimal computing,
    ~xwindows

    -----

    [1] After I learned the juicy details about `t`, `b`, `d`, and other
    arcane Sed instructions from GNU Sed's Texinfo manual, I now have a much
    stronger tendency to use the "Sed it to death!" battle cry when faced
    with ad-hoc text-processing tasks.

    [2] Obligatory comic strip: https://xkcd.com/208/

    [3] Hell, I even have Perl in one of my FreeDOS installations.

    [4] The last time I dealt with OpenWRT programming (in the OpenWRT 14.x
    days), I had to explicitly install the Perl package to have Perl there.
    --- Synchronet 3.18b-Linux NewsLink 1.113
  • From Dacav Doe@dacav@tilde.institute to tilde.projects on Tue Feb 2 21:14:34 2021
    I was about to answer yeti's post, only to realize that what you wrote, xwindows, covers almost 100% of my thoughts on the topic.

    On 2021-01-30, xwindows <xwindows@tilde.club> wrote:
    I think AWK is massively underrated these days.

    I quote.

    My qualm with AWK is its inability to select a field via regular expression groups (it can select a _record_ with a regular expression, but not a _field_).

    I'd like to add that GNU AWK is somewhat more capable than BSD's version. In GNU AWK, RS can be a regex, and the 'match' function allows for group capturing.

    That is a bit unfortunate, I'm afraid, since POSIX-style AWK is the only thing that can be assumed to exist on any system.
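
    For what it's worth, group-like extraction is still possible in POSIX AWK, just clumsier: match() sets RSTART and RLENGTH, and substr() does the cutting. A sketch with a hypothetical `user=` key, next to the GNU AWK form for comparison (the function names are made up, and the second one needs gawk installed):

```shell
# POSIX awk: match() only reports where the match is, so the "group"
# has to be cut out by hand.
extract_user_posix() {
    awk 'match($0, /user=[^ ]+/) {
        # skip the 5-character "user=" prefix of the matched text
        print substr($0, RSTART + 5, RLENGTH - 5)
    }'
}

# GNU AWK only: the third argument to match() receives the capture groups.
extract_user_gawk() {
    gawk 'match($0, /user=([^ ]+)/, m) { print m[1] }'
}

printf '%s\n' 'ts=1612300474 user=dacav action=login' | extract_user_posix
```

    The hand-counted prefix length is the fragile part of the POSIX variant; it has to be kept in sync with the pattern.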

    If I really had to use a more powerful tool for this kind of usage,
    or when I had to come up with things on the go as a one-liner,
    then I would use Perl. [2]

    Quote!

    But I'm aware that while Perl can be installed nearly anywhere [3],
    ...
    [3] Hell, I even have Perl in one of my FreeDOS installations.

    WTF? Ok, this is a bit beyond even my Perl fetishism :D
    I'm now curious about what you're running FreeDOS for, if I can ask :)

    it is not _automatically_ installed everywhere; especially
    in embedded Unix-like systems [4]; while AWK and Sed are often available there out of the box.

    And this is the big reason why I started to learn AWK, recently, as you did.
    I found it is a really valuable meta-tool for constructing log parsers.

    At $dayjob, we've got periodic on-call schedules, and it is unfortunately quite common to get alarms for false positives. I started to put together an AWK-based filter that counts occurrences of well-known false positives, keeping counters and allowing me to identify the logs that are *actually* relevant.
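
    A minimal sketch of what such a counting filter might look like. This is not the actual filter from $dayjob: the two patterns and the function name are placeholders invented for the example.

```shell
# Count known false positives, pass everything else through.
# The two patterns are placeholders, not real alarm texts.
filter_alarms() {
    awk '
        /heartbeat timeout/    { seen["heartbeat timeout"]++; next }
        /disk usage above 80%/ { seen["disk usage above 80%"]++; next }
        { print }                       # anything unknown is worth a look
        END {
            # the summary goes to stderr so stdout stays a clean log stream
            for (k in seen)
                printf "suppressed %d x %s\n", seen[k], k | "cat 1>&2"
        }
    '
}
```

    Piping the summary through `cat 1>&2` avoids relying on awk treating "/dev/stderr" specially, which not every implementation is guaranteed to do.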

    (The only bummer is that not even GAWK can handle comma-separated values with escape sequences, and for some reason our logs are like that... so I need a pre-processing filter to handle that.)

    To this day, most of my AWK usage has been isolated; the next thing
    I'm looking forward to trying is making AWK conspire with Sed in various ways, and maybe taking advantage of its floating-point arithmetic ability too.

    My rule of thumb is: if I can do it with sed, I just do it with sed.

    And there's the classic case for
    tr -s ' ' | cut -d ' ' -f2
    which can replace awk's "{print $2}" with simpler tools.
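
    Side by side, as a small sketch (the function names are only for the illustration):

```shell
# Squeeze runs of spaces, then cut on single spaces.
second_field_cut() { tr -s ' ' | cut -d ' ' -f2; }

# The AWK equivalent for this input.
second_field_awk() { awk '{ print $2 }'; }

printf '%s\n' 'alpha   beta gamma' | second_field_cut
printf '%s\n' 'alpha   beta gamma' | second_field_awk
```

    Both print "beta" here. The two are not identical in every case, though: with leading blanks, cut sees an empty first field, so for '  alpha beta' the pipeline returns "alpha" while awk returns "beta". Tabs are also a separator for awk but not for this cut invocation.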

    To me, AWK fills that gap "just before the urge to Perl".

    Cheers for minimal computing,

    And QUOTE! Long live UNIX!