Although the output of the coreutils is meant to be human-readable
only, many scripts today pipe it to other commands for various
kinds of automation. This leads to brittle solutions involving
complex awk/sed/grep gymnastics that break when the output format
changes slightly. While the "everything is text" philosophy has served
GNU/Unix/Linux well, structured data processing has become important
in modern computing.
I would like to propose the addition of two new optional
machine-readable output streams (in addition to the already present
human-readable streams):
- stdout (fd 1): human readable output
- stderr (fd 2): human readable errors
- stdoutm (fd 3): machine readable output (NEW)
- stderrm (fd 4): machine readable errors (NEW)
The machine-readable output format and conventions need to be
established. JSON is the most obvious choice, with battle-tested
parsers and tools, immediately available to the scripting ecosystem.
This could be implemented incrementally, starting with high-usage
commands (ls, ps, df, du) and then gradually expanding coverage.
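
As a sketch of how this could look (the JSON shape and the fd-3
behaviour of `ls` here are hypothetical, not anything that exists
today):

    # Hypothetical fd-3 payload from `ls` (invented field names):
    #   [{"name":"notes.txt","size":1024,"type":"file"},
    #    {"name":"src","size":4096,"type":"directory"}]

    # Route fd 3 into the pipe, discard the human-readable
    # listing, and extract the names with jq:
    ls 3>&1 1>/dev/null | jq -r '.[].name'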
I think it is a good idea, but if the medium is JSON text, then
repeated parsing is still there.
On Sun, 20 Jul 2025, Anton Shepelev wrote:

> [-- text/plain, size 0.1K, charset US-ASCII, 6 lines, quoted-printable --]
> [-- application/pgp-signature, name signature.asc, size 0.8K, 17 lines, 7bit --]
>
> and says it is unable to open them. Are you sure your articles are
> standard Usenet messages?
The fact that TIN correctly showed that as an "attachment" [1],
as opposed to littering its Base64 content all over the main message,
means that this is indeed a valid netnews article; it is just
formatted according to the MIME standard, rather than being an
old-style bare single-part article.
What this actually means is that you are reading a GPG/OpenPGP-signed
netnews article: inside that "attachment" is just a few-line blob
of Base64-encoded message-integrity metadata. This is not a new
thing; from what I have heard, it has long been in use on USENET too.
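
For reference, such an article has roughly the following shape per
RFC 3156 (the boundary string and hash algorithm are illustrative):

    Content-Type: multipart/signed; micalg=pgp-sha256;
      protocol="application/pgp-signature"; boundary="bndry"

    --bndry
    Content-Type: text/plain; charset=US-ASCII

    The article text you actually read.
    --bndry
    Content-Type: application/pgp-signature; name="signature.asc"

    -----BEGIN PGP SIGNATURE-----
    ...Base64-encoded signature data...
    -----END PGP SIGNATURE-----
    --bndry--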
If your newsreader does not [2] support OpenPGP-MIME [3],
then it obviously can't do anything useful with this "attachment".
In that case, you can safely ignore any such "attachment" with the
`application/pgp-signature` content type; you are not missing
anything that the poster was saying or trying to show. [4]
Regards,
xwindows
[1] Scare-quoted "attachment" because it is not one: the content type
    of the main message is explicitly `multipart/signed`.
    Displaying the cryptographic signature as an attachment is just
    fallback handling in a MIME-supporting newsreader
    for the generic `multipart/*` content type. [2]
[2] This fact is obvious, since newsreaders which do support this
    standard would not display the mysterious MIME part as an
    "attachment"; they instead flag the entire article as
    cryptographically signed, then show options for the user to
    verify that signature. [5]
[3] RFC 3156: MIME Security with OpenPGP [Aug-2001]
https://www.rfc-editor.org/rfc/rfc3156.html
[4] Obligatory comic insert:
https://www.explainxkcd.com/wiki/index.php/1181:_PGP
[5] This is what was displayed when I navigated to the article you
    mentioned, using Claws Mail, which supports GnuPG integration
    (my emphases in red):
    https://tilde.club/~xwindows/temp/2025-07-21/pgpsigned.png
What "standard Usenet messages" actually means...
D. I kinda know what PGP/MIME-signed messages look like at
   the byte level, because I have used GPG-encrypted and
   -signed emails before, and have actually tried "view
   source" on the result.

I think the D. part is the most important reason, because
in the past I have also seen users (not normies, by the
way) ask the same question, but in a *mailing list*
context; ~ant is not alone in this one.
Please share the URL to your proposal, that we may follow
and, if need be, participate in its discussion.

I sent it to their mailing list. Here is a link to the
mailing list.

Annada Behera <annada@tilde.green> wrote:
> Although the output of the coreutils is meant to be
> human-readable only, many scripts today pipe it to other
> commands for various kinds of automation. This leads to
> brittle solutions involving complex awk/sed/grep
> gymnastics that break when the output format changes
> slightly. While the "everything is text" philosophy has
> served GNU/Unix/Linux well, structured data processing
> has become important in modern computing.
But pure text can also be structured and machine-oriented,
rather than human-oriented, such as tab- or comma-separated files,
which are /way/ simpler than JSON.
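
For instance, a tab-separated stream needs nothing beyond awk with
a field separator; a sketch over hypothetical name/size/type
records:

    printf 'notes.txt\t1024\tfile\nsrc\t4096\tdirectory\n' |
        awk -F '\t' '$3 == "file" { print $1 }'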
> I would like to propose the addition of two new optional
> machine-readable output streams (in addition to the
> already present human-readable streams):
>
> - stdout (fd 1): human readable output
> - stderr (fd 2): human readable errors
> - stdoutm (fd 3): machine readable output (NEW)
> - stderrm (fd 4): machine readable errors (NEW)
>
> The machine-readable output format and conventions need
> to be established. JSON is the most obvious choice, with
> battle-tested parsers and tools, immediately available to
> the scripting ecosystem. This could be implemented
> incrementally, starting with high-usage commands (ls, ps,
> df, du) and then gradually expanding coverage.
I think it is a good idea, but if the medium is JSON text, then
repeated parsing is still there.
> But pure text can also be structured and machine-
> oriented, rather than human-oriented, such as tab- or
> comma-separated files, which are /way/ simpler than
> JSON.
Yes, pure text can be structured. The output of, say, ls
is also structured, but we have to do brittle parsing with
AWK/Perl regexes and run into a lot of edge cases too.
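
One classic edge case: with ls writing to a pipe, a newline inside
a filename silently corrupts any line-oriented pipeline (a sketch;
try it in a scratch directory):

    touch "$(printf 'two\nlines')"  # one file, newline in its name
    ls | wc -l                      # reports 2, though only one file exists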
> I think it is a good idea, but if the medium is JSON
> text, then repeated parsing is still there.

I don't understand what repeated parsing you are talking
about; care to elaborate?
Annada Behera to Anton Shepelev:
> > But pure text can also be structured and machine-
> > oriented, rather than human-oriented, such as tab- or
> > comma-separated files, which are /way/ simpler than
> > JSON.
>
> Yes, pure text can be structured.
And I value that, as TSV/CSV are much simpler than JSON and
both human- and machine-readable. I should hate to lose
them from my toolchains. And I do fear to lose them, as the
advent of JSON will cause a gradual extinction of classical
processing.
> The output of, say, ls is also structured, but we have
> to do brittle parsing with AWK/Perl regexes and run into
> a lot of edge cases too.
I understand what you mean. In addition to robustness, JSON
brings a higher power of expression, at the expense of a
more complicated and less human-readable format.
With suitable settings, I believe one can parse ls, e.g.:
<https://s.tilde.club/?file=3nwf>
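
Roughly, the idea is something like this (a sketch, not the linked
script itself; it assumes a fixed locale, with the usual caveat
about blanks in names):

    # Skip the "total" line, then take size and name from
    # the long-listing columns:
    LC_ALL=C ls -ln | awk 'NR > 1 { print $5, $9 }'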
> > I think it is a good idea, but if the medium is JSON
> > text, then repeated parsing is still there.
>
> I don't understand what repeated parsing you are talking
> about; care to elaborate?
Consider the following toolchain, assuming JSON input and
output:

    t1 | t2 | t3 | t4

Tools 2..4 parse the JSON from tools 1..3. That's repeated
parsing for you, whereas a purely structural approach would
parse JSON at the input of t1, process the data internally
in its native form, and then generate JSON on output.
Otherwise, JSON is repeatedly parsed and generated. In the
case of simple filtering functions, this is literally
parsing of the same JSON data.
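
To illustrate with jq, where each stage parses the whole document
again and re-serialises it, even when it merely filters:

    echo '[3, 1, 2]' |
        jq 'map(. + 1)' |  # parse #1, transform, serialise
        jq 'sort'       |  # parse #2, sort, serialise
        jq '.[0]'          # parse #3, select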
> I understand what you mean. In addition to robustness,
> JSON brings a higher power of expression, at the
> expense of a more complicated and less human-readable
> format.
I am proposing a machine-only readable format. For
instance, in `ls`, if human readability is important, just
use `ls` like everyone else: human-parsable output stays on
stdout. Fd 3 is strictly meant to be machine-read, not
for human beings; `ls 3>&1 1>/dev/null` is only meant for
piping to other programs.
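
To spell out the redirections (still assuming a hypothetical `ls`
that writes JSON on fd 3):

    ls 3>&1 1>/dev/null | jq .
    # 3>&1        : point fd 3 at the pipe stdout currently
    #               feeds, so the JSON flows downstream
    # 1>/dev/null : throw away the human-readable listing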
If you can come up with a more expressive (potentially
binary) format with battle-tested parsers like jq/fx, I am
up for it. But at this point, JSON is so universal that
anyone can look at the output and correctly guess which
tool they need to parse it with.
Again, my proposal is not to replace stdout with JSON
output; my proposal is to leave the user-facing stdout
untouched and, only when fd 3 is open, collect the
necessary data into JSON and put it on fd 3. There is no
performance overhead whatsoever if fd 3 is not open.
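
Checking whether fd 3 is open is cheap. A minimal sketch of the
test in bash (a real implementation would use fcntl(2) inside the
tool):

    # The redirection fails if fd 3 is not open; the error
    # message is discarded:
    if { true >&3; } 2>/dev/null; then
        echo '{"example": true}' >&3   # emit JSON on fd 3
    fi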
> With suitable settings, I believe one can parse ls,
> e.g.:
> <https://s.tilde.club/?file=3nwf>

Parse, yes. Robust and handling edge cases? Not likely.
That is the entire point of format specifications, and
there is no specification of ls's output.
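
For example, the column-based parse sketched earlier falls over as
soon as a filename contains a blank:

    touch 'two words'
    LC_ALL=C ls -ln | awk 'NR > 1 { print $9 }'
    # prints "two", silently dropping " words"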
> Consider the following toolchain, assuming JSON input
> and output:
>
>     t1 | t2 | t3 | t4
>
> Tools 2..4 parse the JSON from tools 1..3. That's
> repeated parsing for you, whereas a purely structural
> approach would parse JSON at the input of t1, process
> the data internally in its native form, and then
> generate JSON on output. Otherwise, JSON is repeatedly
> parsed and generated. In the case of simple filtering
> functions, this is literally parsing of the same JSON
> data.
Interesting point. But a lot of tools in coreutils are
essentially helpers around this specific human-versus-
machine parsability issue.

For example, `ls | wc -l` or `ls | sort` is essentially
doing processing on the output of ls.
In my world, you do your processing in a complete
programming language of your choice.
For instance, you'd do something like:

    ls 3>&1 1>/dev/null | fx 'this.length'  # for ls | wc -l
    ls 3>&1 1>/dev/null | fx 'this.sort()'  # for ls | sort

See how fx uses JavaScript to do any processing.
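
The same two steps work in jq, for those who prefer it over fx
(still assuming the hypothetical fd-3 JSON array):

    ls 3>&1 1>/dev/null | jq 'length'  # for ls | wc -l
    ls 3>&1 1>/dev/null | jq 'sort'    # for ls | sort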
And honestly, I feel the repeated-parsing concern is a
little overstated. Unix pipelines are generally very
short, and JSON parsers are extremely optimized, fast even
on a computer from the 1990s, without any brittleness
whatsoever. Don't let the perfect be the enemy of the good.
> > I understand what you mean. In addition to robustness,
> > JSON brings a higher power of expression, at the
> > expense of a more complicated and less human-readable
> > format.
>
> I am proposing a machine-only readable format. For
> instance, in `ls`, if human readability is important,
> just use `ls` like everyone else: human-parsable output
> stays on stdout. Fd 3 is strictly meant to be
> machine-read, not for human beings; `ls 3>&1 1>/dev/null`
> is only meant for piping to other programs.

Yes, I see your point: you are proposing, as it were, a
different signal band, a complete separation between human-
and machine-oriented I/O. And I am a tad worried about
this, because I value the /unity/ that currently exists in
Unix tools, where the text-stream I/O is simultaneously
machine- and human-oriented, with all the attendant quirks
and problems.
> If you can come up with a more expressive (potentially
> binary) format with battle-tested parsers like jq/fx, I
> am up for it. But at this point, JSON is so universal
> that anyone can look at the output and correctly guess
> which tool they need to parse it with.

JSON is very good as it is, and please do leave it in text
form in your proposal, to keep some of the human
readability. We humans need it, if only for debugging our
tool chains.
> Again, my proposal is not to replace stdout with JSON
> output; my proposal is to leave the user-facing stdout
> untouched and, only when fd 3 is open, collect the
> necessary data into JSON and put it on fd 3. There is
> no performance overhead whatsoever if fd 3 is not open.

Nor did I count performance overhead among my misgivings.
> > With suitable settings, I believe one can parse ls,
> > e.g.:
> > <https://s.tilde.club/?file=3nwf>
>
> Parse, yes. Robust and handling edge cases? Not likely.

This may be a straw man for you, what with my little
experience on the Unix console, but may I ask you to go
ahead and break my script?
> That is the entire point of format specifications, and
> there is no specification of ls's output.

How so? Is the POSIX doc ambiguous?
https://manned.org/man/ls.1p#head11
> > Consider the following toolchain, assuming JSON input
> > and output:
> >
> >     t1 | t2 | t3 | t4
> >
> > Tools 2..4 parse the JSON from tools 1..3. That's
> > repeated parsing for you, whereas a purely structural
> > approach would parse JSON at the input of t1, process
> > the data internally in its native form, and then
> > generate JSON on output. Otherwise, JSON is repeatedly
> > parsed and generated. In the case of simple filtering
> > functions, this is literally parsing of the same JSON
> > data.
> Interesting point. But a lot of tools in coreutils are
> essentially helpers around this specific human-versus-
> machine parsability issue.

I think they are means to an end, rather than helpers
around an issue...
> For example, `ls | wc -l` or `ls | sort` is essentially
> doing processing on the output of ls.

Yes, which is why many tools have added the -h option for
human-oriented output.
> In my world, you do your processing in a complete
> programming language of your choice.

Then you have to write Lua, Perl, Python, Pascal, or even
C, and then rely on libraries instead of shell utilities.
This is a valid but completely different mode of operation
from using the shell and utilities, including mini-languages
such as ed(1), sed(1), awk(1), and grep(1).
> For instance, you'd do something like:
>
>     ls 3>&1 1>/dev/null | fx 'this.length'  # for ls | wc -l
>     ls 3>&1 1>/dev/null | fx 'this.sort()'  # for ls | sort
>
> See how fx uses JavaScript to do any processing.

Yes. I did not know from your previous examples that `fx'
used JavaScript. In my view, so huge, complicated, and
trendy a language as JavaScript is hardly compatible with
the Unix Way, cf.:
<https://felipec.wordpress.com/2025/02/13/rust-not-for-linux/>
> And honestly, I feel the repeated-parsing concern is a
> little overstated.

Overstated or not, it is there in Unix toolchains and in
your proposal, and, for aught I understand, eliminated in
Powershell.
> Unix pipelines are generally very short, and JSON
> parsers are extremely optimized, fast even on a computer
> from the 1990s, without any brittleness whatsoever.
> Don't let the perfect be the enemy of the good.

Well, I did not suggest that you modify your proposal to
avoid repeated parsing...

Anton Shepelev to Annada Behera:

> I value the /unity/ that currently exists in Unix tools.

You may value the unity that currently exists, my proposal

> JSON is very good as it is, and please do leave it in
> text form in your proposal.

I too want it that way, text JSON, precisely for the
debugging.

> Nor did I count performance overhead among my misgivings.

Ok.

> May I ask you to go ahead and break my script?

I didn't have to look carefully. I just ran it on my home
directory.

> How so? Is the POSIX doc ambiguous?

Does ls output have a specification? The POSIX documentation

> I think they are means to an end, rather than helpers
> around an issue...

I am not saying that those tools are useless on their own.

> Yes, which is why many tools have added the -h option
> for human-oriented output.

The -h flag evolution shows that the Unix community already

> Then you have to write Lua, Perl, Python, Pascal, or
> even C, and then rely on libraries instead of shell
> utilities.

No. You don't have to write Lua/Perl/Python. Again, 'stdout'

> Overstated or not, it is there in Unix toolchains and
> in your proposal, and, for aught I understand,
> eliminated in Powershell.

And no one uses Powershell (or similar shells, like
NuShell). JSON parsers are optimized for handling a very
narrow

> Well, I did not suggest that you modify your proposal
> to avoid repeated parsing...

Repeated parsing is not an issue, as I mentioned earlier.