Add a simple markdown parser for formatting `rustc --explain`

Currently, the output of `rustc --explain foo` displays the raw markdown in a
pager. This is acceptable, but using actual formatting makes it easier to
understand.

This patch consists of three major components:

1. A markdown parser. This is an extremely simple non-backtracking recursive
   implementation that requires normalization of the final token stream
2. A utility to write the token stream to an output buffer
3. Configuration within rustc_driver_impl to invoke this combination for
   `--explain`. Like the current implementation, it first attempts to print to
   a pager, falls back to a colorized terminal, and uses a standard print as a
   last resort.

If color is disabled, or if the output does not support it, or if printing
with color fails, it will write the raw markdown (which matches current
behavior).

Pagers known to support color are: `less` (with `-r`), `bat` (aka `batcat`),
and `delta`.

The markdown parser does not support the entire markdown specification, but
should support the following with reasonable accuracy:

- Headings, including formatting
- Comments
- Code, inline and fenced block (no indented block)
- Strong, emphasis, and strikethrough formatted text
- Links, anchor, inline, and reference-style
- Horizontal rules
- Unordered and ordered list items, including formatting

This parser and writer should be reusable by other systems if ever needed.

2022-12-19 12:09:40 -06:00
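
The fallback order described in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `Shown`, `show_with_fallback`, and the two closure parameters are hypothetical names, not items from rustc_driver_impl. The sketch models only which rendering ends up being used, not whether the raw markdown is itself sent through a pager.

```rust
use std::io;

/// Which rendering finally reached the user (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Shown {
    FormattedInPager,
    FormattedOnTerminal,
    RawMarkdown,
}

/// Try the formatted outputs in order and fall back to raw markdown.
/// `color_ok` stands for "color is enabled and the output supports it".
fn show_with_fallback(
    color_ok: bool,
    to_pager: impl FnOnce() -> io::Result<()>,
    to_terminal: impl FnOnce() -> io::Result<()>,
) -> Shown {
    if color_ok {
        if to_pager().is_ok() {
            return Shown::FormattedInPager;
        }
        if to_terminal().is_ok() {
            return Shown::FormattedOnTerminal;
        }
    }
    // Color disabled, unsupported, or both colored attempts failed.
    Shown::RawMarkdown
}

fn main() {
    // Simulate a missing pager: the colorized terminal is tried next.
    let no_pager = || -> io::Result<()> {
        Err(io::Error::new(io::ErrorKind::NotFound, "no pager"))
    };
    let shown = show_with_fallback(true, no_pager, || Ok(()));
    println!("{shown:?}");
}
```

Passing the sinks as closures keeps the ordering logic testable without spawning a real pager.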

use std::io::BufWriter;
use std::path::PathBuf;

use termcolor::{BufferWriter, ColorChoice};

use super::*;

const INPUT: &str = include_str!("input.md");
const OUTPUT_PATH: &[&str] =
    &[env!("CARGO_MANIFEST_DIR"), "src", "markdown", "tests", "output.stdout"];

const TEST_WIDTH: usize = 80;

// We try to make some words long to create corner cases
const TXT: &str = r"Lorem ipsum dolor sit amet, consecteturadipiscingelit.
    Fusce-id-urna-sollicitudin, pharetra nisl nec, lobortis tellus. In at
    metus hendrerit, tincidunteratvel, ultrices turpis. Curabitur_risus_sapien,
    porta-sed-nunc-sed, ultricesposuerelacus. Sed porttitor quis
    dolor non venenatis. Aliquam ut. ";

const WRAPPED: &str = r"Lorem ipsum dolor sit amet, consecteturadipiscingelit. Fusce-id-urna-
sollicitudin, pharetra nisl nec, lobortis tellus. In at metus hendrerit,
tincidunteratvel, ultrices turpis. Curabitur_risus_sapien, porta-sed-nunc-sed,
ultricesposuerelacus. Sed porttitor quis dolor non venenatis. Aliquam ut. Lorem
    ipsum dolor sit amet, consecteturadipiscingelit. Fusce-id-urna-
    sollicitudin, pharetra nisl nec, lobortis tellus. In at metus hendrerit,
    tincidunteratvel, ultrices turpis. Curabitur_risus_sapien, porta-sed-nunc-
    sed, ultricesposuerelacus. Sed porttitor quis dolor non venenatis. Aliquam
    ut. Sample link lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet,
consecteturadipiscingelit. Fusce-id-urna-sollicitudin, pharetra nisl nec,
lobortis tellus. In at metus hendrerit, tincidunteratvel, ultrices turpis.
Curabitur_risus_sapien, porta-sed-nunc-sed, ultricesposuerelacus. Sed porttitor
quis dolor non venenatis. Aliquam ut. ";

#[test]
fn test_wrapping_write() {
    WIDTH.with(|w| w.set(TEST_WIDTH));
    let mut buf = BufWriter::new(Vec::new());
    // Undo the source-level line breaks (and any indentation) of `TXT` so the
    // writer receives a single unwrapped line.
    let txt = TXT.replace("-\n", "-").replace("_\n", "_").replace('\n', " ").replace("    ", "");
    write_wrapping(&mut buf, &txt, 0, None).unwrap();
    write_wrapping(&mut buf, &txt, 4, None).unwrap();
    write_wrapping(
        &mut buf,
        "Sample link lorem ipsum dolor sit amet. ",
        4,
        Some("link-address-placeholder"),
    )
    .unwrap();
    write_wrapping(&mut buf, &txt, 0, None).unwrap();
    let out = String::from_utf8(buf.into_inner().unwrap()).unwrap();
    // Strip the terminal hyperlink escape sequences and the link target so
    // only the visible text is compared.
    let out = out
        .replace("\x1b\\", "")
        .replace('\x1b', "")
        .replace("]8;;", "")
        .replace("link-address-placeholder", "");

    for line in out.lines() {
        assert!(line.len() <= TEST_WIDTH, "line length\n'{line}'")
    }

    assert_eq!(out, WRAPPED);
}
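
The test above checks that no emitted line exceeds `TEST_WIDTH` and that continuation lines carry the requested indent. As a rough illustration of that idea, here is a minimal greedy wrapping sketch. `wrap_greedy` is a hypothetical helper and not rustc's `write_wrapping`: it breaks only at whitespace (the expected output above suggests the real writer can also break after a hyphen), it does not handle links, and a single word longer than the width will still overflow.

```rust
/// Greedy word wrap (hypothetical helper, not rustc's `write_wrapping`).
/// Continuation lines are prefixed with `indent` spaces.
fn wrap_greedy(text: &str, width: usize, indent: usize) -> String {
    let mut out = String::new();
    let mut col = 0; // column reached on the line being built
    for word in text.split_whitespace() {
        let len = word.chars().count();
        if col > 0 && col + 1 + len > width {
            // The word (plus a separating space) does not fit:
            // start a new, indented line.
            out.push('\n');
            out.push_str(&" ".repeat(indent));
            col = indent;
        } else if col > 0 {
            // The word fits on the current line after a single space.
            out.push(' ');
            col += 1;
        }
        out.push_str(word);
        col += len;
    }
    out
}

fn main() {
    let text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";
    let wrapped = wrap_greedy(text, 20, 4);
    // Same invariant as the test above: no line is wider than the limit.
    for line in wrapped.lines() {
        assert!(line.chars().count() <= 20);
    }
    println!("{wrapped}");
}
```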

#[test]
fn test_output() {
    // Capture `--bless` when run via ./x
    let bless = std::env::var_os("RUSTC_BLESS").is_some_and(|v| v != "0");
    let ast = MdStream::parse_str(INPUT);
    let bufwtr = BufferWriter::stderr(ColorChoice::Always);
    let mut buffer = bufwtr.buffer();
    ast.write_termcolor_buf(&mut buffer).unwrap();

    let mut blessed = PathBuf::new();
    blessed.extend(OUTPUT_PATH);

    if bless {
        std::fs::write(&blessed, buffer.into_inner()).unwrap();
        eprintln!("blessed output at {}", blessed.display());
    } else {
        let output = buffer.into_inner();
        if std::fs::read(blessed).unwrap() != output {
            // hack: I don't know any way to write bytes to the captured stdout
            // that cargo test uses
            let mut out = std::io::stdout();
            out.write_all(b"\n\nMarkdown output did not match. Expected:\n").unwrap();
            out.write_all(&output).unwrap();
            out.write_all(b"\n\n").unwrap();
            panic!("markdown output mismatch");
        }
    }
}
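
The bless-or-compare pattern used by `test_output` can be isolated into a small sketch. `Snapshot` and `check_snapshot` are hypothetical names introduced here for illustration and are not part of the rustc test suite; the temporary file path is likewise an assumption made so the example runs on its own.

```rust
use std::{env, fs, io, path::Path};

/// Result of one snapshot check (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Snapshot {
    Blessed,
    Matched,
    Mismatched,
}

/// When `bless` is set, overwrite the stored snapshot with `actual`;
/// otherwise compare `actual` byte for byte against the stored file.
fn check_snapshot(path: &Path, actual: &[u8], bless: bool) -> io::Result<Snapshot> {
    if bless {
        fs::write(path, actual)?;
        return Ok(Snapshot::Blessed);
    }
    let expected = fs::read(path)?;
    Ok(if expected == actual { Snapshot::Matched } else { Snapshot::Mismatched })
}

fn main() -> io::Result<()> {
    // A throwaway file stands in for the checked-in `output.stdout`.
    let path = env::temp_dir().join(format!("snapshot-sketch-{}.stdout", std::process::id()));
    println!("{:?}", check_snapshot(&path, b"hello", true)?);
    println!("{:?}", check_snapshot(&path, b"hello", false)?);
    println!("{:?}", check_snapshot(&path, b"changed", false)?);
    fs::remove_file(&path)
}
```

Returning a value instead of panicking inside the helper leaves the caller free to decide how to report a mismatch, as the test above does by dumping the bytes before it panics.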