(internals)= # Internals ```{rst-class} lead Explore Wenmode's node model, parser flow, rule dispatch, root transforms, state, and renderer internals. ``` --- Wenmode is organized around a small set of data objects and dispatch points: AST nodes, parser rules, root transforms, parser state, and renderers. The AST is mdast-compatible. Core Markdown nodes use mdast-style names and fields: `root.children`, `paragraph.children`, `heading.depth`, `link.url`, `link.title`, `image.url`, `image.alt`, `code.lang`, and literal `value` fields. Extensions use the same data-object style with explicit node types such as `table`, `footnoteReference`, `math`, `ruby`, and the directive node family. ## AST nodes Node classes live in `wenmode.nodes`. They are dataclasses that describe parsed content. Rendering behavior is not stored on the nodes; renderers decide how to turn nodes into output. ```python from wenmode import Wenmode text = '# Hello' root = Wenmode().parse(text) print(root.to_ast()) ``` `Node.to_ast()` returns a plain dictionary representation, recursively converting child nodes. ```python { 'type': 'root', 'children': [ { 'type': 'heading', 'children': [{'type': 'text', 'value': 'Hello'}], 'depth': 1, } ], } ``` Nodes follow mdast-style `type` names where possible. Common node groups are: - Parent nodes, such as `root`, `paragraph`, `heading`, `blockquote`, `list`, `listItem`, table nodes, directive nodes, and formatting nodes. - Literal nodes, such as `text`, `inlineCode`, `code`, `html`, `math`, and `inlineMath`. - Leaf nodes, such as `thematicBreak`, `break`, `image`, and `footnoteReference`. Nodes are pure data objects. They do not carry HTML tag names, HTML attributes, or other renderer hints. `HTMLRenderer`, `MarkdownRenderer`, and custom renderers own output behavior. ## Parser flow `Parser.parse()` creates a fresh `BlockState`, parses block nodes into a `Root`, runs root transform preparation, resolves deferred inline parsing, runs root transforms, and returns the root node. At a high level: 1. Blank lines are skipped. 2. Block openers are matched against enabled `BlockRule` patterns. 3. If no block rule handles the line, the parser reads a paragraph. 4. Paragraph text is parsed with enabled inline rules. 5. Root transforms finalize document-wide features. `Parser.parse_iter()` follows the block parser incrementally and yields nodes as they are parsed. It rejects rule sets that require deferred inline transforms. ## Rules All rules inherit from `Rule` and have a stable `name`. Enabled rules are available as `parser.rules`, a dictionary keyed by rule name. `BlockRule` instances provide a block opener pattern and a `parse()` method. They receive the parser, current block state, and the matched opener. `ContinueRule` instances can inspect paragraph continuation lines. This is used for syntax where a paragraph can become another block, such as setext headings. `InlineRule` instances provide a regex pattern and `parse()` method. They return `(node, end_index)`. If the rule does not accept a match, it returns `(None, start_index)` so the parser can treat the marker as text. ## Root transforms Rules can attach root transforms through their `root_transforms` attribute. Transforms can: - add required helper rules, - collect document-wide definitions, - defer inline parsing until definitions are known, - update nodes after the whole tree is parsed. Reference links, footnotes, abbreviations, and heading ID generation use this mechanism. ## Parser state `BlockState` stores the current line index, nesting depth, deferred inline queues, and a per-parse `StateStore`. Built-in reference, footnote, and abbreviation rules use that store through `StateKey` objects instead of fixed fields on `BlockState`. Because a new state and store are created for every top-level parse, definitions do not leak between parser calls. Nested block parsing shares the same store, so definitions found inside block quotes, lists, directives, or footnotes remain visible to document-level transforms. `StreamBlockState` wraps a line buffer for iterable sources. It supports lookahead without forcing the entire input to be read immediately. ## Renderers Renderers inherit from `BaseRenderer`, which dispatches by `node.type`. ```python from wenmode.renderers import BaseRenderer class PlainTextRenderer(BaseRenderer): pass ``` Register handlers with `BaseRenderer.register()` in renderer subclasses. ```python from wenmode.nodes import Text from wenmode.renderers import BaseRenderer, RenderContext class UpperRenderer(BaseRenderer): pass @UpperRenderer.register('text') def render_text(renderer: UpperRenderer, node: Text, context: RenderContext) -> str: return node.value.upper() ``` If no handler is registered, `BaseRenderer` renders child nodes or a literal `value` field. `HTMLRenderer` registers explicit handlers for Wenmode's node types and falls back to the same child/value behavior for unknown nodes.