Core block rules

Block-level rules for CommonMark-style document structure.


AtxHeading

AtxHeading parses hash-prefixed ATX headings from level 1 through level 6.

# Title

Output node is Heading, and its AST is:

{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "children": [
        {
          "type": "text",
          "value": "Title"
        }
      ],
      "depth": 1
    }
  ]
}
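
As a rough illustration of the matching involved, the sketch below recognizes a single ATX heading line with a regular expression. The names `ATX_RE` and `parse_atx_heading` are hypothetical and are not part of the documented API. The heading content is kept as plain text rather than being parsed for inline markup, and edge cases such as a heading made only of closing hashes are not handled.

```python
import re

# Up to three spaces of indentation, 1-6 hashes, then whitespace or end of line.
# A trailing run of hashes preceded by whitespace is treated as a closing sequence.
ATX_RE = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+(.*?))?(?:[ \t]+#+)?[ \t]*$")


def parse_atx_heading(line):
    """Return a heading node dict, or None if the line is not an ATX heading."""
    match = ATX_RE.match(line)
    if match is None:
        return None
    text = match.group(2) or ""
    node = {"type": "heading", "children": [], "depth": len(match.group(1))}
    if text:
        # Simplification: inline content is stored as a single text node.
        node["children"].append({"type": "text", "value": text})
    return node


print(parse_atx_heading("# Title"))
```

Seven or more hashes, or a hash with no following space, return `None` in this sketch, which matches the level 1 through level 6 limit described above.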

Option example: use AtxHeading(id_transform=True) to add generated heading IDs.

# Hello World

With this option enabled, the AST is:

{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "data": {
        "id": "hello-world"
      },
      "children": [
        {
          "type": "text",
          "value": "Hello World"
        }
      ],
      "depth": 1
    }
  ]
}
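
The exact ID algorithm is not described here. The sketch below shows one common way to derive an ID such as "hello-world" from heading text. The function name `slugify` and its rules (lowercase, drop punctuation, join words with hyphens) are assumptions for illustration and may differ from what id_transform actually does.

```python
import re


def slugify(text):
    """Turn heading text into a lowercase, hyphen-separated ID (assumed rules)."""
    slug = text.strip().lower()
    # Drop everything except word characters, whitespace, and hyphens.
    slug = re.sub(r"[^\w\s-]", "", slug)
    # Collapse each run of whitespace into a single hyphen.
    slug = re.sub(r"\s+", "-", slug)
    return slug


print(slugify("Hello World"))
```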

SetextHeading

SetextHeading parses one or more lines of paragraph text underlined with === (level 1) or --- (level 2) as a heading.

Title
-----

Output node is Heading, and its AST is:

{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "children": [
        {
          "type": "text",
          "value": "Title"
        }
      ],
      "depth": 2
    }
  ]
}
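
The underline check can be sketched as follows. This is a simplified illustration, not the rule's implementation: `SETEXT_UNDERLINE_RE` and `parse_setext_heading` are hypothetical names, the caller is assumed to pass the paragraph lines followed by the candidate underline, and the heading text is kept as plain text.

```python
import re

# An underline is a run of "=" or a run of "-", indented at most three spaces.
SETEXT_UNDERLINE_RE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def parse_setext_heading(lines):
    """lines: paragraph lines plus a final underline line. Returns a node or None."""
    if len(lines) < 2:
        return None
    match = SETEXT_UNDERLINE_RE.match(lines[-1])
    if match is None:
        return None
    depth = 1 if match.group(1)[0] == "=" else 2
    text = "\n".join(line.strip() for line in lines[:-1])
    return {
        "type": "heading",
        "children": [{"type": "text", "value": text}],
        "depth": depth,
    }


print(parse_setext_heading(["Title", "-----"]))
```

In CommonMark, a --- line directly below paragraph text is read as a setext underline rather than a thematic break, which is why the example above produces a level 2 heading.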

Option example: use SetextHeading(id_transform=True) to add generated heading IDs.

Hello World
===========

With this option enabled, the AST is:

{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "data": {
        "id": "hello-world"
      },
      "children": [
        {
          "type": "text",
          "value": "Hello World"
        }
      ],
      "depth": 1
    }
  ]
}

ThematicBreak

ThematicBreak parses thematic breaks (horizontal rules) written as three or more -, *, or _ characters, such as ---, ***, or ___.

---

Output node is ThematicBreak, and its AST is:

{
  "type": "root",
  "children": [
    {
      "type": "thematicBreak"
    }
  ]
}
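
A minimal sketch of the line test, following the CommonMark rule that a thematic break is three or more matching -, *, or _ characters with optional spaces between them. The names `THEMATIC_BREAK_RE` and `is_thematic_break` are hypothetical.

```python
import re

# One marker character, then at least two more of the same character.
# Spaces or tabs may appear between markers; indentation is at most three spaces.
THEMATIC_BREAK_RE = re.compile(r"^ {0,3}([-*_])[ \t]*(?:\1[ \t]*){2,}$")


def is_thematic_break(line):
    """Return True if the line is a thematic break on its own."""
    return THEMATIC_BREAK_RE.match(line) is not None


for sample in ["---", "***", "___", "- - -", "--", "-*-"]:
    print(repr(sample), is_thematic_break(sample))
```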

FencedCode

FencedCode parses fenced code blocks opened by backtick or tilde fences.

```python
print(1)
```

Output node is Code, and its AST is:

{
  "type": "root",
  "children": [
    {
      "type": "code",
      "value": "print(1)\n",
      "lang": "python"
    }
  ]
}
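
The fence handling can be illustrated with the sketch below. It assumes the input starts at the opening fence, takes the first word of the info string as lang, and ends the block at a closing fence of the same character that is at least as long as the opening one. `OPEN_FENCE_RE` and `parse_fenced_code` are hypothetical names, and details such as stripping the fence's indentation from content lines are left out.

```python
import re

# Opening fence: three or more backticks or tildes, then an optional info string.
OPEN_FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})[ \t]*([^\s`]*)")


def parse_fenced_code(lines):
    """lines: list of lines starting at the opening fence. Returns a node or None."""
    if not lines:
        return None
    opening = OPEN_FENCE_RE.match(lines[0])
    if opening is None:
        return None
    fence = opening.group(1)
    # Closing fence: same character, at least as many of them, nothing else.
    close_re = re.compile(
        r"^ {0,3}" + re.escape(fence[0]) + "{" + str(len(fence)) + r",}[ \t]*$"
    )
    body = []
    for line in lines[1:]:
        if close_re.match(line):
            break
        body.append(line)
    # Mirror the example above: each content line keeps its trailing newline.
    node = {"type": "code", "value": "".join(line + "\n" for line in body)}
    if opening.group(2):
        node["lang"] = opening.group(2)
    return node
```

Because the closing fence must use the same character as the opening one, a tilde-fenced block can contain backtick fences as ordinary content.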

IndentedCode

IndentedCode parses code blocks indented by four spaces or one tab.

    print(1)

Output node is Code, and its AST is:

{
  "type": "root",
  "children": [
    {
      "type": "code",
      "value": "print(1)\n"
    }
  ]
}
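
A simplified sketch of the indentation rule: each line loses its leading four spaces or single tab, and the block ends at the first line without that indentation. The helper names are hypothetical, and blank lines inside the block and mixed space and tab indentation are deliberately ignored here.

```python
def strip_code_indent(line):
    """Remove one level of code indentation, or return None if there is none."""
    if line.startswith("    "):
        return line[4:]
    if line.startswith("\t"):
        return line[1:]
    return None


def parse_indented_code(lines):
    """Collect leading indented lines into a code node, or return None."""
    body = []
    for line in lines:
        stripped = strip_code_indent(line)
        if stripped is None:
            break
        body.append(stripped)
    if not body:
        return None
    return {"type": "code", "value": "".join(line + "\n" for line in body)}


print(parse_indented_code(["    print(1)"]))
```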

HtmlBlock

HtmlBlock parses raw HTML blocks, recognized by the CommonMark HTML block start conditions.

<div>Hi</div>

Output node is Html, and its AST is:

{
  "type": "root",
  "children": [
    {
      "type": "html",
      "value": "<div>Hi</div>\n"
    }
  ]
}

Option example: use HtmlBlock(disallowed_tags=["script"]) to escape selected tags during parsing.

<script>alert(1)</script>

With this option enabled, the AST is:

{
  "type": "root",
  "children": [
    {
      "type": "html",
      "data": {
        "escaped": true
      },
      "value": "&lt;script>alert(1)&lt;/script>\n"
    }
  ]
}
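
The escaped value above replaces only the leading < of each disallowed tag, in the style of the GFM tagfilter extension. One way to reproduce that output is sketched below. The function `escape_disallowed_tags` is a hypothetical helper, not the rule's actual code, and it assumes tag names are matched case-insensitively.

```python
import re


def escape_disallowed_tags(html, disallowed_tags):
    """Replace the leading "<" of opening and closing disallowed tags with "&lt;"."""
    if not disallowed_tags:
        return html
    names = "|".join(re.escape(tag) for tag in disallowed_tags)
    # The lookahead keeps longer names such as <scripted> from matching "script".
    pattern = re.compile(r"<(/?(?:" + names + r"))(?=[\s/>]|$)", re.IGNORECASE)
    return pattern.sub(r"&lt;\1", html)


print(escape_disallowed_tags("<script>alert(1)</script>\n", ["script"]))
```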

Blockquote

Blockquote parses block quote containers, whose lines begin with >.

> *quote*

Output node is Blockquote, and its AST is:

{
  "type": "root",
  "children": [
    {
      "type": "blockquote",
      "children": [
        {
          "type": "paragraph",
          "children": [
            {
              "type": "emphasis",
              "children": [
                {
                  "type": "text",
                  "value": "quote"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
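
Since a block quote is a container, its content is parsed again as blocks after the > prefix is removed from each line. A minimal sketch of that prefix stripping, with hypothetical names and no handling of lazy continuation lines:

```python
import re

# Up to three spaces of indentation, ">", and one optional following space.
QUOTE_PREFIX_RE = re.compile(r"^ {0,3}> ?")


def strip_quote_prefix(line):
    """Return the line without its block quote prefix, or None if it has none."""
    match = QUOTE_PREFIX_RE.match(line)
    if match is None:
        return None
    return line[match.end():]


print(strip_quote_prefix("> *quote*"))
```

The stripped text, *quote* in the example above, is then handed to the paragraph and inline rules, which produce the nested emphasis node.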

List

List parses bullet and ordered lists.

- *item*

Output nodes are List and ListItem, and their AST is:

{
  "type": "root",
  "children": [
    {
      "type": "list",
      "children": [
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "type": "emphasis",
                  "children": [
                    {
                      "type": "text",
                      "value": "item"
                    }
                  ]
                }
              ]
            }
          ],
          "spread": false
        }
      ],
      "ordered": false,
      "spread": false
    }
  ]
}
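
The distinction between bullet and ordered lists comes from the marker that opens each item. The sketch below classifies a single line using the CommonMark marker forms (-, +, or * for bullets, and one to nine digits followed by . or ) for ordered items). `LIST_MARKER_RE` and `parse_list_marker` are hypothetical names, and precedence against other rules, such as "- - -" being a thematic break, is not modeled.

```python
import re

# Bullet marker, or 1-9 digits plus "." or ")", followed by whitespace or end of line.
LIST_MARKER_RE = re.compile(r"^ {0,3}(?:([-+*])|(\d{1,9})([.)]))(?:[ \t]+|$)")


def parse_list_marker(line):
    """Describe the list marker that opens the line, or return None."""
    match = LIST_MARKER_RE.match(line)
    if match is None:
        return None
    content = line[match.end():]
    if match.group(1) is not None:
        return {"ordered": False, "marker": match.group(1), "content": content}
    return {
        "ordered": True,
        "start": int(match.group(2)),
        "marker": match.group(3),
        "content": content,
    }


print(parse_list_marker("- *item*"))
```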

Option example: use List(task=True) to parse GFM task list markers.

- [x] done
- [ ] todo

With this option enabled, the AST is:

{
  "type": "root",
  "children": [
    {
      "type": "list",
      "children": [
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "type": "text",
                  "value": "done"
                }
              ]
            }
          ],
          "checked": true,
          "spread": false
        },
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "type": "text",
                  "value": "todo"
                }
              ]
            }
          ],
          "checked": false,
          "spread": false
        }
      ],
      "ordered": false,
      "spread": false
    }
  ]
}
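
The checked field can be derived from the start of each item's text. The sketch below assumes the GFM form of the marker: [ ] for unchecked, [x] or [X] for checked, followed by whitespace. `TASK_RE` and `split_task_marker` are hypothetical names used only for illustration.

```python
import re

# "[ ]", "[x]", or "[X]" at the start of the item text, followed by whitespace.
TASK_RE = re.compile(r"^\[([ xX])\][ \t]+")


def split_task_marker(text):
    """Return (checked, remaining_text); checked is None when there is no marker."""
    match = TASK_RE.match(text)
    if match is None:
        return None, text
    return match.group(1) != " ", text[match.end():]


print(split_task_marker("[x] done"))
print(split_task_marker("[ ] todo"))
```

Items without a marker keep their text unchanged and would simply omit the checked field.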