Core block rules
Block-level rules for CommonMark-style document structure.
AtxHeading
AtxHeading parses hash-prefixed ATX headings from level 1 through level 6.
# Title
Output node is Heading, and its AST is:
{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "children": [
        {
          "type": "text",
          "value": "Title"
        }
      ],
      "depth": 1
    }
  ]
}
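To make the matching rule concrete, here is a minimal sketch of how a line could be recognized as an ATX heading and turned into the heading node shown above. The helper name parse_atx_heading and the regular expression are illustrative assumptions, not the library's implementation, and the pattern is a simplification of the CommonMark rules (it does not parse inline content).

```python
import re

# Hypothetical helper for illustration only; not part of the documented API.
# Up to three spaces of indentation, one to six hashes, then either the end of
# the line or whitespace followed by the heading text and an optional closing
# run of hashes.
ATX_RE = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+(.*?))?(?:[ \t]+#+)?[ \t]*$")


def parse_atx_heading(line):
    """Return a heading node for an ATX heading line, or None if it does not match."""
    match = ATX_RE.match(line)
    if match is None:
        return None
    text = match.group(2) or ""
    return {
        "type": "heading",
        # Inline parsing is skipped here; the text is kept as a single text node.
        "children": [{"type": "text", "value": text}] if text else [],
        "depth": len(match.group(1)),
    }


print(parse_atx_heading("# Title"))
print(parse_atx_heading("####### seven hashes"))  # more than six hashes is not a heading
```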
Option example: use AtxHeading(id_transform=True) to add generated heading
IDs.
# Hello World
{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "data": {
        "id": "hello-world"
      },
      "children": [
        {
          "type": "text",
          "value": "Hello World"
        }
      ],
      "depth": 1
    }
  ]
}
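The exact ID algorithm is not described here. As a rough sketch, an ID such as "hello-world" could be derived from the heading text as follows; the slugify function and its rules (lowercase, drop punctuation, hyphenate whitespace) are assumptions chosen to reproduce the example above, and the library may handle duplicates or non-ASCII text differently.

```python
import re


def slugify(text):
    """Build a heading ID from text (an assumed scheme, not the library's own)."""
    lowered = text.strip().lower()
    # Drop everything except word characters, whitespace, and hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", lowered)
    # Collapse runs of whitespace, underscores, and hyphens into one hyphen.
    return re.sub(r"[\s_-]+", "-", cleaned).strip("-")


print(slugify("Hello World"))
```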
SetextHeading
SetextHeading parses paragraph text underlined with === or --- as a heading:
=== produces level 1 and --- produces level 2.
Title
-----
Output node is Heading, and its AST is:
{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "children": [
        {
          "type": "text",
          "value": "Title"
        }
      ],
      "depth": 2
    }
  ]
}
Option example: use SetextHeading(id_transform=True) to add generated heading
IDs.
Hello World
===========
{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "data": {
        "id": "hello-world"
      },
      "children": [
        {
          "type": "text",
          "value": "Hello World"
        }
      ],
      "depth": 1
    }
  ]
}
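The two examples above differ only in the underline character, which determines the depth. A minimal sketch of that mapping, assuming a hypothetical helper named setext_depth (not part of the documented API):

```python
import re

# An underline is a run of only "=" or only "-", indented at most three
# spaces, with optional trailing whitespace.
SETEXT_UNDERLINE_RE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def setext_depth(underline):
    """Return 1 for an "=" underline, 2 for a "-" underline, else None."""
    match = SETEXT_UNDERLINE_RE.match(underline)
    if match is None:
        return None
    return 1 if match.group(1)[0] == "=" else 2


print(setext_depth("-----"))
print(setext_depth("==========="))
```

A real parser also has to check that the preceding lines form a paragraph; without one, a line such as --- is a thematic break instead.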
ThematicBreak
ThematicBreak parses horizontal rules made from ---, ***, or ___.
---
Output node is ThematicBreak, and its AST is:
{
  "type": "root",
  "children": [
    {
      "type": "thematicBreak"
    }
  ]
}
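As an illustration of the line shape involved, the following sketch tests whether a line is a thematic break under the CommonMark rule: three or more matching -, *, or _ characters, optionally separated by spaces or tabs. The name is_thematic_break is hypothetical and the library's own matcher may be structured differently.

```python
import re

# The first marker is captured, then the same marker must repeat at least
# twice more; spaces and tabs between markers are allowed.
THEMATIC_BREAK_RE = re.compile(r"^ {0,3}([-*_])[ \t]*(?:\1[ \t]*){2,}$")


def is_thematic_break(line):
    """Return True if the line consists of three or more matching markers."""
    return THEMATIC_BREAK_RE.match(line) is not None


for sample in ["---", "* * *", "--", "-*-"]:
    print(repr(sample), is_thematic_break(sample))
```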
FencedCode
FencedCode parses fenced code blocks opened by backtick or tilde fences.
```python
print(1)
```
Output node is Code, and its AST is:
{
  "type": "root",
  "children": [
    {
      "type": "code",
      "value": "print(1)\n",
      "lang": "python"
    }
  ]
}
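The sketch below shows one way the fence, info string, and body could be separated to produce a code node like the one above. It is a simplified illustration under stated assumptions: parse_fenced_code is a hypothetical name, only the first word of the info string is kept as lang, and indentation of the opening fence is not removed from the body.

```python
import re

# Opening fence: at least three backticks or tildes, then an optional info string.
FENCE_OPEN_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})[ \t]*([^\s`]*)")


def parse_fenced_code(source):
    """Return a code node for a fenced block starting on the first line, or None."""
    lines = source.splitlines()
    if not lines:
        return None
    opening = FENCE_OPEN_RE.match(lines[0])
    if opening is None:
        return None
    fence, lang = opening.group(1), opening.group(2)
    # The closing fence uses the same character and is at least as long.
    closing_re = re.compile(
        r"^ {0,3}" + re.escape(fence[0]) + "{%d,}" % len(fence) + r"[ \t]*$"
    )
    body = []
    for line in lines[1:]:
        if closing_re.match(line):
            break
        body.append(line + "\n")
    node = {"type": "code", "value": "".join(body)}
    if lang:
        node["lang"] = lang
    return node


fence = "`" * 3  # built at runtime so this example contains no literal fence
print(parse_fenced_code(fence + "python\nprint(1)\n" + fence))
```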
IndentedCode
IndentedCode parses code blocks indented by four spaces or one tab.
    print(1)
Output node is Code, and its AST is:
{
  "type": "root",
  "children": [
    {
      "type": "code",
      "value": "print(1)\n"
    }
  ]
}
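A minimal sketch of the indentation rule, assuming a hypothetical helper parse_indented_code that receives the candidate lines. It strips one level of indentation (four spaces or one tab) and stops at the first line that is not indented; blank lines inside the block and partial tab stops are not handled.

```python
def parse_indented_code(lines):
    """Return a code node built from leading indented lines, or None."""
    body = []
    for line in lines:
        if line.startswith("    "):
            body.append(line[4:] + "\n")  # drop four spaces
        elif line.startswith("\t"):
            body.append(line[1:] + "\n")  # drop one tab
        else:
            break  # first non-indented line ends the block
    if not body:
        return None
    return {"type": "code", "value": "".join(body)}


print(parse_indented_code(["    print(1)"]))
```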
HtmlBlock
HtmlBlock parses raw HTML blocks, recognized by the CommonMark HTML block
start conditions.
<div>Hi</div>
Output node is Html, and its AST is:
{
  "type": "root",
  "children": [
    {
      "type": "html",
      "value": "<div>Hi</div>\n"
    }
  ]
}
Option example: use HtmlBlock(disallowed_tags=["script"]) to escape selected
tags during parsing.
<script>alert(1)</script>
{
  "type": "root",
  "children": [
    {
      "type": "html",
      "data": {
        "escaped": true
      },
      "value": "<script>alert(1)</script>\n"
    }
  ]
}
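In the output above the node keeps its raw value and gains a data.escaped flag. How the library decides this internally is not documented here; the sketch below is one plausible reading, with a hypothetical mark_disallowed helper that checks only the first tag name in the block against the disallowed list.

```python
import re

# Name of the first opening or closing tag in the block.
TAG_NAME_RE = re.compile(r"^ {0,3}</?([A-Za-z][A-Za-z0-9-]*)")


def mark_disallowed(node, disallowed_tags):
    """Return a copy of an html node, flagged when its first tag is disallowed."""
    match = TAG_NAME_RE.match(node["value"])
    blocked = {tag.lower() for tag in disallowed_tags}
    if match is None or match.group(1).lower() not in blocked:
        return dict(node)
    # Assumption: the flag mirrors the "escaped": true field in the example AST.
    return {**node, "data": {"escaped": True}}


node = {"type": "html", "value": "<script>alert(1)</script>\n"}
print(mark_disallowed(node, ["script"]))
```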
Blockquote
Blockquote parses >-prefixed blockquote containers.
> *quote*
Output node is Blockquote, and its AST is:
{
  "type": "root",
  "children": [
    {
      "type": "blockquote",
      "children": [
        {
          "type": "paragraph",
          "children": [
            {
              "type": "emphasis",
              "children": [
                {
                  "type": "text",
                  "value": "quote"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
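Because a blockquote is a container, its content is parsed again as blocks once the > markers are removed, which is why the AST above nests a paragraph inside the blockquote. A minimal sketch of the marker-stripping step, assuming a hypothetical strip_quote_prefix helper; lazy continuation lines (lines without a marker that still belong to the quote) are deliberately ignored.

```python
import re

# Up to three spaces, the ">" marker, and one optional following space.
QUOTE_PREFIX_RE = re.compile(r"^ {0,3}> ?")


def strip_quote_prefix(lines):
    """Remove one level of ">" markers; stop at the first line without one."""
    inner = []
    for line in lines:
        match = QUOTE_PREFIX_RE.match(line)
        if match is None:
            break
        inner.append(line[match.end():])
    return inner


print(strip_quote_prefix(["> *quote*"]))
```

The returned lines would then be fed back through the block rules; a nested quote keeps its inner marker for the next pass.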
List
List parses bullet and ordered lists.
- *item*
Output nodes are List and ListItem, and their AST is:
{
  "type": "root",
  "children": [
    {
      "type": "list",
      "children": [
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "type": "emphasis",
                  "children": [
                    {
                      "type": "text",
                      "value": "item"
                    }
                  ]
                }
              ]
            }
          ],
          "spread": false
        }
      ],
      "ordered": false,
      "spread": false
    }
  ]
}
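Deeply nested output like this is usually consumed with a small recursive walk. The sketch below works on the plain dictionary form of the AST shown above (for example, after loading it with json.loads); iter_nodes and text_of are hypothetical helper names, not functions provided by the library.

```python
def iter_nodes(node):
    """Yield a node and all of its descendants in document order."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def text_of(node):
    """Concatenate the values of all text nodes below (and including) node."""
    return "".join(n["value"] for n in iter_nodes(node) if n["type"] == "text")


# The list AST from above, written compactly.
ast = {"type": "root", "children": [
    {"type": "list", "ordered": False, "spread": False, "children": [
        {"type": "listItem", "spread": False, "children": [
            {"type": "paragraph", "children": [
                {"type": "emphasis", "children": [
                    {"type": "text", "value": "item"}]}]}]}]}]}

print([n["type"] for n in iter_nodes(ast)])
print(text_of(ast))
```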
Option example: use List(task=True) to parse GFM task list markers.
- [x] done
- [ ] todo
{
  "type": "root",
  "children": [
    {
      "type": "list",
      "children": [
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "type": "text",
                  "value": "done"
                }
              ]
            }
          ],
          "checked": true,
          "spread": false
        },
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "type": "text",
                  "value": "todo"
                }
              ]
            }
          ],
          "checked": false,
          "spread": false
        }
      ],
      "ordered": false,
      "spread": false
    }
  ]
}
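In the task list output, the marker is removed from the item text and recorded as the checked field. A minimal sketch of that split, assuming a hypothetical split_task_marker helper applied to the item's text after the bullet has been removed:

```python
import re

# "[ ]", "[x]", or "[X]" at the start of the item, followed by whitespace.
TASK_MARKER_RE = re.compile(r"^\[([ xX])\][ \t]+")


def split_task_marker(text):
    """Return (checked, rest); checked is None when there is no task marker."""
    match = TASK_MARKER_RE.match(text)
    if match is None:
        return None, text
    return match.group(1) != " ", text[match.end():]


for item in ["[x] done", "[ ] todo", "plain item"]:
    print(split_task_marker(item))
```

Returning None for ordinary items keeps them distinguishable from unchecked tasks, which matches the AST above, where only task items carry a checked field.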