note_splitter.tokens module
The class definitions for all the tokens.
See the hierarchy of all the token types here: https://note-splitter.readthedocs.io/en/latest/token-hierarchy.html
Each token class has a content property. If the token is not a combination of other
tokens, that content property is a string of the original content of the raw line of
text. Otherwise, the content property is the list of subtokens. Each token class
also has a boolean class variable (not an instance variable) named HAS_PATTERN. If
HAS_PATTERN is True, the class has a corresponding regular expression in
patterns.py.
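The content and HAS_PATTERN conventions described above can be illustrated with a minimal sketch. The classes below are simplified stand-ins written for this example, not the library's actual implementation; only the names Token, Line, Block, Header, content, and HAS_PATTERN come from this page, and the private attribute _content is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class Token(ABC):
    """Stand-in for the documented base class (hypothetical implementation)."""

    HAS_PATTERN = False  # class variable, not an instance variable

    @property
    @abstractmethod
    def content(self) -> Any:
        ...


class Line(Token):
    """A single-line token: content is the raw line as a string."""

    def __init__(self, line: str = "") -> None:
        self._content = line

    @property
    def content(self) -> str:
        return self._content


class Header(Line):
    HAS_PATTERN = True  # such a class would have a regex in patterns.py


class Block(Token):
    """A token built from other tokens: content is a list of subtokens."""

    def __init__(self, tokens_: Optional[list] = None) -> None:
        self._content = tokens_ if tokens_ is not None else []

    @property
    def content(self) -> list:
        return self._content


header = Header("# Title")
block = Block([header])
```

Here header.content is a string, block.content is a list, and HAS_PATTERN is read from the class rather than from an instance.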
- class note_splitter.tokens.Block
Bases:
note_splitter.tokens.Token
The ABC for tokens that are each a combination of tokens.
- __bool__()
Returns whether the token’s content is empty.
- __contains__(item: note_splitter.tokens.Token) → bool
Returns whether the token’s content contains an item.
- __delitem__(index: int) → None
Deletes the token at the given index.
- __getitem__(index: int) → note_splitter.tokens.Token
Returns the token at the given index.
- __iter__()
Returns an iterator for the token’s content.
- __len__()
Returns the length of the token’s content.
- __setitem__(index: int, token: note_splitter.tokens.Token) → None
Sets the token at the given index to the given token.
- __str__()
Returns the original content of the token’s raw text.
- append(token: note_splitter.tokens.Token) → None
Appends the given token to the block's content.
- property content: list[typing.Any]
- insert(index: int, token: note_splitter.tokens.Token) → None
Inserts the given token at the given index.
- remove(token: note_splitter.tokens.Token) → None
Removes the given token from the block's content.
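The container methods listed for Block can be sketched with a list-backed class. This is a hypothetical stand-in, not the library's code: the name BlockSketch is invented for this example, the truthiness rule in __bool__ and the newline join in __str__ are assumptions, and plain strings stand in for tokens.

```python
from typing import Any, Iterator, Optional


class BlockSketch:
    """Hypothetical list-backed container mirroring the methods listed above."""

    def __init__(self, tokens_: Optional[list] = None) -> None:
        self.content = list(tokens_) if tokens_ else []

    def __bool__(self) -> bool:
        # Assumption: truthy when the content holds at least one token.
        return bool(self.content)

    def __contains__(self, item: Any) -> bool:
        return item in self.content

    def __delitem__(self, index: int) -> None:
        del self.content[index]

    def __getitem__(self, index: int) -> Any:
        return self.content[index]

    def __iter__(self) -> Iterator[Any]:
        return iter(self.content)

    def __len__(self) -> int:
        return len(self.content)

    def __setitem__(self, index: int, token: Any) -> None:
        self.content[index] = token

    def __str__(self) -> str:
        # Assumption: raw text is each subtoken's text joined by newlines.
        return "\n".join(str(token) for token in self.content)

    def append(self, token: Any) -> None:
        self.content.append(token)

    def insert(self, index: int, token: Any) -> None:
        self.content.insert(index, token)

    def remove(self, token: Any) -> None:
        self.content.remove(token)


# Plain strings stand in for tokens to keep the sketch self-contained.
block = BlockSketch(["# Title", "Some text"])
block.append("More text")
block.insert(1, "")
del block[1]
block[0] = "# New title"
```

Delegating every operation to the underlying list keeps the block's behavior consistent with ordinary Python sequences.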
- class note_splitter.tokens.Blockquote(line: str = '')
Bases:
note_splitter.tokens.CanHaveInlineElements
A single-line quote.
- content
The content of the line of text.
- Type
str
- level
The number of spaces of indentation.
- Type
int
- HAS_PATTERN = True
- class note_splitter.tokens.BlockquoteBlock(tokens_: Optional[list[typing.Any]] = None)
Bases:
note_splitter.tokens.Block
Multiple lines of blockquotes.
- content
The consecutive blockquote tokens.
- Type
list[Blockquote]
- class note_splitter.tokens.CanHaveInlineElements(line: str = '')
Bases:
note_splitter.tokens.Line
The ABC for single-line tokens that can have inline elements.
- class note_splitter.tokens.Code(line: str = '')
Bases:
note_splitter.tokens.Fenced
A line of code inside a code block.
- content
The content of the line of text.
- Type
str
- class note_splitter.tokens.CodeBlock(tokens_: Optional[list[typing.Any]] = None)
Bases:
note_splitter.tokens.Block
A multi-line code block.
- content
The code block’s code fence tokens surrounding code token(s).
- language
Any text that follows the triple backticks (or tildes) on the line of the opening code fence. Surrounding whitespace characters are removed.
- Type
str
- class note_splitter.tokens.CodeFence(line: str = '')
Bases:
note_splitter.tokens.Fence
The delimiter of a multi-line code block.
- content
The content of the line of text.
- Type
str
- language
Any text that follows the triple backticks (or triple tildes). Surrounding whitespace characters are removed. This will be an empty string if there are no non-whitespace characters after the triple backticks/tildes.
- Type
str
- HAS_PATTERN = True
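The language attribute described above can be sketched as follows. The regular expression and the function name fence_language are hypothetical, written only for this illustration; the actual pattern lives in patterns.py and may differ.

```python
import re

# Hypothetical pattern: optional indentation, three backticks or three
# tildes, then whatever text follows on the same line.
_FENCE = re.compile(r"^\s*(?:`{3}|~{3})(.*)$")


def fence_language(line: str) -> str:
    """Return the text after the fence, stripped of surrounding whitespace."""
    match = _FENCE.match(line)
    if match is None:
        raise ValueError(f"not a code fence: {line!r}")
    return match.group(1).strip()


backticks = "`" * 3  # built at runtime to avoid a literal fence in this example
python_fence = fence_language(backticks + "python")
tilde_fence = fence_language("~~~  js  ")
closing_fence = fence_language(backticks)
```

A closing fence has no text after the delimiter, so its language is an empty string, as the description above states.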
- class note_splitter.tokens.EmptyLine(line: str = '')
Bases:
note_splitter.tokens.Line
A line with either whitespace characters or nothing.
- content
The content of the line of text.
- Type
str
- HAS_PATTERN = True
- class note_splitter.tokens.Fence
Bases:
note_splitter.tokens.Line
The ABC for tokens that block fences are made out of.
- class note_splitter.tokens.Fenced
Bases:
note_splitter.tokens.Line
The ABC for tokens that are between Fence tokens.
- class note_splitter.tokens.Footnote(line: str = '')
Bases:
note_splitter.tokens.CanHaveInlineElements
A footnote (not a footnote reference).
- content
The content of the line of text.
- Type
str
- reference
The footnote’s reference that may appear in other parts of the document.
- Type
str
- HAS_PATTERN = True
- class note_splitter.tokens.Header(line: str = '')
Bases:
note_splitter.tokens.CanHaveInlineElements
A header (i.e. a title).
- content
The content of the line of text.
- Type
str
- body
The content of the line of text not including the header symbol(s) and their following whitespace character(s).
- Type
str
- level
The header level. A header level of 1 is the largest possible header.
- Type
int
- HAS_PATTERN = True
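The relationship between content, body, and level can be sketched as below. This assumes the header symbols are leading # characters; the regular expression and the function name split_header are hypothetical and are not taken from patterns.py.

```python
import re

# Hypothetical pattern: one or more '#' symbols, whitespace, then the body.
_HEADER = re.compile(r"^(#+)\s+(.*)$")


def split_header(line: str) -> tuple:
    """Return (level, body) for a header line, under the assumptions above."""
    match = _HEADER.match(line)
    if match is None:
        raise ValueError(f"not a header: {line!r}")
    level = len(match.group(1))  # number of header symbols
    body = match.group(2)  # text after the symbols and their whitespace
    return level, body


level, body = split_header("## Setup notes")
```

In this sketch the full line "## Setup notes" corresponds to content, while level and body are derived from it.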
- class note_splitter.tokens.HorizontalRule(line: str = '')
Bases:
note_splitter.tokens.Line
A horizontal rule.
- content
The content of the line of text.
- Type
str
- HAS_PATTERN = True
- class note_splitter.tokens.Line(line: str = '')
Bases:
note_splitter.tokens.Token
The ABC for tokens that take up one line of a file.
- property content: str
- class note_splitter.tokens.Math(line: str = '')
Bases:
note_splitter.tokens.Fenced
A line of math inside a math block.
- content
The content of the line of text.
- Type
str
- class note_splitter.tokens.MathBlock(tokens_: Optional[list[typing.Any]] = None)
Bases:
note_splitter.tokens.Block
A multi-line math block.
Inline math blocks are not supported (the opening and closing math fences must be on different lines).
- class note_splitter.tokens.MathFence(line: str = '')
Bases:
note_splitter.tokens.Fence
The delimiter of a multi-line math block.
- content
The content of the line of text.
- Type
str
- HAS_PATTERN = True
- class note_splitter.tokens.OrderedListItem(line: str = '')
Bases:
note_splitter.tokens.TextListItem, note_splitter.tokens.CanHaveInlineElements
An item in an ordered list.
- content
The content of the line of text.
- Type
str
- level
The number of spaces of indentation.
- Type
int
- HAS_PATTERN = True
- class note_splitter.tokens.Section(tokens_: Optional[list[typing.Any]] = None)
Bases:
note_splitter.tokens.Block
A file section starting with a token of the chosen split type.
The Splitter returns a list of Sections. Section tokens never contain section tokens, but may contain tokens of any and all other types.
- class note_splitter.tokens.Table(tokens_: Optional[list[typing.Any]] = None)
Bases:
note_splitter.tokens.Block
A table.
- content
The table’s row token(s) and possibly divider token(s).
- Type
list[Union[TableRow, TableDivider]]
- class note_splitter.tokens.TableDivider(line: str = '')
Bases:
note_splitter.tokens.TablePart
The part of a table that divides the table’s header from its body.
- content
The content of the line of text.
- Type
str
- HAS_PATTERN = True
- class note_splitter.tokens.TablePart
Bases:
note_splitter.tokens.Line
The ABC for tokens that tables are made out of.
- class note_splitter.tokens.TableRow(line: str = '')
Bases:
note_splitter.tokens.TablePart
A row of a table.
- content
The content of the line of text.
- Type
str
- HAS_PATTERN = True
- class note_splitter.tokens.Task(line: str = '')
Bases:
note_splitter.tokens.TextListItem, note_splitter.tokens.CanHaveInlineElements
A to-do list item that is either checked or unchecked.
- content
The content of the line of text.
- Type
str
- level
The number of spaces of indentation.
- Type
int
- is_done
Whether the task is done (whether the box is checked).
- Type
bool
- HAS_PATTERN = True
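How is_done might be derived from a task line can be sketched as follows. This assumes the common Markdown checkbox syntax of a bullet followed by [ ] or [x]; the regular expression and the function name task_is_done are hypothetical and not taken from patterns.py.

```python
import re

# Hypothetical pattern: optional indentation, a bullet, then a checkbox.
_TASK = re.compile(r"^\s*[-*+]\s+\[([ xX])\]\s+")


def task_is_done(line: str) -> bool:
    """Return True if the checkbox on a task line is checked."""
    match = _TASK.match(line)
    if match is None:
        raise ValueError(f"not a task: {line!r}")
    return match.group(1) != " "  # a space inside the box means unchecked


checked = task_is_done("- [x] ship the release")
unchecked = task_is_done("    - [ ] write the docs")
```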
- class note_splitter.tokens.Text(line: str = '')
Bases:
note_splitter.tokens.CanHaveInlineElements
Normal text.
This class is the catch-all for individual lines of text that don’t fall into any other category.
- content
The content of the line of text.
- Type
str
- level
The number of spaces of indentation.
- Type
int
- class note_splitter.tokens.TextList(tokens_: Optional[list[typing.Any]] = None)
Bases:
note_splitter.tokens.Block
A list that is numbered, bullet-pointed, and/or checkboxed.
A single text list may have any combination of ordered list items, unordered list items, tasks, and other text lists with more indentation.
- content
The tokens that make up the list. Lists may have sublists.
- Type
list[Union[TextListItem, "TextList"]]
- level
The number of spaces of indentation of the first item in the list.
- Type
int
- class note_splitter.tokens.TextListItem
Bases:
note_splitter.tokens.Line
The ABC for text list item tokens.
- class note_splitter.tokens.Token
Bases:
abc.ABC
The abstract base class (ABC) for all tokens.
- HAS_PATTERN = False
- __str__()
Returns the original content of the token’s raw text.
- property content: Any
- class note_splitter.tokens.UnorderedListItem(line: str = '')
Bases:
note_splitter.tokens.TextListItem, note_splitter.tokens.CanHaveInlineElements
An item in a bullet point list.
The list can have bullet points as asterisks, minuses, and/or pluses.
- content
The content of the line of text.
- Type
str
- level
The number of spaces of indentation.
- Type
int
- HAS_PATTERN = True
- note_splitter.tokens.__is_token_type(obj: Any) → bool
Returns True if obj is a Token type.
- Parameters
obj (Any) – The object to test.
- note_splitter.tokens._get_indentation_level(line: str) → int
Counts the spaces at the start of the line.
If there are tabs instead, each tab is counted as 4 spaces. This function assumes tabs and spaces are not mixed.
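The counting rule described above can be sketched as follows. This is an illustrative reimplementation under the stated rule, not the library's code, and the public name indentation_level is hypothetical. Unlike the documented function, which assumes tabs and spaces are not mixed, this sketch simply adds the two counts together.

```python
def indentation_level(line: str) -> int:
    """Count leading spaces, treating each leading tab as 4 spaces."""
    stripped = line.lstrip(" \t")
    leading = line[: len(line) - len(stripped)]  # the indentation only
    return leading.count(" ") + 4 * leading.count("\t")


spaces_level = indentation_level("    - item")
tabs_level = indentation_level("\t\t- item")
no_indent_level = indentation_level("- item")
```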
- note_splitter.tokens.get_all_token_types(tokens_module: module) → list[type[note_splitter.tokens.Token]]
Gets the list of all token types.
Call the function like this:
tokens.get_all_token_types(tokens).
- Parameters
tokens_module (ModuleType) – The module containing the token types. Only one argument is correct: the tokens module itself. The argument is required because there does not seem to be another way to get the list of token types automatically from within the file that defines them.
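One way a function like this could collect token types from a module is with the standard inspect module. The sketch below is an assumption about the general approach, not the library's implementation; the stand-in classes, the demo module, and the name get_all_token_types_sketch are all hypothetical.

```python
import inspect
from abc import ABC
from types import ModuleType


class Token(ABC):
    """Stand-in base class for this example."""


class Line(Token):
    """Stand-in single-line token."""


class Header(Line):
    """Stand-in header token."""


def get_all_token_types_sketch(tokens_module: ModuleType) -> list:
    """Return every class in the module that is Token or a subclass of it."""
    return [
        obj
        for _, obj in inspect.getmembers(tokens_module, inspect.isclass)
        if issubclass(obj, Token)
    ]


# Build a throwaway module so that the example is self-contained.
demo = ModuleType("demo_tokens")
demo.Token, demo.Line, demo.Header = Token, Line, Header
demo.helper = len  # a non-class member that must be filtered out

token_types = get_all_token_types_sketch(demo)
```

Passing the module as an argument mirrors the documented call tokens.get_all_token_types(tokens), where the module containing the types is handed to the function explicitly.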