note_splitter.tokens module

The class definitions for all the tokens.

See the hierarchy of all the token types here: https://note-splitter.readthedocs.io/en/latest/token-hierarchy.html

Each token class has a content property. For a token that is not a combination of other tokens, content is a string holding the original raw line of text. Otherwise, content is the list of subtokens. Each token class also has a boolean class variable (not an instance variable) named HAS_PATTERN. If HAS_PATTERN is True, the class has a corresponding regular expression in patterns.py.
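The two kinds of content and the class-level HAS_PATTERN flag can be pictured with a minimal sketch. The classes below are simplified stand-ins written for illustration, not the module's actual source; the constructor bodies and the private _content attribute are assumptions.

```python
from abc import ABC
from typing import Any, Optional


class Token(ABC):
    """Stand-in for the base token class (hypothetical, simplified)."""

    HAS_PATTERN = False  # class variable, shared by every instance

    @property
    def content(self) -> Any:
        return self._content


class Line(Token):
    """A token built from one raw line of text; content is a str."""

    def __init__(self, line: str = "") -> None:
        self._content = line


class Header(Line):
    # Assumed to signal that patterns.py holds a matching regular expression.
    HAS_PATTERN = True


class Block(Token):
    """A token built from other tokens; content is a list of subtokens."""

    def __init__(self, tokens_: Optional[list] = None) -> None:
        self._content = tokens_ if tokens_ is not None else []


header = Header("# Title")
block = Block([header])

print(type(header.content).__name__)  # str
print(type(block.content).__name__)   # list
print(Header.HAS_PATTERN, Block.HAS_PATTERN)  # True False
```

Because HAS_PATTERN lives on the class, it can be checked without creating an instance.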

class note_splitter.tokens.Block

Bases: note_splitter.tokens.Token

The abstract base class (ABC) for tokens that are combinations of other tokens.

__bool__()

Returns True if the token’s content is not empty.

__contains__(item: note_splitter.tokens.Token) -> bool

Returns whether the token’s content contains an item.

__delitem__(index: int) -> None

Deletes the token at the given index.

__getitem__(index: int) -> note_splitter.tokens.Token

Returns the token at the given index.

__iter__()

Returns an iterator for the token’s content.

__len__()

Returns the length of the token’s content.

__setitem__(index: int, token: note_splitter.tokens.Token) -> None

Sets the token at the given index to the given token.

__str__()

Returns the original content of the token’s raw text.

append(token: note_splitter.tokens.Token) -> None

Appends the given token to the block.

property content: list[typing.Any]

insert(index: int, token: note_splitter.tokens.Token) -> None

Inserts the given token at the given index.

remove(token: note_splitter.tokens.Token) -> None

Removes the given token from the block.
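Taken together, these methods make a block behave much like a Python list of its subtokens. The sketch below uses a hypothetical SimpleBlock class that forwards each operation to an internal list; it is an illustration of the interface, not the library's implementation, and it assumes the usual Python convention that a container with items is truthy. Plain strings stand in for tokens.

```python
class SimpleBlock:
    """Hypothetical stand-in for Block: a thin wrapper around a list."""

    def __init__(self, tokens_=None):
        self.content = list(tokens_) if tokens_ is not None else []

    def __bool__(self):
        # Assumption: truthy when there is at least one subtoken.
        return bool(self.content)

    def __contains__(self, item):
        return item in self.content

    def __delitem__(self, index):
        del self.content[index]

    def __getitem__(self, index):
        return self.content[index]

    def __iter__(self):
        return iter(self.content)

    def __len__(self):
        return len(self.content)

    def __setitem__(self, index, token):
        self.content[index] = token

    def append(self, token):
        self.content.append(token)

    def insert(self, index, token):
        self.content.insert(index, token)

    def remove(self, token):
        self.content.remove(token)


demo = SimpleBlock()
demo.append("second line")
demo.insert(0, "first line")
demo[1] = "replaced line"
print(list(demo))  # ['first line', 'replaced line']
print(len(demo), "first line" in demo)  # 2 True
```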

class note_splitter.tokens.Blockquote(line: str = '')

Bases: note_splitter.tokens.CanHaveInlineElements

A single-line quote.

content

The content of the line of text.

Type

str

level

The number of spaces of indentation.

Type

int

HAS_PATTERN = True

class note_splitter.tokens.BlockquoteBlock(tokens_: Optional[list[typing.Any]] = None)

Bases: note_splitter.tokens.Block

Multiple lines of blockquotes.

content

The consecutive blockquote tokens.

Type

list[Blockquote]

class note_splitter.tokens.CanHaveInlineElements(line: str = '')

Bases: note_splitter.tokens.Line

The ABC for single-line tokens that can have inline elements.

class note_splitter.tokens.Code(line: str = '')

Bases: note_splitter.tokens.Fenced

A line of code inside a code block.

content

The content of the line of text.

Type

str

class note_splitter.tokens.CodeBlock(tokens_: Optional[list[typing.Any]] = None)

Bases: note_splitter.tokens.Block

A multi-line code block.

content

The code block’s code fence tokens surrounding code token(s).

Type

list[Union[CodeFence, Code]]

language

Any text that follows the triple backticks (or tildes) on the line of the opening code fence. Surrounding whitespace characters are removed.

Type

str
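One way to picture how the language value could be derived from an opening fence line is shown below. The regular expression and the fence_language function are hypothetical; the library's actual pattern lives in patterns.py and may differ (for example, in how it treats indentation or longer fences).

```python
import re

# Hypothetical pattern: optional indentation, three or more backticks or
# tildes, then whatever follows on the same line.
FENCE = re.compile(r"^\s*(?:`{3,}|~{3,})(.*)$")


def fence_language(line: str) -> str:
    """Return the text after the fence characters, stripped of whitespace."""
    match = FENCE.match(line)
    if match is None:
        raise ValueError(f"not a code fence: {line!r}")
    return match.group(1).strip()


backtick_fence = "`" * 3  # built this way to keep the example easy to embed
print(fence_language(backtick_fence + "python"))  # python
print(fence_language("~~~  rust  "))              # rust
print(repr(fence_language(backtick_fence)))       # ''
```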

class note_splitter.tokens.CodeFence(line: str = '')

Bases: note_splitter.tokens.Fence

The delimiter of a multi-line code block.

content

The content of the line of text.

Type

str

language

Any text that follows the triple backticks (or triple tildes). Surrounding whitespace characters are removed. This will be an empty string if there are no non-whitespace characters after the triple backticks/tildes.

Type

str

HAS_PATTERN = True

class note_splitter.tokens.EmptyLine(line: str = '')

Bases: note_splitter.tokens.Line

A line that is empty or contains only whitespace characters.

content

The content of the line of text.

Type

str

HAS_PATTERN = True

class note_splitter.tokens.Fence

Bases: note_splitter.tokens.Line

The ABC for tokens that delimit fenced blocks.

class note_splitter.tokens.Fenced

Bases: note_splitter.tokens.Line

The ABC for tokens that are between Fence tokens.

class note_splitter.tokens.Footnote(line: str = '')

Bases: note_splitter.tokens.CanHaveInlineElements

A footnote (not a footnote reference).

content

The content of the line of text.

Type

str

reference

The footnote’s reference that may appear in other parts of the document.

Type

str

HAS_PATTERN = True

class note_splitter.tokens.Header(line: str = '')

Bases: note_splitter.tokens.CanHaveInlineElements

A header (i.e. a title).

content

The content of the line of text.

Type

str

body

The content of the line of text not including the header symbol(s) and their following whitespace character(s).

Type

str

level

The header level. A header level of 1 is the largest possible header.

Type

int
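As a rough illustration of how body and level relate to content, the sketch below parses a header line with a hypothetical regular expression. It assumes that level equals the number of leading # characters; the parse_header name and the pattern are illustrative, not taken from the library.

```python
import re

# Hypothetical pattern: one or more '#' symbols, whitespace, then the body.
HEADER = re.compile(r"^(#+)\s+(.*)$")


def parse_header(line: str) -> tuple[int, str]:
    """Return (level, body) for a header line."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError(f"not a header: {line!r}")
    symbols, body = match.groups()
    return len(symbols), body


print(parse_header("## Setup notes"))  # (2, 'Setup notes')
print(parse_header("#   Title"))       # (1, 'Title')
```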

HAS_PATTERN = True

class note_splitter.tokens.HorizontalRule(line: str = '')

Bases: note_splitter.tokens.Line

A horizontal rule.

content

The content of the line of text.

Type

str

HAS_PATTERN = True

class note_splitter.tokens.Line(line: str = '')

Bases: note_splitter.tokens.Token

The ABC for tokens that take up one line of a file.

property content: str

class note_splitter.tokens.Math(line: str = '')

Bases: note_splitter.tokens.Fenced

A line of math inside a math block.

content

The content of the line of text.

Type

str

class note_splitter.tokens.MathBlock(tokens_: Optional[list[typing.Any]] = None)

Bases: note_splitter.tokens.Block

A multi-line math block.

Inline math blocks are not supported (the opening and closing math fences must be on different lines).

content

The math block’s math fence tokens surrounding math token(s).

Type

list[Union[MathFence, Math]]

class note_splitter.tokens.MathFence(line: str = '')

Bases: note_splitter.tokens.Fence

The delimiter of a multi-line math block.

content

The content of the line of text.

Type

str

HAS_PATTERN = True

class note_splitter.tokens.OrderedListItem(line: str = '')

Bases: note_splitter.tokens.TextListItem, note_splitter.tokens.CanHaveInlineElements

An item in an ordered list.

content

The content of the line of text.

Type

str

level

The number of spaces of indentation.

Type

int

HAS_PATTERN = True

class note_splitter.tokens.Section(tokens_: Optional[list[typing.Any]] = None)

Bases: note_splitter.tokens.Block

A file section starting with a token of the chosen split type.

The Splitter returns a list of Sections. Section tokens never contain section tokens, but may contain tokens of any and all other types.

content

The tokens in this section, starting with a token of the chosen split type.

Type

list[Token]
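The idea that every section starts with a token of the chosen split type can be sketched as a simple grouping pass. The function below is a hypothetical illustration, not the Splitter's actual algorithm: it uses plain strings in place of tokens, and it assumes that anything appearing before the first split token is discarded, which the real Splitter may handle differently.

```python
from typing import Any, Callable


def split_into_sections(
    tokens: list[Any], is_split_token: Callable[[Any], bool]
) -> list[list[Any]]:
    """Group tokens so that each group starts with a split token."""
    sections: list[list[Any]] = []
    for token in tokens:
        if is_split_token(token):
            sections.append([token])  # start a new section
        elif sections:
            sections[-1].append(token)  # extend the current section
        # Assumption: tokens before the first split token are dropped.
    return sections


lines = ["intro", "# A", "text", "# B", "- item"]
sections = split_into_sections(lines, lambda line: line.startswith("# "))
print(sections)  # [['# A', 'text'], ['# B', '- item']]
```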

class note_splitter.tokens.Table(tokens_: Optional[list[typing.Any]] = None)

Bases: note_splitter.tokens.Block

A table.

content

The table’s row token(s) and possibly divider token(s).

Type

list[Union[TableRow, TableDivider]]

class note_splitter.tokens.TableDivider(line: str = '')

Bases: note_splitter.tokens.TablePart

The part of a table that divides the table’s header from its body.

content

The content of the line of text.

Type

str

HAS_PATTERN = True

class note_splitter.tokens.TablePart

Bases: note_splitter.tokens.Line

The ABC for tokens that make up tables.

class note_splitter.tokens.TableRow(line: str = '')

Bases: note_splitter.tokens.TablePart

A row of a table.

content

The content of the line of text.

Type

str

HAS_PATTERN = True

class note_splitter.tokens.Task(line: str = '')

Bases: note_splitter.tokens.TextListItem, note_splitter.tokens.CanHaveInlineElements

A to-do list item that is either checked or unchecked.

content

The content of the line of text.

Type

str

level

The number of spaces of indentation.

Type

int

is_done

Whether the task is done (whether the box is checked).

Type

bool

HAS_PATTERN = True

class note_splitter.tokens.Text(line: str = '')

Bases: note_splitter.tokens.CanHaveInlineElements

Normal text.

This class is the catch-all for individual lines of text that don’t fall into any other category.

content

The content of the line of text.

Type

str

level

The number of spaces of indentation.

Type

int

class note_splitter.tokens.TextList(tokens_: Optional[list[typing.Any]] = None)

Bases: note_splitter.tokens.Block

A list that is numbered, bullet-pointed, and/or checkboxed.

A single text list may have any combination of ordered list items, unordered list items, tasks, and other text lists with more indentation.

content

The tokens that make up the list. Lists may have sublists.

Type

list[Union[TextListItem, "TextList"]]

level

The number of spaces of indentation of the first item in the list.

Type

int

class note_splitter.tokens.TextListItem

Bases: note_splitter.tokens.Line

The ABC for text list item tokens.

class note_splitter.tokens.Token

Bases: abc.ABC

The abstract base class (ABC) for all tokens.

HAS_PATTERN = False

__str__()

Returns the original content of the token’s raw text.

property content: Any

class note_splitter.tokens.UnorderedListItem(line: str = '')

Bases: note_splitter.tokens.TextListItem, note_splitter.tokens.CanHaveInlineElements

An item in a bullet point list.

The bullet points may be asterisks, hyphens, and/or plus signs.

content

The content of the line of text.

Type

str

level

The number of spaces of indentation.

Type

int

HAS_PATTERN = True

note_splitter.tokens.__is_token_type(obj: Any) -> bool

Returns True if obj is a Token type.

Parameters

obj (Any) – The object to test.

note_splitter.tokens._get_indentation_level(line: str) -> int

Counts the spaces at the start of the line.

If there are tabs instead, each tab is counted as 4 spaces. This function assumes tabs and spaces are not mixed.
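The counting rule described above can be sketched as follows. The indentation_level function is a hypothetical re-implementation written for illustration; like the documented function, it assumes tabs and spaces are not mixed in the leading whitespace.

```python
TAB_WIDTH = 4  # each leading tab counts as 4 spaces, as described above


def indentation_level(line: str) -> int:
    """Count leading spaces, treating each leading tab as 4 spaces."""
    stripped = line.lstrip(" \t")
    leading = line[: len(line) - len(stripped)]
    # With unmixed indentation, only one of these two terms is nonzero.
    return leading.count(" ") + TAB_WIDTH * leading.count("\t")


print(indentation_level("    - item"))  # 4
print(indentation_level("\t\t- item"))  # 8
print(indentation_level("text"))        # 0
```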

note_splitter.tokens.get_all_token_types(tokens_module: module) -> list[type[note_splitter.tokens.Token]]

Gets the list of all token types.

Call the function like this: tokens.get_all_token_types(tokens).

Parameters

tokens_module (ModuleType) – The module containing the token types. There is only one correct argument: the tokens module itself. The argument is required only because there doesn’t seem to be any other way to automatically get the list of token types from within the file that defines them.
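A function with this shape could plausibly be built on the standard library's inspect.getmembers, as in the sketch below. This is an assumption about the approach, not the library's source: the stand-in Token classes and the demo module are hypothetical, and whether the real function includes the Token base class itself in its result is not stated here.

```python
import inspect
from abc import ABC
from types import ModuleType


class Token(ABC):
    """Stand-in base class (hypothetical)."""


class Text(Token):
    """Stand-in token type (hypothetical)."""


class Header(Token):
    """Stand-in token type (hypothetical)."""


def get_all_token_types(tokens_module: ModuleType) -> list[type]:
    """Collect every class in the module that derives from Token."""
    return [
        cls
        for _, cls in inspect.getmembers(tokens_module, inspect.isclass)
        if issubclass(cls, Token)
    ]


# Build a throwaway module so the example runs on its own.
demo_module = ModuleType("demo_tokens")
demo_module.Token = Token
demo_module.Text = Text
demo_module.Header = Header
demo_module.helper = len  # not a class, so it is ignored

names = [cls.__name__ for cls in get_all_token_types(demo_module)]
print(names)  # ['Header', 'Text', 'Token']
```

inspect.getmembers returns members sorted by name, which is why the result is in alphabetical order.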