note_splitter.lexer module
For splitting raw text into a list of tokens.
The lexer categorizes lines of text first without looking at their context. For example, a markdown codeblock will become two code fence tokens surrounding one or more tokens of any type, possibly “incorrect” types such as header. Then the lexer makes a quick pass over the token list while looking at each token’s context to ensure they have the correct type.
- class note_splitter.lexer.Lexer
Bases:
object
Creates a Callable that converts raw text to a list of tokens.
- __call__(text: str) list[note_splitter.tokens.Token]
Converts raw text to a list of tokens.
- Parameters
text (str) – The raw text to convert to a list of tokens.