ml_switcheroo.core.mlir.parser

MLIR Recursive Descent Parser.

This module parses text-based MLIR code into the CST object model defined in nodes.py. It is designed to preserve trivia (comments/whitespace) to support high-fidelity round-trip transformations.

Classes

`Token`	Represents a lexical token extracted from the source string.
`Tokenizer`	Lexical analyzer for MLIR syntax.
`MlirParser`	Parses a stream of MLIR tokens into a Concrete Syntax Tree.

Module Contents

class ml_switcheroo.core.mlir.parser.Token

Represents a lexical token extracted from the source string.

kind: str

text: str

line: int

col: int

class ml_switcheroo.core.mlir.parser.Tokenizer(text: str)

Lexical analyzer for MLIR syntax.

Splits the input string into a stream of typed Tokens based on regex patterns.

PATTERN_DEFS

text

tokenize() → Generator[Token, None, None]

Yields tokens from the source text one by one.

Yields: Token – The next lexical token.
Raises: ValueError – If an unrecognized character sequence is encountered.
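
The regex-driven tokenization described above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the names `SketchToken`, `SKETCH_PATTERNS`, and `sketch_tokenize` are hypothetical, and the pattern table is not the library's actual `PATTERN_DEFS`. It only shows the general technique of combining named regex groups into one master pattern, yielding typed tokens with line and column positions, and raising `ValueError` on unmatched input.

```python
import re
from dataclasses import dataclass
from typing import Generator


@dataclass
class SketchToken:
    """Hypothetical stand-in for Token (kind, text, line, col)."""
    kind: str
    text: str
    line: int
    col: int


# Hypothetical pattern table; the real PATTERN_DEFS may look different.
SKETCH_PATTERNS = [
    ("COMMENT", r"//[^\n]*"),
    ("NEWLINE", r"\n"),
    ("WHITESPACE", r"[ \t\r]+"),
    ("STRING", r'"(?:[^"\\]|\\.)*"'),
    ("VAL_ID", r"%[A-Za-z0-9_.$]+"),
    ("SYMBOL", r"->|[=(){}:,<>\[\]]"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_.$]*"),
    ("NUMBER", r"\d+"),
]

# One alternation of named groups; the matching group's name is the token kind.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in SKETCH_PATTERNS)
)


def sketch_tokenize(text: str) -> Generator[SketchToken, None, None]:
    """Yield tokens one by one, tracking 1-based line and column numbers."""
    pos, line, col = 0, 1, 1
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(
                f"Unrecognized character {text[pos]!r} at {line}:{col}"
            )
        kind, tok_text = m.lastgroup, m.group()
        yield SketchToken(kind, tok_text, line, col)
        if kind == "NEWLINE":
            line, col = line + 1, 1
        else:
            col += len(tok_text)
        pos = m.end()
```

Because whitespace and comments are emitted as tokens rather than skipped, concatenating the `text` of every token reproduces the input exactly, which is the property a trivia-preserving parser relies on.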

class ml_switcheroo.core.mlir.parser.MlirParser(text: str)

Parses a stream of MLIR tokens into a Concrete Syntax Tree.

Implements recursive descent logic to handle Modules, Blocks, Operations, and Regions while preserving whitespace and comments (trivia) for accurate reproduction.

tokenizer

tokens

pos = 0

trivia_buffer: List[ml_switcheroo.core.mlir.nodes.TriviaNode] = []
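
One plausible way to use a trivia buffer is to collect comments and whitespace as they are encountered and attach them to the next significant node, so that re-emitting the tree reproduces the source exactly. The sketch below is an assumption about the general technique, not the library's implementation; `SketchTrivia`, `SketchNode`, and `attach_trivia` are hypothetical names, and the "nodes" here are just whitespace-delimited chunks rather than real CST nodes.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchTrivia:
    kind: str  # "comment", "whitespace", or "newline"
    text: str


@dataclass
class SketchNode:
    text: str
    leading_trivia: List[SketchTrivia] = field(default_factory=list)

    def to_text(self) -> str:
        # Emit buffered trivia first, then the node's own text.
        return "".join(t.text for t in self.leading_trivia) + self.text


# Every character of the input matches exactly one alternative.
_PIECE = re.compile(
    r"(?P<comment>//[^\n]*)|(?P<newline>\n)"
    r"|(?P<whitespace>[^\S\n]+)|(?P<code>[^\s/]+|/)"
)


def attach_trivia(source: str) -> List[SketchNode]:
    nodes: List[SketchNode] = []
    trivia_buffer: List[SketchTrivia] = []
    for m in _PIECE.finditer(source):
        if m.lastgroup == "code":
            # Flush the buffer onto the next significant node.
            nodes.append(SketchNode(m.group(), trivia_buffer))
            trivia_buffer = []
        else:
            trivia_buffer.append(SketchTrivia(m.lastgroup, m.group()))
    # Trailing trivia hangs off an empty end-of-file node (an assumption).
    nodes.append(SketchNode("", trivia_buffer))
    return nodes
```

Joining `to_text()` over all nodes returns the original string, including comments and blank lines, which is the round-trip guarantee that trivia preservation is meant to provide.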

peek(offset: int = 0) → Token

Look ahead at a token without consuming it.

Parameters: offset (int) – Number of tokens to look ahead. Defaults to 0 (current).
Returns: The token at the lookahead position.
Return type: Token

consume() → Token

Consumes and returns the current token, advancing the pointer.

Returns: The consumed token.
Return type: Token

match(kind: str) → bool

Checks if the current token matches the specified kind or text.

Parameters: kind (str) – The token kind (e.g. TokenKind.VAL_ID) or specific symbol text (e.g. '{').
Returns: True if the current token matches.
Return type: bool

expect(kind: str) → Token

Consumes the current token if it matches kind; otherwise raises SyntaxError.

Parameters: kind (str) – The expected token kind or text.
Returns: The consumed token.
Return type: Token
Raises: SyntaxError – If the current token does not match the expectation.
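
The four cursor helpers above (peek, consume, match, expect) follow a common recursive descent pattern. The sketch below shows how they typically fit together; it is an illustration only. `Tok` and `SketchCursor` are hypothetical names, and the `EOF` sentinel appended to the token list is an assumption, since the documentation does not say how the real parser handles the end of input.

```python
from typing import Iterable, NamedTuple


class Tok(NamedTuple):
    kind: str
    text: str


class SketchCursor:
    """Hypothetical token cursor mirroring peek/consume/match/expect."""

    def __init__(self, tokens: Iterable[Tok]) -> None:
        # Assumption: an EOF sentinel keeps lookahead in bounds.
        self.tokens = list(tokens) + [Tok("EOF", "")]
        self.pos = 0

    def peek(self, offset: int = 0) -> Tok:
        # Clamp to the sentinel instead of raising IndexError.
        idx = min(self.pos + offset, len(self.tokens) - 1)
        return self.tokens[idx]

    def consume(self) -> Tok:
        tok = self.peek()
        if tok.kind != "EOF":
            self.pos += 1
        return tok

    def match(self, kind: str) -> bool:
        # Accept either a token kind ("VAL_ID") or literal text ("{").
        tok = self.peek()
        return tok.kind == kind or tok.text == kind

    def expect(self, kind: str) -> Tok:
        if not self.match(kind):
            tok = self.peek()
            raise SyntaxError(
                f"Expected {kind!r}, got {tok.kind} {tok.text!r}"
            )
        return self.consume()
```

Letting `match` accept either a kind or literal text keeps call sites short: a grammar rule can write `expect("{")` without first checking that the token is a symbol.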

parse() → ml_switcheroo.core.mlir.nodes.ModuleNode

Top-level parsing entry point.

Returns: The root of the MLIR CST.
Return type: ModuleNode

parse_block(is_top_level: bool = False) → ml_switcheroo.core.mlir.nodes.BlockNode

Parses a Basic Block.

A block consists of an optional label (with arguments) and a list of operations.

Parameters: is_top_level (bool) – If True, treats the input as an implicit top-level module block, which may lack a label or braces.
Returns: The parsed block structure.
Return type: BlockNode
Raises: SyntaxError – If invalid tokens are encountered where an operation was expected.

parse_operation() → ml_switcheroo.core.mlir.nodes.OperationNode | None

Parses a single MLIR Operation.

Structure: %results = "op.name"(%operands) {attributes} ({regions}) : type

Returns: The parsed operation, or None if no valid operation start is found.
Return type: Optional[OperationNode]
Raises: SyntaxError – If structural expectations (e.g. closing parens) are unmet.

parse_region() → ml_switcheroo.core.mlir.nodes.RegionNode

Parses a Region containing nested Blocks.

Enclosed in curly braces { … }.

Returns: The parsed region.
Return type: RegionNode
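
The recursive handling of brace-delimited regions can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the library's code: `parse_region_sketch` is a hypothetical function, tokens are plain strings, and nested regions appear directly inside the braces rather than being attached to operations and blocks as in a real MLIR CST.

```python
from typing import Any, List, Tuple


def parse_region_sketch(tokens: List[str], pos: int = 0) -> Tuple[List[Any], int]:
    """Parse '{ ... }' starting at tokens[pos].

    Returns (body, next_pos), where body is a list whose items are either
    token strings or nested lists for nested regions.
    """
    if pos >= len(tokens) or tokens[pos] != "{":
        raise SyntaxError("Expected '{' at start of region")
    pos += 1
    body: List[Any] = []
    while pos < len(tokens) and tokens[pos] != "}":
        if tokens[pos] == "{":
            # Recursive descent: a nested region is parsed by the same rule.
            nested, pos = parse_region_sketch(tokens, pos)
            body.append(nested)
        else:
            body.append(tokens[pos])
            pos += 1
    if pos >= len(tokens):
        raise SyntaxError("Unterminated region: missing '}'")
    return body, pos + 1  # skip the closing brace
```

Returning the next position alongside the parsed body is one way to thread parser state through recursive calls; a class-based parser would instead keep the position in an attribute such as `pos`.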