Change Shape parsing from regexp matcher to parser.
Previously, in the HLO parser/lexer, shapes were tokens identified using a complicated regular expression. This made it difficult to augment the textual form of shapes, as would be necessary for dynamic shapes or tiling. To avoid ambiguity and other problems, a couple of changes were made to the HLO textual form, along with some related cleanup:
(1) Do not redundantly print the shape inside of the constant HLO instruction's "operand" field. Previously, constant instructions were printed like:
S32[2,2] constant(S32[2,2] {{1,2},{3,4}})
Now this is printed as:
S32[2,2] constant({{1,2},{3,4}})
This avoids an ambiguity where the values of the literal can be misinterpreted as a layout. Also, the shape was printed inconsistently: only when the rank was greater than one.
(2) Remove ShapeUtil::ParseShapeString and replace it with a ParseShape function in the HLO parser.
(3) Merge hlo_token.h into hlo_lexer.h. It is only used by the lexer and parser, both of which include hlo_lexer.h, and the merge avoids potential confusion with the token HLO type.
(4) Fix b/112302613 by removing the unused Shape field in the sharding attribute of HLO text.
(5) As part of this change, primitive element types are now keywords, which simplifies parsing. The fallout is that a number of values in HLO text named "token" had to be renamed. Also, change the HLO name sanitizer to avoid these primitive type keywords.
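For illustration only, the idea behind the change can be sketched in Python: a lexer that treats element type names as keywords, a small hand-written shape parser in place of one large regular expression, and a name sanitizer that steers clear of those keywords. This is not the XLA implementation. The type list is an assumed subset, the function names (lex, parse_shape, sanitize_name) are hypothetical, and appending "_" to a colliding name is an assumed policy.

```python
import re

# Assumed subset of primitive element type names, for illustration only.
PRIMITIVE_TYPES = {"pred", "s8", "s16", "s32", "s64", "u8", "u16", "u32",
                   "u64", "f16", "bf16", "f32", "f64", "c64", "token"}

TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_][A-Za-z0-9_]*)|(.))")


def lex(text):
    """Split text into (kind, value) tokens; element types are keywords."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        pos = m.end()
        if m.group(1) is not None:
            tokens.append(("int", int(m.group(1))))
        elif m.group(2) is not None:
            word = m.group(2)
            if word.lower() in PRIMITIVE_TYPES:
                tokens.append(("type", word.lower()))
            else:
                tokens.append(("ident", word))
        else:
            tokens.append(("punct", m.group(3)))
    tokens.append(("eof", None))  # sentinel so the parser never runs off the end
    return tokens


def parse_int_list(tokens, i, open_ch, close_ch):
    """Parse a bracketed, comma-separated integer list starting at index i."""
    if tokens[i] != ("punct", open_ch):
        raise ValueError(f"expected {open_ch!r}")
    i += 1
    values = []
    while tokens[i] != ("punct", close_ch):
        kind, value = tokens[i]
        if kind != "int":
            raise ValueError(f"expected integer, got {tokens[i]!r}")
        values.append(value)
        i += 1
        if tokens[i] == ("punct", ","):
            i += 1
    return values, i + 1


def parse_shape(text):
    """Parse 'type[dims]' with an optional '{layout}' suffix."""
    tokens = lex(text)
    kind, element_type = tokens[0]
    if kind != "type":
        raise ValueError("shape must start with an element type keyword")
    dims, i = parse_int_list(tokens, 1, "[", "]")
    layout = None
    if tokens[i] == ("punct", "{"):
        layout, i = parse_int_list(tokens, i, "{", "}")
    if tokens[i][0] != "eof":
        raise ValueError(f"unexpected trailing token {tokens[i]!r}")
    return element_type, dims, layout


def sanitize_name(name):
    """Replace unusual characters and avoid colliding with a type keyword."""
    cleaned = re.sub(r"[^A-Za-z0-9_.\-]", "_", name)
    if cleaned.lower() in PRIMITIVE_TYPES:
        cleaned += "_"  # assumed policy for this sketch
    return cleaned
```

In this sketch, a brace group directly after the dimensions is always read as a layout, which is the ambiguity item (1) removes for constants. Because lex classifies "token" as a type keyword, a value with that name has to be renamed, as described in item (5).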
PiperOrigin-RevId: 225546437