Commit f076da6

[AsmParser][SystemZ][z/OS] Introducing HLASM Parser support to AsmParser - Part 1
- This patch is one in a series of patches that introduce HLASM Parser support (for the first parameter of inline asm statements) to LLVM ([[ https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html | main RFC here ]])
- This patch in particular introduces HLASM Parser support for Z machine instructions.
- The approach taken here was to subclass `AsmParser`, and make various functions and variables "protected" wherever appropriate.
- The `HLASMAsmParser` class overrides the `parseStatement` function. Two new private functions `parseAsHLASMLabel` and `parseAsMachineInstruction` are introduced as well.

The general syntax is laid out as follows (more information available in [[ https://www.ibm.com/support/knowledgecenter/SSENW6_1.6.0/com.ibm.hlasm.v1r6.asm/asmr1023.pdf | HLASM V1R6 Language Reference Manual ]] - Chapter 2 - Instruction Statement Format):

```
<TokA><spaces.*><TokB><spaces.*><TokC><spaces.*><TokD>
```

1. TokA is referred to as the Name Entry. This token is optional.
2. TokB is referred to as the Operation Entry. This token is mandatory.
3. TokC is referred to as the Operand Entry. This token is mandatory.
4. TokD is referred to as the Remarks Entry. This token is optional.

- If TokA is provided, then we parse TokA either as a possible comment or as a label (Name Entry), TokB as the Operation Entry, and so on.
- If TokA is not provided (i.e. we have one or more spaces and then the first token), then we parse the first token (i.e. TokB) as a possible Z machine instruction, TokC as the operands to the Z machine instruction, and TokD as a possible Remarks field.
- Within TokC (Operand Entry), no spaces are allowed between operand entries. If a space occurs, it is classified as an error.
- TokD, if provided, is taken as is and emitted as a comment.

The following additional approach was examined, but not taken:

- Adding custom private-only functions to the base AsmParser class, and only invoking them for z/OS. While this would eliminate the need for another child class, these private functions would be of no use to every other target. Similarly, adding any pure virtual functions to the base MCAsmParser class and overriding them in AsmParser would have the same disadvantage.

Testing:

- This patch doesn't have tests added with it, for the sole reason that MCStreamer support and object file support haven't been added for the z/OS target (yet). Hence, it's not possible to generate code outright for the z/OS target. Those pieces are in the process of being committed or worked on.
- Any comments / feedback on how to combat this "lack of testing" due to other missing required features is appreciated.

Reviewed By: Kai, uweigand

Differential Revision: https://reviews.llvm.org/D98276
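The statement layout described in the commit message can be illustrated with a small standalone sketch. It does not use LLVM and is not how the patch lexes input: `HLASMFields`, `nextField`, and `splitStatement` are hypothetical names, and the sketch assumes fields are separated by plain spaces only (no quoted strings, continuation lines, or comment statements).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical container for the four entries; not an LLVM type.
struct HLASMFields {
  std::string Name;      // TokA, optional
  std::string Operation; // TokB, mandatory
  std::string Operand;   // TokC
  std::string Remarks;   // TokD, optional
};

// Returns the run of non-space characters starting at Pos, then advances
// Pos past that run and past any spaces that follow it.
static std::string_view nextField(std::string_view Line, std::size_t &Pos) {
  assert(Pos <= Line.size() && "position must stay inside the line");
  std::size_t Start = Pos;
  while (Pos < Line.size() && Line[Pos] != ' ')
    ++Pos;
  std::string_view Field = Line.substr(Start, Pos - Start);
  while (Pos < Line.size() && Line[Pos] == ' ')
    ++Pos;
  return Field;
}

// Splits one statement into its entries. A statement that begins with a
// non-space character is assumed to carry a Name Entry; otherwise the
// first token is the Operation Entry.
HLASMFields splitStatement(std::string_view Line) {
  HLASMFields F;
  std::size_t Pos = 0;
  if (!Line.empty() && Line[0] != ' ') {
    F.Name = std::string(nextField(Line, Pos));
  } else {
    while (Pos < Line.size() && Line[Pos] == ' ')
      ++Pos;
  }
  F.Operation = std::string(nextField(Line, Pos));
  F.Operand = std::string(nextField(Line, Pos));
  F.Remarks = std::string(Line.substr(Pos)); // taken as is
  return F;
}
```

With this sketch, `splitStatement("LOOP LR 1,2 copy r2")` yields a Name Entry of `LOOP`, while `splitStatement(" LR 1,2")` leaves the Name Entry empty and treats `LR` as the Operation Entry.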
1 parent 6c83d4a commit f076da6

5 files changed, with 162 additions and 7 deletions.
llvm/include/llvm/MC/MCAsmInfo.h

Lines changed: 7 additions & 0 deletions
@@ -146,6 +146,10 @@ class MCAsmInfo {
   /// Default is true.
   bool AllowAdditionalComments = true;

+  /// Should we emit the '\t' as the starting indentation marker for GNU inline
+  /// asm statements. Defaults to true.
+  bool EmitGNUAsmStartIndentationMarker = true;
+
   /// This is appended to emitted labels. Defaults to ":"
   const char *LabelSuffix;

@@ -624,6 +628,9 @@ class MCAsmInfo {
     return RestrictCommentStringToStartOfStatement;
   }
   bool shouldAllowAdditionalComments() const { return AllowAdditionalComments; }
+  bool getEmitGNUAsmStartIndentationMarker() const {
+    return EmitGNUAsmStartIndentationMarker;
+  }
   const char *getLabelSuffix() const { return LabelSuffix; }

   bool useAssignmentForEHBegin() const { return UseAssignmentForEHBegin; }

llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp

Lines changed: 5 additions & 5 deletions
@@ -270,14 +270,16 @@ static void EmitMSInlineAsmStr(const char *AsmStr, const MachineInstr *MI,
 }

 static void EmitGCCInlineAsmStr(const char *AsmStr, const MachineInstr *MI,
-                                MachineModuleInfo *MMI, int AsmPrinterVariant,
+                                MachineModuleInfo *MMI, const MCAsmInfo *MAI,
                                 AsmPrinter *AP, unsigned LocCookie,
                                 raw_ostream &OS) {
   int CurVariant = -1; // The number of the {.|.|.} region we are in.
   const char *LastEmitted = AsmStr; // One past the last character emitted.
   unsigned NumOperands = MI->getNumOperands();
+  int AsmPrinterVariant = MAI->getAssemblerDialect();

-  OS << '\t';
+  if (MAI->getEmitGNUAsmStartIndentationMarker())
+    OS << '\t';

   while (*LastEmitted) {
     switch (*LastEmitted) {
@@ -499,11 +501,9 @@ void AsmPrinter::emitInlineAsm(const MachineInstr *MI) const {
   SmallString<256> StringData;
   raw_svector_ostream OS(StringData);

-  // The variant of the current asmprinter.
-  int AsmPrinterVariant = MAI->getAssemblerDialect();
   AsmPrinter *AP = const_cast<AsmPrinter*>(this);
   if (MI->getInlineAsmDialect() == InlineAsm::AD_ATT)
-    EmitGCCInlineAsmStr(AsmStr, MI, MMI, AsmPrinterVariant, AP, LocCookie, OS);
+    EmitGCCInlineAsmStr(AsmStr, MI, MMI, MAI, AP, LocCookie, OS);
   else
     EmitMSInlineAsmStr(AsmStr, MI, MMI, AP, LocCookie, OS);

llvm/lib/MC/MCParser/AsmParser.cpp

Lines changed: 121 additions & 2 deletions
@@ -186,6 +186,17 @@ class AsmParser : public MCAsmParser {
   // Is alt macro mode enabled.
   bool AltMacroMode = false;

+protected:
+  virtual bool parseStatement(ParseStatementInfo &Info,
+                              MCAsmParserSemaCallback *SI);
+
+  /// This routine uses the target specific ParseInstruction function to
+  /// parse an instruction into Operands, and then call the target specific
+  /// MatchAndEmit function to match and emit the instruction.
+  bool parseAndMatchAndEmitTargetInstruction(ParseStatementInfo &Info,
+                                             StringRef IDVal, AsmToken ID,
+                                             SMLoc IDLoc);
+
 public:
   AsmParser(SourceMgr &SM, MCContext &Ctx, MCStreamer &Out,
             const MCAsmInfo &MAI, unsigned CB);
@@ -273,8 +284,6 @@ class AsmParser : public MCAsmParser {
   /// }

 private:
-  bool parseStatement(ParseStatementInfo &Info,
-                      MCAsmParserSemaCallback *SI);
   bool parseCurlyBlockScope(SmallVectorImpl<AsmRewrite>& AsmStrRewrites);
   bool parseCppHashLineFilenameComment(SMLoc L, bool SaveLocInfo = true);

@@ -701,6 +710,36 @@ class AsmParser : public MCAsmParser {
   void initializeCVDefRangeTypeMap();
 };

+class HLASMAsmParser final : public AsmParser {
+private:
+  MCAsmLexer &Lexer;
+  MCStreamer &Out;
+
+  void lexLeadingSpaces() {
+    while (Lexer.is(AsmToken::Space))
+      Lexer.Lex();
+  }
+
+  bool parseAsHLASMLabel(ParseStatementInfo &Info, MCAsmParserSemaCallback *SI);
+  bool parseAsMachineInstruction(ParseStatementInfo &Info,
+                                 MCAsmParserSemaCallback *SI);
+
+public:
+  HLASMAsmParser(SourceMgr &SM, MCContext &Ctx, MCStreamer &Out,
+                 const MCAsmInfo &MAI, unsigned CB = 0)
+      : AsmParser(SM, Ctx, Out, MAI, CB), Lexer(getLexer()), Out(Out) {
+    Lexer.setSkipSpace(false);
+    Lexer.setAllowHashInIdentifier(true);
+    Lexer.setLexHLASMIntegers(true);
+    Lexer.setLexHLASMStrings(true);
+  }
+
+  ~HLASMAsmParser() { Lexer.setSkipSpace(true); }
+
+  bool parseStatement(ParseStatementInfo &Info,
+                      MCAsmParserSemaCallback *SI) override;
+};
+
 } // end anonymous namespace

 namespace llvm {
@@ -2257,6 +2296,13 @@ bool AsmParser::parseStatement(ParseStatementInfo &Info,
   if (checkForValidSection())
     return true;

+  return parseAndMatchAndEmitTargetInstruction(Info, IDVal, ID, IDLoc);
+}
+
+bool AsmParser::parseAndMatchAndEmitTargetInstruction(ParseStatementInfo &Info,
+                                                      StringRef IDVal,
+                                                      AsmToken ID,
+                                                      SMLoc IDLoc) {
   // Canonicalize the opcode to lower case.
   std::string OpcodeStr = IDVal.lower();
   ParseInstructionInfo IInfo(Info.AsmRewrites);
@@ -6124,6 +6170,76 @@ bool AsmParser::parseMSInlineAsm(
   return false;
 }

+bool HLASMAsmParser::parseAsMachineInstruction(ParseStatementInfo &Info,
+                                               MCAsmParserSemaCallback *SI) {
+  AsmToken OperationEntryTok = Lexer.getTok();
+  SMLoc OperationEntryLoc = OperationEntryTok.getLoc();
+  StringRef OperationEntryVal;
+
+  // If we see a new line or carriage return, emit the new line
+  // and lex it.
+  if (OperationEntryTok.is(AsmToken::EndOfStatement)) {
+    if (getTok().getString().front() == '\n' ||
+        getTok().getString().front() == '\r') {
+      Out.AddBlankLine();
+      Lex();
+      return false;
+    }
+  }
+
+  // Attempt to parse the first token as an Identifier
+  if (parseIdentifier(OperationEntryVal))
+    return Error(OperationEntryLoc, "unexpected token at start of statement");
+
+  // Once we've parsed the operation entry successfully, lex
+  // any spaces to get to the OperandEntries.
+  lexLeadingSpaces();
+
+  return parseAndMatchAndEmitTargetInstruction(
+      Info, OperationEntryVal, OperationEntryTok, OperationEntryLoc);
+}
+
+bool HLASMAsmParser::parseStatement(ParseStatementInfo &Info,
+                                    MCAsmParserSemaCallback *SI) {
+  assert(!hasPendingError() && "parseStatement started with pending error");
+
+  // Should the first token be interpreted as a machine instruction.
+  bool ShouldParseAsMachineInstruction = false;
+
+  // If a Name Entry exists, it should occur at the very
+  // start of the string. In this case, we should parse the
+  // first non-space token as a Label.
+  // If the Name entry is missing (i.e. there's some other
+  // token), then we attempt to parse the first non-space
+  // token as a Machine Instruction.
+  if (getTok().is(AsmToken::Space))
+    ShouldParseAsMachineInstruction = true;
+
+  // If we have an EndOfStatement (which includes the target's comment
+  // string) we can appropriately lex it early on)
+  if (Lexer.is(AsmToken::EndOfStatement)) {
+    // if this is a line comment we can drop it safely
+    if (getTok().getString().empty() || getTok().getString().front() == '\r' ||
+        getTok().getString().front() == '\n')
+      Out.AddBlankLine();
+    Lex();
+    return false;
+  }
+
+  // We have established how to parse the inline asm statement.
+  // Now we can safely lex any leading spaces to get to the
+  // first token.
+  lexLeadingSpaces();
+
+  if (ShouldParseAsMachineInstruction)
+    return parseAsMachineInstruction(Info, SI);
+
+  // Label parsing support isn't implemented completely (yet).
+  SMLoc Loc = getTok().getLoc();
+  eatToEndOfStatement();
+  return Error(Loc, "HLASM Label parsing support not yet implemented");
+}
+
 namespace llvm {
 namespace MCParserUtils {

@@ -6211,5 +6327,8 @@ bool parseAssignmentExpression(StringRef Name, bool allow_redef,
 MCAsmParser *llvm::createMCAsmParser(SourceMgr &SM, MCContext &C,
                                      MCStreamer &Out, const MCAsmInfo &MAI,
                                      unsigned CB) {
+  if (C.getTargetTriple().isOSzOS())
+    return new HLASMAsmParser(SM, C, Out, MAI, CB);
+
   return new AsmParser(SM, C, Out, MAI, CB);
 }
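
The shape of the change to this file (a virtual `parseStatement` made overridable, a `final` subclass that overrides it, and a factory that picks the subclass for z/OS) can be modelled without LLVM. The sketch below is only an illustration under stated assumptions: `BaseParser`, `HLASMParserModel`, and `createParser(bool)` are hypothetical names, strings stand in for token streams and diagnostics, and the boolean parameter stands in for the triple check done by the real factory.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the generic parser. The statement hook is
// protected and virtual so that a subclass can replace it.
class BaseParser {
public:
  virtual ~BaseParser() = default;

  // Public driver that funnels every statement through the virtual hook.
  std::string run(const std::string &Stmt) { return parseStatement(Stmt); }

protected:
  virtual std::string parseStatement(const std::string &Stmt) {
    return "gnu:" + Stmt;
  }
};

// Hypothetical stand-in for the HLASM subclass. It models the decision
// made in the overridden hook: an empty statement is a blank line, a
// leading space means "no Name Entry, parse a machine instruction", and
// anything else is a label, which is reported as unsupported.
class HLASMParserModel final : public BaseParser {
protected:
  std::string parseStatement(const std::string &Stmt) override {
    if (Stmt.empty())
      return "blank";
    if (Stmt.front() == ' ')
      return "instruction";
    return "error: label parsing not implemented";
  }
};

// Hypothetical factory; IsZOS models the target-triple query.
std::unique_ptr<BaseParser> createParser(bool IsZOS) {
  if (IsZOS)
    return std::make_unique<HLASMParserModel>();
  return std::make_unique<BaseParser>();
}
```

Callers only ever see the base type, so other targets are unaffected by the subclass, which matches the rationale given in the commit message for not adding z/OS-only private functions to the base class.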

llvm/lib/Target/SystemZ/AsmParser/SystemZAsmParser.cpp

Lines changed: 28 additions & 0 deletions
@@ -1381,10 +1381,38 @@ bool SystemZAsmParser::ParseInstruction(ParseInstructionInfo &Info,
   // Read any subsequent operands.
   while (getLexer().is(AsmToken::Comma)) {
     Parser.Lex();
+
+    if (isParsingHLASM() && getLexer().is(AsmToken::Space))
+      return Error(
+          Parser.getTok().getLoc(),
+          "No space allowed between comma that separates operand entries");
+
     if (parseOperand(Operands, Name)) {
       return true;
     }
   }
+
+  // Under the HLASM variant, we could have the remark field
+  // The remark field occurs after the operation entries
+  // There is a space that separates the operation entries and the
+  // remark field.
+  if (isParsingHLASM() && getTok().is(AsmToken::Space)) {
+    // We've confirmed that there is a Remark field.
+    StringRef Remark(getLexer().LexUntilEndOfStatement());
+    Parser.Lex();
+
+    // If there is nothing after the space, then there is nothing to emit
+    // We could have a situation as this:
+    // " \n"
+    // After lexing above, we will have
+    // "\n"
+    // This isn't an explicit remark field, so we don't have to output
+    // this as a comment.
+    if (Remark.size())
+      // Output the entire Remarks Field as a comment
+      getStreamer().AddComment(Remark);
+  }
+
   if (getLexer().isNot(AsmToken::EndOfStatement)) {
     SMLoc Loc = getLexer().getLoc();
     return Error(Loc, "unexpected token in argument list");
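
The two HLASM rules added in this hunk (a space directly after a comma is an error; the first space after the operands starts the Remarks field) can be sketched on a plain string. This is an assumption-laden model, not the SystemZ parser: `OperandParse` and `parseOperandField` are hypothetical names, operands are not validated, and the input is assumed to be the text that follows the Operation Entry on a single line.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical result type: the comma-separated operands plus the remark.
struct OperandParse {
  std::vector<std::string> Operands;
  std::string Remark;
};

// Returns std::nullopt when a space directly follows a comma, mirroring
// the "no space allowed" diagnostic. Otherwise everything after the first
// run of spaces is returned unchanged as the remark.
std::optional<OperandParse> parseOperandField(std::string_view Text) {
  OperandParse Result;
  std::size_t Pos = 0;
  while (true) {
    // Collect one operand: characters up to the next comma or space.
    std::size_t Start = Pos;
    while (Pos < Text.size() && Text[Pos] != ',' && Text[Pos] != ' ')
      ++Pos;
    Result.Operands.push_back(std::string(Text.substr(Start, Pos - Start)));

    // End of input or a space ends the operand list.
    if (Pos == Text.size() || Text[Pos] == ' ')
      break;

    ++Pos; // consume the comma
    if (Pos < Text.size() && Text[Pos] == ' ')
      return std::nullopt; // space between operand entries
  }

  // Skip the separating spaces; the remainder is the remark, taken as is.
  while (Pos < Text.size() && Text[Pos] == ' ')
    ++Pos;
  Result.Remark = std::string(Text.substr(Pos));
  return Result;
}
```

For example, `parseOperandField("1,2 copy r2")` produces the operands `1` and `2` with the remark `copy r2`, whereas `parseOperandField("1, 2")` is rejected. An input with only trailing spaces, such as `"1,2 "`, gives an empty remark, which corresponds to the case where the patch emits no comment.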

llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmInfo.cpp

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ SystemZMCAsmInfo::SystemZMCAsmInfo(const Triple &TT) {
   AllowHashAtStartOfIdentifier = (AssemblerDialect == AD_HLASM);
   DotIsPC = (AssemblerDialect == AD_ATT);
   StarIsPC = (AssemblerDialect == AD_HLASM);
+  EmitGNUAsmStartIndentationMarker = (AssemblerDialect == AD_ATT);

   ZeroDirective = "\t.space\t";
   Data64bitsDirective = "\t.quad\t";
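
How the new flag ties the dialect to the leading tab can be shown with a minimal model. `AsmInfoModel` and `emitInlineAsm` are hypothetical stand-ins for `MCAsmInfo` and the inline asm printer, and the enumerator values are assumed; only the names `AD_ATT`, `AD_HLASM`, and `EmitGNUAsmStartIndentationMarker` come from the diff.

```cpp
#include <sstream>
#include <string>

// Dialect names as used in the diff; the numeric values are assumed.
enum Dialect { AD_ATT = 0, AD_HLASM = 1 };

// Hypothetical stand-in for MCAsmInfo: the flag defaults to true and is
// only kept on for the AT&T dialect, as in the SystemZ constructor.
struct AsmInfoModel {
  bool EmitGNUAsmStartIndentationMarker = true;

  explicit AsmInfoModel(Dialect D)
      : EmitGNUAsmStartIndentationMarker(D == AD_ATT) {}
};

// Hypothetical stand-in for the printer: prefixes the inline asm string
// with a tab only when the flag is set, then emits the string unchanged.
std::string emitInlineAsm(const AsmInfoModel &MAI, const std::string &AsmStr) {
  std::ostringstream OS;
  if (MAI.EmitGNUAsmStartIndentationMarker)
    OS << '\t';
  OS << AsmStr;
  return OS.str();
}
```

A plausible motivation, which the commit message does not state, is that the HLASM parser decides between a Name Entry and an Operation Entry by looking at whether the statement starts with whitespace, so an unconditional leading tab would change how the first token is classified.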
