What is the regular expression in LEX to match a string NOT starting with newline
I want to know the regular expression in lex that matches a string which does not (start the line, followed by optional whitespace, followed by "a="). I am trying to parse a language with the following types of lines:
a=some value
b=some value
The strings "a=" (b=, etc.) can be preceded by whitespace and are followed, with no whitespace after the =, by another string that runs up to the newline. For example:
a=123 abcde
Here "123 abcde" is the value.
The problem is that I may encounter, at least in theory, the following:
a=123 a=
Or worse:
a=a=
Where the first a= is the key and the second a= is now part of the value and not the key attribute. How do I distinguish the first a= token from the second?
I can match the key "a=" with the following which handles leading whitespace:
^[ \r\t]*"a="
But how do I match the second string? I need a regular expression that matches a string that does NOT (start the line and is followed by optional whitespace followed by a=) and that extends up to the newline character. The main trick is to keep the expression from also matching the attribute a=.
regex lex
@wp78de Thanks, but unfortunately it does not even compile in lex. For starters, the ^ at the start means beginning of line, and that is precisely what I do not want. Lex does not interpret ^ the way most regular-expression dialects do: it negates only inside a character class (for example, [^a-z]); anywhere else it is interpreted as beginning of line (just as $ signifies end of line). There are other syntactic issues as well.
– RTC
Dec 28 '18 at 20:27
Arguably, flex is not the ideal tool for this problem, but I don't know exactly what you are trying to do.
– rici
Dec 28 '18 at 21:22
asked Dec 28 '18 at 19:38
RTC
1 Answer
Use a start condition to create a different lexical context for the input after the =.
Lex works best with a language in which tokenisation is not context-dependent (most programming languages but few ad hoc interchange formats). But start conditions are manageable if you don't have too many contexts to juggle.
See the manual for details and examples.
Simple example:
%x RHS
%%
[[:space:]]+   ;  /* Ignore leading white space and blank lines */
a=             { BEGIN(RHS); return TOKEN_A; }
b=             { BEGIN(RHS); return TOKEN_B; }
.              ;  /* Skip any other character (.* would out-match a= and b=, since the longest match wins). Should do something else */
<RHS>.+        { yylval = strdup(yytext); return VALUE; }  /* Assumes YYSTYPE is char *; the caller frees the copy */
<RHS>\n        { BEGIN(INITIAL); }
Note: The RHS rules return no VALUE token when the value is empty (the newline follows the = directly). That shouldn't be a problem for a parser, but if it is, you can fix it reasonably easily.
Support for start conditions opens a lot of possibilities that otherwise would be very difficult or impossible to implement. Thanks for showing the way...
– RTC
Jan 1 at 4:17
edited Dec 28 '18 at 22:03
answered Dec 28 '18 at 21:19
rici