What is the regular expression in LEX to match a string NOT starting with newline












I want to know the regular expression in lex to match a string which does not (start the line and is followed by optional white spaces followed by "a="). I am trying to parse a language with the following types of lines:



     a=some value
b=some value


The strings "a=" (b=, etc.) can be preceded by white space and are followed, with no white space after the =, by another string that extends up to the newline. For example:



     a=123 abcde


Here "123 abcde" is the value.
The problem is that I may encounter, at least in theory, the following:



     a=123 a= 


Or worse:



     a=a=


Where the first a= is the key and the second a= is now part of the value and not the key attribute. How do I distinguish the first a= token from the second?



I can match the key "a=" with the following which handles leading whitespace:



    ^[ \r\t]*"a="


But how do I match the second string? I need a regular expression that matches a string which does NOT (start the line, followed by optional whitespace, followed by a=) and which extends up to the newline character. The main trick is to keep the expression from also matching the attribute a=.










  • @wp78de Thanks but unfortunately it does not even compile in lex. For starters, the ^ at the start means starting the line and that is precisely what I do not want. The ^ has a different interpretation in most regular expressions. In Lex it negates only in a class of characters (for example, [^a-z]) otherwise interpreted as beginning of line (just as $ signifies end of line). There are other syntactical issues.

    – RTC
    Dec 28 '18 at 20:27













  • Arguably, flex is not the ideal tool for this problem, but I don't know exactly what you are trying to do.

    – rici
    Dec 28 '18 at 21:22
















regex lex






asked Dec 28 '18 at 19:38









RTC













1 Answer

Use a start condition to create a different lexical context for the input after the =.



Lex works best with a language in which tokenisation is not context-dependent (most programming languages but few ad hoc interchange formats). But start conditions are manageable if you don't have too many contexts to juggle.



See the section on start conditions in the flex manual for details and examples.



Simple example:



    %x RHS
    %%
    [[:space:]]+   ;   /* Ignore leading white space and blank lines */
    a=             { BEGIN(RHS); return TOKEN_A; }
    b=             { BEGIN(RHS); return TOKEN_B; }
    .              ;   /* Ignore other input, one character at a time so this rule never out-matches a key. Should do something else */
    <RHS>.+        { yylval = strdup(yytext); return VALUE; }
    <RHS>\n        { BEGIN(INITIAL); }


Note: The RHS rules send nothing if there is no value. That shouldn't be a problem for a parser but if it is, you can fix it reasonably easily.
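
For anyone who wants to check the two-context idea outside lex, here is a minimal sketch in plain C. It is not generated scanner code, only a hand-written imitation of the INITIAL and RHS contexts for a single line. The `KeyValue` type and `split_line` function are hypothetical names, and it assumes a key is one lowercase letter followed by `=`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: pointers into the scanned line, not copies. */
typedef struct {
    const char *key;    /* start of the key name, or NULL if the line has no key */
    size_t key_len;     /* length of the key name, excluding '=' */
    const char *value;  /* first character after '=' */
    size_t value_len;   /* length of the value, excluding the newline */
} KeyValue;

/* Split one line using two contexts, mirroring INITIAL and RHS. */
static KeyValue split_line(const char *line)
{
    KeyValue kv = { NULL, 0, NULL, 0 };
    assert(line != NULL);

    /* INITIAL context: skip leading blanks, then expect "<letter>=".
       Assumption: keys are a single lowercase letter. */
    const char *p = line;
    while (*p == ' ' || *p == '\t' || *p == '\r')
        p++;
    if (!(islower((unsigned char)*p) && p[1] == '='))
        return kv;  /* not a key line */
    kv.key = p;
    kv.key_len = 1;

    /* RHS context: everything up to the newline is the value,
       including any later "a=" it happens to contain. */
    kv.value = p + 2;
    kv.value_len = strcspn(kv.value, "\n");
    return kv;
}
```

Called on the line `     a=123 a= `, this sketch reports the key `a` and the value `123 a= `. The second `a=` is never tested as a key because the scan has already switched to the value context, which is the job the RHS start condition does in the lex version.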






  • Support for start conditions opens a lot of possibilities that otherwise would be very difficult or impossible to implement. Thanks for showing the way...

    – RTC
    Jan 1 at 4:17











edited Dec 28 '18 at 22:03

























answered Dec 28 '18 at 21:19









rici












