What is the regular expression in LEX to match a string NOT starting with newline

I want to know the regular expression in lex that matches a string which does not (begin at the start of a line with optional white space followed by "a="). I am trying to parse a language with the following types of lines:

     a=some value

     b=some value

The strings "a=" ("b=", etc.) can be preceded by white space and are followed, with no white space after the =, by another string that extends up to the newline. For example:

     a=123 abcde

Here "123 abcde" is the value.
The problem is that I may encounter, at least in theory, the following:

     a=123 a=

Or worse:

     a=a=

Where the first a= is the key and the second a= is now part of the value and not the key attribute. How do I distinguish the first a= token from the second?

I can match the key "a=" with the following which handles leading whitespace:

    ^[ \r\t]*"a="

But how do I match the second string? I need a regular expression that says: match a string that does NOT (begin at the start of a line with optional white space followed by a=) and that extends up to the newline character. The main trick is to keep the expression from also matching the attribute a=.

asked Dec 28 '18 at 19:38

RTC


@wp78de Thanks, but unfortunately it does not even compile in lex. For starters, the ^ at the start means the beginning of the line, and that is precisely what I do not want. The ^ has a different interpretation in most regular expressions. In Lex it negates only inside a character class (for example, [^a-z]); otherwise it is interpreted as the beginning of a line (just as $ signifies the end of a line). There are other syntactical issues.

– RTC
Dec 28 '18 at 20:27

Arguably, flex is not the ideal tool for this problem, but I don't know exactly what you are trying to do.

– rici
Dec 28 '18 at 21:22

regex lex


1 Answer

Use a start condition to create a different lexical context for the input after the =.

Lex works best with a language in which tokenisation is not context-dependent (most programming languages but few ad hoc interchange formats). But start conditions are manageable if you don't have too many contexts to juggle.

See the manual for details and examples.

Simple example:

%x RHS

%%

[[:space:]]+  ; /* Ignore leading white space and blank lines */

a=            { BEGIN(RHS); return TOKEN_A; }

b=            { BEGIN(RHS); return TOKEN_B; }

.             ; /* Ignore other input one character at a time. A .* here would be longer than "a=" or "b=" and win under longest match. Should do something else */

<RHS>.+       { yylval = strdup(yytext); return VALUE; }

<RHS>\n       { BEGIN(INITIAL); }

Note: The RHS rules send nothing if there is no value. That shouldn't be a problem for a parser but if it is, you can fix it reasonably easily.
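The same two-context idea can be sketched without flex as a small hand-written scanner in C. This is only an illustration of how the start condition changes what "a=" means, not the code flex generates: `struct kv`, `scan_line`, the restriction of keys to a single lowercase letter, and the 128-byte value buffer are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct kv {
    char key;         /* 'a', 'b', ... or 0 if the line has no leading key */
    char value[128];  /* text after the first "key=", up to the newline */
};

/* Two lexical contexts, mirroring INITIAL and RHS in the flex rules. */
enum context { CTX_INITIAL, CTX_RHS };

struct kv scan_line(const char *line)
{
    struct kv out = { 0, "" };
    enum context ctx = CTX_INITIAL;
    size_t i = 0;

    assert(line != NULL);

    /* INITIAL: skip leading white space, then look for a key and '='. */
    while (line[i] == ' ' || line[i] == '\t' || line[i] == '\r')
        i++;
    if (islower((unsigned char)line[i]) && line[i + 1] == '=') {
        out.key = line[i];
        i += 2;
        ctx = CTX_RHS;  /* plays the role of BEGIN(RHS) */
    }

    /* RHS: copy everything up to the newline; "a=" is ordinary text here. */
    if (ctx == CTX_RHS) {
        size_t n = strcspn(line + i, "\n");
        if (n >= sizeof out.value)
            n = sizeof out.value - 1;  /* truncate overlong values */
        memcpy(out.value, line + i, n);
        out.value[n] = '\0';
    }
    return out;
}
```

Under these assumptions, calling scan_line on the line "a=a=" gives the key 'a' and the value "a=", because the second "a=" is read in the RHS context. A line such as "b=" gives the key 'b' with an empty value rather than no value at all, which is one way to handle the case mentioned in the note above.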

edited Dec 28 '18 at 22:03

answered Dec 28 '18 at 21:19

rici


Support for start conditions opens a lot of possibilities that otherwise would be very difficult or impossible to implement. Thanks for showing the way...

– RTC
Jan 1 at 4:17
