Extract substrings separately from a string using python regex

I am trying to write a regular expression that returns the part of a string that comes after a given substring. For example, I want to get everything, spaces included, that follows "15/08/2017".

a='''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278  

BLOCK 16  

LOT 21  

EXCEPTING THEREOUT ALL MINES AND MINERALS  



ESTATE: FEE SIMPLE  

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



---------------------------------------------------------------------------- 

----

             REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

--------------------------------------------------------------------------- 

-- 

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''

Is there a way to get 'AFFIDAVIT OF' and 'CASH & MTGE' as separate strings?

Here is the expression I have pieced together so far:

doc = (a.split('15/08/2017', 1)[1]).strip()

'AFFIDAVIT OF                       CASH & MTGE'

edited Dec 26 '18 at 4:16

CodeIt

67311020

asked Dec 26 '18 at 3:54

User123

2001416

I have edited with the actual input string.

– User123
Dec 21 '18 at 6:11

Okay, is there any way to do this using regex?

– User123
Dec 31 '18 at 4:15

Why do you want to do this with regex? Are you willing to accept any other solution?

– Mad Physicist
Dec 31 '18 at 4:29

Yes if there is a better way other than regex

– User123
Dec 31 '18 at 4:30


python regex python-3.x


11 Answers

Not a regex-based solution, but it does the trick.

a='''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278  

BLOCK 16  

LOT 21  

EXCEPTING THEREOUT ALL MINES AND MINERALS  



ESTATE: FEE SIMPLE  

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



---------------------------------------------------------------------------- 

----

            REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

--------------------------------------------------------------------------- 

-- 

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



doc = (a.split('15/08/2017', 1)[1]).strip() 

# used split with two white spaces instead of one to get the desired result

print(doc.split("  ")[0].strip()) # outputs AFFIDAVIT OF

print(doc.split("  ")[-1].strip()) # outputs CASH & MTGE

Hope it helps.

answered Dec 26 '18 at 4:00

CodeIt

67311020

See it in action here.

– CodeIt
Dec 26 '18 at 4:03


re based code snippet

import re

foo = '''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278

BLOCK 16

LOT 21

EXCEPTING THEREOUT ALL MINES AND MINERALS



ESTATE: FEE SIMPLE

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



----------------------------------------------------------------------------

----

             REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

---------------------------------------------------------------------------

--

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



pattern = r'.*\d{2}/\d{2}/\d{4}\s+(\w+\s+\w+)\s+(\w+\s+.*\s+\w+)'

result = re.findall(pattern, foo, re.MULTILINE)

print("1st match: ", result[0][0])

print("2nd match: ", result[0][1])

Output

1st match:  AFFIDAVIT OF

2nd match:  CASH & MTGE
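The pattern above assumes the document type is exactly two words. A minimal sketch of a looser variant follows, assuming each column is a run of words joined by single spaces and that columns are separated by two or more spaces. The group names doc_type and consideration and the one-line sample are illustrative, not part of the original answer.

```python
import re

# Hypothetical one-line sample taken from the last row of the input above.
line = "172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE"

pattern = re.compile(
    r"\d{2}/\d{2}/\d{4}\s+"             # the date and the gap after it
    r"(?P<doc_type>\S+(?: \S+)*)"       # words joined by single spaces
    r"\s{2,}"                           # column gap: two or more spaces
    r"(?P<consideration>\S+(?: \S+)*)"  # next column, same shape
)

m = pattern.search(line)
doc_type = m.group("doc_type")
consideration = m.group("consideration")

print(doc_type)
print(consideration)
```

If a row also has a VALUE column, the second group would capture that column instead, so this only fits rows with exactly two fields after the date.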

answered Dec 26 '18 at 4:19

Sharad

2,14111024


We can try using re.findall with the following pattern:

PHASED OF (((?!\bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)

Searching in multiline and DOTALL mode, the above pattern captures everything after PHASED OF up to, but not including, CONDOMINIUM PLAN, even when that text spans a line break.

import re

input = "182 246 612    01/10/2018  PHASED OF                           CASH & MTGE\n        CONDOMINIUM PLAN"

result = re.findall(r'PHASED OF (((?!\bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE)

output = result[0][0].strip()

print(output)



CASH & MTGE

Note that I also strip off whitespace from the match. We might be able to modify the regex pattern to do this, but in a general solution, maybe you want to keep some of the whitespace, in certain cases.
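For comparison, here is a rough sketch showing that, on this sample, a lazy quantifier with the same lookahead captures the same text as the tempered pattern. The variable name text is my own choice (it avoids shadowing the built-in input), and the sample string is the one from the answer.

```python
import re

text = ("182 246 612    01/10/2018  PHASED OF                           "
        "CASH & MTGE\n        CONDOMINIUM PLAN")

# Tempered dot: consume one character at a time while the terminator does not start here.
tempered = re.search(
    r"PHASED OF (((?!\bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)", text, re.DOTALL
)

# Lazy quantifier: expand as little as possible until the lookahead succeeds.
lazy = re.search(r"PHASED OF (.*?)(?=CONDOMINIUM PLAN)", text, re.DOTALL)

tempered_value = tempered.group(1).strip()
lazy_value = lazy.group(1).strip()

print(tempered_value)
print(lazy_value)
```

Both need re.DOTALL so that "." can cross the line break between the two halves of the document type.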

edited Dec 31 '18 at 4:34

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

The thing is, the string below DOCUMENT TYPE may or may not span multiple lines. If it does span multiple lines, the solution should take that into account.

– User123
Dec 31 '18 at 4:36

My answer covers a multiline situation. If you see a flaw in my answer, then state exactly what it is.

– Tim Biegeleisen
Dec 31 '18 at 4:40

I can't understand what this does: result = re.findall(r'PHASED OF (((?!\bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE). Can't we give 'PHASED OF CONDOMINIUM PLAN' as a single word?

– User123
Dec 31 '18 at 4:46

No, we can't, hence I initially commented under your question that there is no answer. You need to match across lines.

– Tim Biegeleisen
Dec 31 '18 at 4:50

Okay, fine. What modification is needed if there is no multiline word after the date?

– User123
Dec 31 '18 at 4:52


Why regular expressions?

It looks like you know the exact delimiting string, so just str.split() on it and take the first part:

In [1]: a='172 211 342    15/08/2017  TRANSFER OF LAND   $610,000        CASH & MTGE'



In [2]: a.split("15/08/2017", 1)[0]

Out[2]: '172 211 342    '

answered Dec 21 '18 at 6:05

alecxe

325k70630858

It won't work for the input string which I have edited now.

– User123
Dec 21 '18 at 6:16

@Farook in this state it won't, right. You could though adjust the solution and split it on a newline first, but in that case, regex would be able to do it in one go.

– alecxe
Dec 21 '18 at 6:17


I would avoid a single matching regex here, because the only meaningful separator between the logical terms appears to be two or more spaces, and individual terms, including the one you want to match, may themselves contain spaces. So I recommend doing a regex split on the input using \s{2,} as the pattern. This yields a list containing all the terms. We can then walk down the list once and, when we find the term that follows the one we want, return the previous term in the list.

import re

a = "172 211 342    15/08/2017  TRANSFER OF LAND   $610,000        CASH & MTGE"

parts = re.compile(r"\s{2,}").split(a)

print(parts)



for i in range(1, len(parts)):

    if (parts[i] == "15/08/2017"):

        print(parts[i-1])



['172 211 342', '15/08/2017', 'TRANSFER OF LAND', '$610,000', 'CASH & MTGE']

172 211 342
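The same split can also return the terms that follow the date, which is what the edited question asks for. This is only a sketch: the helper name fields_after is hypothetical, and it assumes the date appears as its own column in the row.

```python
import re

def fields_after(line, marker):
    """Return the columns that follow `marker`, splitting on 2+ whitespace characters."""
    parts = re.split(r"\s{2,}", line.strip())
    # list.index raises ValueError if the marker is not its own column.
    return parts[parts.index(marker) + 1:]

a = "172 211 342    15/08/2017  TRANSFER OF LAND   $610,000        CASH & MTGE"
after_date = fields_after(a, "15/08/2017")

document_type = after_date[0]
consideration = after_date[-1]

print(after_date)
```

For the multi-line input, one option is to pass only the line that contains the date, for example by looping over a.splitlines() first.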

answered Dec 21 '18 at 5:54

Tim Biegeleisen

223k1391143


Use a positive lookbehind assertion:

import re

m = re.search(r'(?<=15/08/2017).*', a)

m.group(0)
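The lookbehind returns everything after the date as one string. A possible follow-up, sketched here on the assumption that the columns are separated by two or more spaces, splits that tail into its terms. The one-line sample and the names tail and fields are illustrative.

```python
import re

# Hypothetical one-line sample: the last row of the input in the question.
a = "172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE"

# "." does not match a newline by default, so the match stops at the end of the line.
m = re.search(r"(?<=15/08/2017).*", a)
tail = m.group(0)

# Split the remainder of the row on runs of two or more whitespace characters.
fields = re.split(r"\s{2,}", tail.strip())

print(fields)
```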

answered Dec 26 '18 at 5:10

PIG

1247


You have to return the right group:

re.match("(.*?)15/08/2017",a).group(1)

answered Dec 21 '18 at 5:53

RoyaumeIX

1,2491725


You need to use group(1)

import re

re.match("(.*?)15/08/2017",a).group(1)

Output

'172 211 342    '

answered Dec 21 '18 at 5:54

Rishi Bansal

740217


Building on your expression, this is what I believe you need:

import re



a='172 211 342    15/08/2017  TRANSFER OF LAND   $610,000        CASH & MTGE'

re.match(r"(.*?)(\w+/)",a).group(1)

Output:

'172 211 342    '

answered Dec 21 '18 at 6:08

silverhash

342110


You can do this by using group(1)

re.match("(.*?)15/08/2017",a).group(1)

UPDATE

For the updated string, you can use re.search instead of re.match:

re.search("(.*?)15/08/2017",a).group(1)
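The difference matters for the multi-line input: re.match only tries to match at the start of the string, while re.search scans forward for the first position where the pattern fits. A small sketch, using a hypothetical shortened stand-in for the real input:

```python
import re

# Hypothetical two-line stand-in for the multi-line string in the question.
text = "REGISTRATION    DATE(DMY)\n172 211 342    15/08/2017  AFFIDAVIT OF"

# Anchored at position 0; "." cannot cross the newline, so the date is never reached.
matched = re.match(r"(.*?)15/08/2017", text)

# Scans forward and succeeds at the start of the second line.
found = re.search(r"(.*?)15/08/2017", text)
prefix = found.group(1)

print(matched)
print(repr(prefix))
```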

edited Dec 21 '18 at 6:17

answered Dec 21 '18 at 5:50

Muhammad Bilal

1,73011022

This will give incorrect results if there are more than one term before 15/08/2017.

– Tim Biegeleisen
Dec 21 '18 at 5:57

I have edited my input string. It didn't work for the string which is edited now

– User123
Dec 21 '18 at 6:10

This will fail completely if the desired term is anything other than the first term.

– Tim Biegeleisen
Dec 21 '18 at 6:25


Your problem is that your string is formatted the way it is.
The line you are looking for is

182 246 612 01/10/2018 PHASED OF CASH & MTGE

And then you are looking for whatever comes after 'PHASED OF' and some spaces.

You want to search for

(?<=PHASED OF)\s*(?P<your_text>.*?)\n

in your string. This will return a match object containing the value you are looking for in the named group your_text.

m = re.search(r'(?<=PHASED OF)\s*(?P<your_text>.*?)\n', a)

your_desired_text = m.group('your_text')

Also: There are many good online regex testers to fiddle around with your regexes.
And only after finishing up the regex just copy and paste it into python.

I use this one: https://regex101.com/

answered Dec 31 '18 at 4:34

Kanjiu

42110

I am not searching for whatever comes after 'PHASED OF' and some spaces. Instead, I am searching for the string after the entire term below DOCUMENT TYPE, i.e. 'PHASED OF CONDOMINIUM PLAN'.

– User123
Dec 31 '18 at 4:39

"I need to get the string after the word 'PHASED OF CONDOMINIUM PLAN' which should returns 'CASH & MTGE' I have tried using the below expression". Where did i go wrong?

– Kanjiu
Dec 31 '18 at 4:43



11 Answers
11

active

oldest

votes

11 Answers
11

active

oldest

votes

Not a regex based solution. But does the trick.

a='''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278  

BLOCK 16  

LOT 21  

EXCEPTING THEREOUT ALL MINES AND MINERALS  



ESTATE: FEE SIMPLE  

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



---------------------------------------------------------------------------- 

----

            REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

--------------------------------------------------------------------------- 

-- 

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



doc = (a.split('15/08/2017', 1)[1]).strip() 

# used split with two white spaces instead of one to get the desired result

print(doc.split("  ")[0].strip()) # outputs AFFIDAVIT OF

print(doc.split("  ")[-1].strip()) # outputs CASH & MTGE

Hope it helps.

answered Dec 26 '18 at 4:00

CodeIt

67311020

See it in action here.

– CodeIt
Dec 26 '18 at 4:03

add a comment |

Not a regex based solution. But does the trick.

a='''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278  

BLOCK 16  

LOT 21  

EXCEPTING THEREOUT ALL MINES AND MINERALS  



ESTATE: FEE SIMPLE  

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



---------------------------------------------------------------------------- 

----

            REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

--------------------------------------------------------------------------- 

-- 

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



doc = (a.split('15/08/2017', 1)[1]).strip() 

# used split with two white spaces instead of one to get the desired result

print(doc.split("  ")[0].strip()) # outputs AFFIDAVIT OF

print(doc.split("  ")[-1].strip()) # outputs CASH & MTGE

Hope it helps.

answered Dec 26 '18 at 4:00

CodeIt

67311020

See it in action here.

– CodeIt
Dec 26 '18 at 4:03

add a comment |

Not a regex based solution. But does the trick.

a='''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278  

BLOCK 16  

LOT 21  

EXCEPTING THEREOUT ALL MINES AND MINERALS  



ESTATE: FEE SIMPLE  

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



---------------------------------------------------------------------------- 

----

            REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

--------------------------------------------------------------------------- 

-- 

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



doc = (a.split('15/08/2017', 1)[1]).strip() 

# used split with two white spaces instead of one to get the desired result

print(doc.split("  ")[0].strip()) # outputs AFFIDAVIT OF

print(doc.split("  ")[-1].strip()) # outputs CASH & MTGE

Hope it helps.

answered Dec 26 '18 at 4:00

CodeIt

67311020

Not a regex based solution. But does the trick.

a='''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278  

BLOCK 16  

LOT 21  

EXCEPTING THEREOUT ALL MINES AND MINERALS  



ESTATE: FEE SIMPLE  

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



---------------------------------------------------------------------------- 

----

            REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

--------------------------------------------------------------------------- 

-- 

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



doc = (a.split('15/08/2017', 1)[1]).strip() 

# used split with two white spaces instead of one to get the desired result

print(doc.split("  ")[0].strip()) # outputs AFFIDAVIT OF

print(doc.split("  ")[-1].strip()) # outputs CASH & MTGE

Hope it helps.

answered Dec 26 '18 at 4:00

CodeIt

67311020

answered Dec 26 '18 at 4:00

CodeIt

67311020

answered Dec 26 '18 at 4:00

CodeIt

67311020

answered Dec 26 '18 at 4:00

CodeIt

67311020

See it in action here.

– CodeIt
Dec 26 '18 at 4:03

add a comment |

See it in action here.

– CodeIt
Dec 26 '18 at 4:03

See it in action here.

– CodeIt
Dec 26 '18 at 4:03

add a comment |

re based code snippet

import re

foo = '''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278

BLOCK 16

LOT 21

EXCEPTING THEREOUT ALL MINES AND MINERALS



ESTATE: FEE SIMPLE

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



----------------------------------------------------------------------------

----

             REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

---------------------------------------------------------------------------

--

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



pattern = '.*d{2}/d{2}/d{4}s+(w+s+w+)s+(w+s+.*s+w+)'

result = re.findall(pattern, foo, re.MULTILINE)

print "1st match: ", result[0][0]

print "2nd match: ", result[0][1]

Output

1st match:  AFFIDAVIT OF

2nd match:  CASH & MTGE

answered Dec 26 '18 at 4:19

Sharad

2,14111024

add a comment |

re based code snippet

import re

foo = '''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278

BLOCK 16

LOT 21

EXCEPTING THEREOUT ALL MINES AND MINERALS



ESTATE: FEE SIMPLE

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



----------------------------------------------------------------------------

----

             REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

---------------------------------------------------------------------------

--

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



pattern = '.*d{2}/d{2}/d{4}s+(w+s+w+)s+(w+s+.*s+w+)'

result = re.findall(pattern, foo, re.MULTILINE)

print "1st match: ", result[0][0]

print "2nd match: ", result[0][1]

Output

1st match:  AFFIDAVIT OF

2nd match:  CASH & MTGE

answered Dec 26 '18 at 4:19

Sharad

2,14111024

add a comment |

re based code snippet

import re

foo = '''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278

BLOCK 16

LOT 21

EXCEPTING THEREOUT ALL MINES AND MINERALS



ESTATE: FEE SIMPLE

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



----------------------------------------------------------------------------

----

             REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

---------------------------------------------------------------------------

--

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



pattern = '.*d{2}/d{2}/d{4}s+(w+s+w+)s+(w+s+.*s+w+)'

result = re.findall(pattern, foo, re.MULTILINE)

print "1st match: ", result[0][0]

print "2nd match: ", result[0][1]

Output

1st match:  AFFIDAVIT OF

2nd match:  CASH & MTGE

answered Dec 26 '18 at 4:19

Sharad

2,14111024

re based code snippet

import re

foo = '''S

LINC             SHORT LEGAL                                   TITLE NUMBER

0037 471 661     1720278;16;21                                 172 211 342



LEGAL DESCRIPTION

PLAN 1720278

BLOCK 16

LOT 21

EXCEPTING THEREOUT ALL MINES AND MINERALS



ESTATE: FEE SIMPLE

ATS REFERENCE: 4;24;54;2;SW



MUNICIPALITY: CITY OF EDMONTON



REFERENCE NUMBER: 172 023 641 +71



----------------------------------------------------------------------------

----

             REGISTERED OWNER(S)

REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION

---------------------------------------------------------------------------

--

---



172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE'''



pattern = '.*d{2}/d{2}/d{4}s+(w+s+w+)s+(w+s+.*s+w+)'

result = re.findall(pattern, foo, re.MULTILINE)

print "1st match: ", result[0][0]

print "2nd match: ", result[0][1]

Output

1st match:  AFFIDAVIT OF

2nd match:  CASH & MTGE

answered Dec 26 '18 at 4:19

Sharad

2,14111024

answered Dec 26 '18 at 4:19

Sharad

2,14111024

answered Dec 26 '18 at 4:19

Sharad

2,14111024

answered Dec 26 '18 at 4:19

Sharad

2,14111024

add a comment |

We can try using re.findall with the following pattern:

PHASED OF ((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)

Searching in multiline and DOTALL mode, the above pattern will match everything occurring between PHASED OF until, but not including, CONDOMINIUM PLAN.

input = "182 246 612    01/10/2018  PHASED OF                           CASH & MTGEn        CONDOMINIUM PLAN"

result = re.findall(r'PHASED OF (((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE)

output = result[0][0].strip()

print(output)



CASH & MTGE

edited Dec 31 '18 at 4:34

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

The thing is the string below DOCUMENT TYPE may be multiline and need not be necessarily a multiline. If it is multiline, it should consider it.

– User123
Dec 31 '18 at 4:36

My answer covers a multiline situation. If you see a flaw in my answer, then state exactly what it is.

– Tim Biegeleisen
Dec 31 '18 at 4:40

I cant get you what this does result = re.findall(r'PHASED OF (((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE). Cant we give 'PHASED OF CONDOMINIUM PLAN' as single word ?

– User123
Dec 31 '18 at 4:46

No, we can't, hence I initially commented under your question that there is no answer. You need to match across lines.

– Tim Biegeleisen
Dec 31 '18 at 4:50

Okay fine what will be the modification that needs to be done if there is no multinline word after date?

– User123
Dec 31 '18 at 4:52

add a comment |

We can try using re.findall with the following pattern:

PHASED OF ((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)

Searching in multiline and DOTALL mode, the above pattern will match everything occurring between PHASED OF until, but not including, CONDOMINIUM PLAN.

input = "182 246 612    01/10/2018  PHASED OF                           CASH & MTGEn        CONDOMINIUM PLAN"

result = re.findall(r'PHASED OF (((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE)

output = result[0][0].strip()

print(output)



CASH & MTGE

edited Dec 31 '18 at 4:34

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

The thing is the string below DOCUMENT TYPE may be multiline and need not be necessarily a multiline. If it is multiline, it should consider it.

– User123
Dec 31 '18 at 4:36

My answer covers a multiline situation. If you see a flaw in my answer, then state exactly what it is.

– Tim Biegeleisen
Dec 31 '18 at 4:40

I cant get you what this does result = re.findall(r'PHASED OF (((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE). Cant we give 'PHASED OF CONDOMINIUM PLAN' as single word ?

– User123
Dec 31 '18 at 4:46

No, we can't, hence I initially commented under your question that there is no answer. You need to match across lines.

– Tim Biegeleisen
Dec 31 '18 at 4:50

Okay fine what will be the modification that needs to be done if there is no multinline word after date?

– User123
Dec 31 '18 at 4:52

add a comment |

We can try using re.findall with the following pattern:

PHASED OF ((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)

Searching in multiline and DOTALL mode, the above pattern will match everything occurring between PHASED OF until, but not including, CONDOMINIUM PLAN.

input = "182 246 612    01/10/2018  PHASED OF                           CASH & MTGEn        CONDOMINIUM PLAN"

result = re.findall(r'PHASED OF (((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE)

output = result[0][0].strip()

print(output)



CASH & MTGE

edited Dec 31 '18 at 4:34

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

We can try using re.findall with the following pattern:

PHASED OF ((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)

Searching in multiline and DOTALL mode, the above pattern will match everything occurring between PHASED OF until, but not including, CONDOMINIUM PLAN.

input = "182 246 612    01/10/2018  PHASED OF                           CASH & MTGEn        CONDOMINIUM PLAN"

result = re.findall(r'PHASED OF (((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE)

output = result[0][0].strip()

print(output)



CASH & MTGE

edited Dec 31 '18 at 4:34

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

edited Dec 31 '18 at 4:34

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

answered Dec 31 '18 at 4:29

Tim Biegeleisen

223k1391143

The thing is the string below DOCUMENT TYPE may be multiline and need not be necessarily a multiline. If it is multiline, it should consider it.

– User123
Dec 31 '18 at 4:36

My answer covers a multiline situation. If you see a flaw in my answer, then state exactly what it is.

– Tim Biegeleisen
Dec 31 '18 at 4:40

I cant get you what this does result = re.findall(r'PHASED OF (((?!bCONDOMINIUM PLAN).)*)(?=CONDOMINIUM PLAN)', input, re.DOTALL|re.MULTILINE). Cant we give 'PHASED OF CONDOMINIUM PLAN' as single word ?

– User123
Dec 31 '18 at 4:46

No, we can't, hence I initially commented under your question that there is no answer. You need to match across lines.

– Tim Biegeleisen
Dec 31 '18 at 4:50

Okay fine what will be the modification that needs to be done if there is no multinline word after date?

– User123
Dec 31 '18 at 4:52

Why regular expressions?

It looks like you know the exact delimiting string, so just str.split() on it and take the first part:

In [1]: a='172 211 342    15/08/2017  TRANSFER OF LAND   $610,000        CASH & MTGE'



In [2]: a.split("15/08/2017", 1)[0]

Out[2]: '172 211 342    '

answered Dec 21 '18 at 6:05

alecxe

325k70630858

It won't work for the input string, which I have now edited.

– User123
Dec 21 '18 at 6:16

@Farook Right, in its current state it won't. You could adjust the solution to split on newlines first, but in that case a regex would be able to do it in one go.

– alecxe
Dec 21 '18 at 6:17
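The "split on a newline first" idea from the comment above could look roughly like this. It is only a sketch: the helper name `text_after` and the shortened sample `doc` are assumptions for illustration, and it returns what follows the date, which is what the edited question asks for:

```python
def text_after(document, marker):
    """Return the text following `marker` on the first line that contains it."""
    for line in document.splitlines():
        # partition returns (before, separator, after); separator is empty
        # when the marker does not occur on this line.
        before, sep, after = line.partition(marker)
        if sep:
            return after.strip()
    return None

doc = (
    "REGISTRATION    DATE(DMY)  DOCUMENT TYPE      VALUE           CONSIDERATION\n"
    "172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE"
)
print(text_after(doc, "15/08/2017"))
```

The result is still a single string with the column padding inside it, so separating the two columns needs one more step.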

import re

a = "172 211 342    15/08/2017  TRANSFER OF LAND   $610,000        CASH & MTGE"

# Split on runs of two or more whitespace characters (the column gaps)
parts = re.compile(r"\s{2,}").split(a)

print(parts)

# Print the column immediately before the date
for i in range(1, len(parts)):
    if parts[i] == "15/08/2017":
        print(parts[i-1])

['172 211 342', '15/08/2017', 'TRANSFER OF LAND', '$610,000', 'CASH & MTGE']

172 211 342
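The same split can also return the columns after the date as separate strings, which is what the edited question asks for. A minimal sketch, assuming columns are always separated by at least two spaces and never contain a double space themselves; `columns_after` is a hypothetical helper name:

```python
import re

def columns_after(line, marker):
    """Split a row on runs of 2+ whitespace and return the columns after `marker`."""
    cols = re.split(r"\s{2,}", line.strip())
    if marker not in cols:
        return []
    return cols[cols.index(marker) + 1:]

row = "172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE"
print(columns_after(row, "15/08/2017"))
```

For the multi-line document, this would be applied to the individual line that contains the date.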

answered Dec 21 '18 at 5:54

Tim Biegeleisen

223k1391143

Use a positive lookbehind assertion:

m = re.search(r'(?<=15/08/2017).*', a)

m.group(0)
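Because `.` does not match a newline by default, the match above stops at the end of the line that holds the date. As a sketch only, the lookbehind can be extended with two capture groups so the columns come back separately. This assumes exactly two columns follow the date and that they are separated by at least two spaces; the variable names are illustrative:

```python
import re

a = "172 211 342    15/08/2017  AFFIDAVIT OF                       CASH & MTGE"

# Group 1 is lazy, so it stops at the first run of 2+ whitespace characters.
# Group 2 takes the rest of the line.
m = re.search(r"(?<=15/08/2017)\s+(.+?)\s{2,}(.+)", a)
doc_type, consideration = m.group(1), m.group(2).strip()

print(doc_type)
print(consideration)
```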

answered Dec 26 '18 at 5:10

PIG

1247

You have to return the right group:

re.match("(.*?)15/08/2017",a).group(1)

answered Dec 21 '18 at 5:53

RoyaumeIX

1,2491725

You need to use group(1):

import re

re.match("(.*?)15/08/2017",a).group(1)

Output

'172 211 342    '

answered Dec 21 '18 at 5:54

Rishi Bansal

740217

Building on your expression, this is what I believe you need:

import re



a='172 211 342    15/08/2017  TRANSFER OF LAND   $610,000        CASH & MTGE'

re.match(r"(.*?)(\w+/)", a).group(1)

Output:

'172 211 342    '

answered Dec 21 '18 at 6:08

silverhash

342110

You can do this by using group(1)

re.match("(.*?)15/08/2017",a).group(1)

UPDATE

For the updated string, you can use re.search instead of re.match:

re.search("(.*?)15/08/2017",a).group(1)
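The difference between the two calls shows up as soon as the date is not on the first line. The two-line `text` below is a made-up excerpt for illustration, not the full input from the question:

```python
import re

# Hypothetical two-line excerpt; the date sits on the second line.
text = "REGISTRATION    DATE(DMY)\n172 211 342    15/08/2017  AFFIDAVIT OF"
pattern = r"(.*?)15/08/2017"

# re.match only tries position 0, and "." does not cross the newline.
matched = re.match(pattern, text)

# re.search retries at later positions, so it succeeds on the second line.
found = re.search(pattern, text)
before_date = found.group(1)

print(matched)
print(repr(before_date))
```

Either way, group(1) holds only what precedes the date on that line.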

edited Dec 21 '18 at 6:17

answered Dec 21 '18 at 5:50

Muhammad Bilal

1,73011022

This will give incorrect results if there is more than one term before 15/08/2017.

– Tim Biegeleisen
Dec 21 '18 at 5:57

I have edited my input string. It didn't work for the edited string.

– User123
Dec 21 '18 at 6:10

This will fail completely if the desired term is anything other than the first term.

– Tim Biegeleisen
Dec 21 '18 at 6:25

Your problem is that your string is formatted the way it is.
The line you are looking for is

182 246 612 01/10/2018 PHASED OF CASH & MTGE

And then you are looking for whatever comes after 'PHASED OF' and some spaces.

You want to search for

(?<=PHASED OF)\s*(?P<your_text>.*?)\n

in your string. This will return a match object containing the value you are looking for in the named group your_text.

m = re.search(r'(?<=PHASED OF)\s*(?P<your_text>.*?)\n', a)

your_desired_text = m.group('your_text')
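A slightly more defensive variant of the same named-group idea, offered as a sketch: it anchors on the end of the line with `$` and re.MULTILINE instead of a literal newline, so it also works when the row is the last line of the string, and it returns None when the phrase is missing. The helper name `after_phrase` and the group name `value` are hypothetical:

```python
import re

def after_phrase(text, phrase):
    """Return the rest of the line after `phrase`, without surrounding spaces."""
    pattern = (
        r"(?<=" + re.escape(phrase) + r")"   # position right after the phrase
        r"[ \t]*(?P<value>.*?)[ \t]*$"       # lazily capture up to the line end
    )
    m = re.search(pattern, text, re.MULTILINE)
    return m.group("value") if m else None

a = "182 246 612    01/10/2018  PHASED OF                           CASH & MTGE\n        CONDOMINIUM PLAN"
print(after_phrase(a, "PHASED OF"))
```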

Also: there are many good online regex testers for experimenting with your regexes.
Only once the regex is finished, copy and paste it into Python.

I use this one: https://regex101.com/

answered Dec 31 '18 at 4:34

Kanjiu

42110

I am not searching for whatever comes after 'PHASED OF' and some spaces. Instead, I am searching for the string after the entire phrase below DOCUMENT TYPE, i.e. 'PHASED OF CONDOMINIUM PLAN'.

– User123
Dec 31 '18 at 4:39

"I need to get the string after the word 'PHASED OF CONDOMINIUM PLAN' which should returns 'CASH & MTGE' I have tried using the below expression". Where did I go wrong?

– Kanjiu
Dec 31 '18 at 4:43
