How can I extract tuples from a string?
I have the following string:
r"(A1,B1,C1,D1),(A2,B2,C2,D2),..."
and I want to extract a list of tuples
[(A1,B1,C1,D1),(A2,B2,C2,D2),...]
A, B and D are integers, while C is a string enclosed in single quotes. The hard part is that C might contain any character, including escaped single quotes (\'), commas (,), escaped backslashes (\\) and integers. I am trying to solve this problem using regexes, but I can't figure out how to do it.
So far, I've tried to match the end of the string by looking for the first single quote that is preceded by an even number of backslashes (0, 2, 4, ...), but I can't get it to work. Any ideas?
Expected results:
r"(21,3,'abc\',57',1993)" --> (21,3,'abc\',57',1993)
r"(21,3,'abc\\',1993)" --> (21,3,'abc\\',1993)
r"(21,3,'abc\\\\\',57\\\\',1993)" --> (21,3,'abc\\\\\',57\\\\',1993)
python regex
asked Dec 30 '18 at 23:52
Riccardo Bucco
You should look into ast.literal_eval. – Scott Hunter Dec 31 '18 at 0:00
Try re.findall(r"""\((\d+),(\d+),('[^'\\]*(?:\\.[^'\\]*)*'),(\d+)\)""", s), see regex101.com/r/3DMXyZ/1 and ideone.com/DlP6we – Wiktor Stribiżew Dec 31 '18 at 0:09
What is the source of these strings? – juanpa.arrivillaga Dec 31 '18 at 2:24
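The regex suggested in the comments can be turned into a small helper along these lines. This is only a sketch: the pattern is written here with its backslashes restored (an assumption about the commenter's intent), and the names TUPLE_RE, extract_tuples and sample are made up for illustration.

```python
import re

# Assumed reconstruction of the commented pattern: one tuple looks like
# (int,int,'string',int), where the string may contain backslash escapes.
TUPLE_RE = re.compile(r"""\((\d+),(\d+),('[^'\\]*(?:\\.[^'\\]*)*'),(\d+)\)""")


def extract_tuples(s):
    """Return a list of (int, int, str, int); C keeps its quotes and escapes."""
    return [(int(a), int(b), c, int(d)) for a, b, c, d in TUPLE_RE.findall(s)]


# Hypothetical input with two tuples, built from the question's examples.
sample = r"(21,3,'abc\',57',1993),(7,8,'x\\',9)"
for row in extract_tuples(sample):
    print(row)
```

The quoted part is matched as "non-quote, non-backslash characters, then any number of backslash-plus-one-character pairs", so a quote only closes the string when it is not consumed as part of an escape. C is returned exactly as written in the input, which is what the expected results in the question ask for.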
2 Answers
You can use ast.literal_eval to evaluate a string containing Python literals:
import ast
ip = r"(21,3,'abc\',57',1993)"
op = ast.literal_eval(ip)
print(op)
# output:
# (21, 3, "abc',57", 1993)
# verify that the elements have the expected types
for i in op:
    print("{} is {}".format(i, type(i)))
# output:
# 21 is <class 'int'>
# 3 is <class 'int'>
# abc',57 is <class 'str'>
# 1993 is <class 'int'>
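For the multi-tuple string from the question, the same idea might be extended roughly as follows. This is a sketch, not part of the original answer: the helper name parse_tuples and the two-tuple input are assumptions made for illustration.

```python
import ast


def parse_tuples(s):
    """Parse "(..),(..),..." into a list of tuples using ast.literal_eval."""
    # Wrapping the input in brackets turns it into a list literal, so one
    # tuple and several comma-separated tuples are handled the same way.
    return ast.literal_eval("[" + s + "]")


# Hypothetical input with two tuples, built from the question's examples.
rows = parse_tuples(r"(21,3,'abc\',57',1993),(7,8,'x\\',9)")
print(rows)
```

Note that ast.literal_eval interprets the escape sequences, so C comes back unescaped and without its surrounding quotes, which differs from the raw form shown in the question's expected results. Malformed input raises ValueError or SyntaxError.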
edited Dec 31 '18 at 0:12
answered Dec 31 '18 at 0:09
Sufiyan Ghori
You can use the pattern
(?<=')(?:\\\\|\\'|[^'])+(?=',)|\d+
For the string content (it looks behind and ahead for a '), it repeats a group composed of either:
\\\\ - two backslashes (that is, an escaped backslash representing a single literal backslash)
\\' - an escaped ' (that is, representing a single literal ')
[^'] - anything but a quote character
Or it matches \d+, the integers.
https://regex101.com/r/5beqXJ/1
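A minimal sketch of how this pattern could be applied with re.findall, one tuple at a time. The pattern is written with its backslashes restored (an assumption), and TOKEN_RE and extract_fields are hypothetical names, not part of the answer.

```python
import re

# Assumed form of the pattern above: string content between quotes, or digits.
TOKEN_RE = re.compile(r"(?<=')(?:\\\\|\\'|[^'])+(?=',)|\d+")


def extract_fields(s):
    """Return (A, B, C, D) for a single tuple; C stays escaped, without quotes."""
    # Unpacking raises ValueError if the pattern does not yield four tokens.
    a, b, c, d = TOKEN_RE.findall(s)
    return int(a), int(b), c, int(d)


print(extract_fields(r"(21,3,'abc\',57',1993)"))
```

Because findall returns a flat list of strings, the integers have to be converted by hand. The `+` quantifier also means an empty C would produce no token, so this sketch assumes C is never empty.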
answered Dec 31 '18 at 0:08
CertainPerformance