Python turn a hash into a dataframe
I have a hash file that looks like this, with one record per line:
Amy:0001:[{'name': 'Amy', 'age': '14', 'grade': '7', 'award': '0'}]
Carl:0024:[{'name': 'Carl', 'age': '12', 'grade': '6', 'award': '2'}, {'name': 'Carl', 'age': '18', 'grade': '12', 'award': '4'}, {'name': 'Carl', 'age': '13', 'grade': '6', 'award': '7'}]
and more ...
I want a DataFrame that looks like this:
           name  age  grade  award
Amy:0001   Amy   14   7      0
Carl:0024  Carl  12   6      2
Carl:0024  Carl  18   12     4
Carl:0024  Carl  13   6      7
I tried stripping the file line by line:
lines = [line.rstrip('\n') for line in open("my_file.txt")]
python pandas dictionary hash
asked Jan 4 at 18:27 by Matt-pow
edited Jan 4 at 18:28 by Yuca
2 Answers
Start with an empty DataFrame:
import pandas as pd

df = pd.DataFrame(columns=['key', 'name', 'age', 'grade', 'award'])
Read the hash file into the DataFrame line by line:
import json

hash_path = "my_file.txt"  # file name taken from the question

with open(hash_path, 'r') as f:
    for line in f:
        # split on the first two colons only: name, number, list payload
        parts = line.split(":", 2)
        key = ":".join(parts[:2])
        # json requires double quotes for strings
        rows = json.loads(parts[-1].replace("'", '"'))
        for row in rows:
            row['key'] = key
            # DataFrame.append was removed in pandas 2.0; pd.concat adds the row instead
            df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)

# set the 'key' column to the index
df.set_index('key', inplace=True)
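The quote swap in the loop above works for the sample data but would break if any value contained an apostrophe. Here is a minimal sketch, assuming every payload is a valid Python literal, that parses it with ast.literal_eval instead and returns the rows as a plain list. The helper name parse_hash_line and the sample record with an apostrophe are made up for illustration and do not come from the question.

```python
import json
from ast import literal_eval


def parse_hash_line(line):
    """Split "Name:0001:[{...}]" into its key and a list of row dicts."""
    # Split on the first two colons only; colons inside the list are left alone.
    name, number, payload = line.strip().split(":", 2)
    key = f"{name}:{number}"
    # literal_eval accepts Python-style single-quoted strings directly.
    return [{"key": key, **row} for row in literal_eval(payload)]


# Hypothetical record whose value contains an apostrophe.
line = """Dee:0042:[{'name': "D'Arcy", 'age': '15', 'grade': '9', 'award': '1'}]"""

records = parse_hash_line(line)
print(records)

# The quote-swap approach fails on the same line.
try:
    json.loads(line.split(":", 2)[-1].replace("'", '"'))
    swap_ok = True
except json.JSONDecodeError:
    swap_ok = False
print(swap_ok)
```

Collecting the records from all lines in one list and passing it to pd.DataFrame once, followed by set_index('key'), should give the same frame without growing it row by row.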
answered Jan 4 at 18:49 by chet-the-wizard
edited Jan 4 at 18:55
Here's a solution using ast.literal_eval that doesn't require explicit line-by-line iteration. You should find it considerably more efficient.
from ast import literal_eval
from io import StringIO
from itertools import chain

import numpy as np
import pandas as pd

x = """Amy:0001:[{'name': 'Amy', 'age': '14', 'grade': '7', 'award': '0'}]
Carl:0024:[{'name': 'Carl', 'age': '12', 'grade': '6', 'award': '2'}, {'name': 'Carl', 'age': '18', 'grade': '12', 'award': '4'}, {'name': 'Carl', 'age': '13', 'grade': '6', 'award': '7'}]"""

# split each line at the '[' into the id and the list payload
df = pd.read_csv(StringIO(x), delimiter='[', header=None, names=['id', 'data'])
df['id'] = df['id'].str[:-1]  # drop the trailing ':'
# put back the '[' consumed as the delimiter, then parse the list of dicts
df['data'] = df['data'].map(lambda x: literal_eval(f'[{x}'))
lens = df['data'].str.len()  # number of records per id

# repeat each id once per record and join with the flattened records
df = (pd.DataFrame({'id': np.repeat(df['id'].values, lens)})
        .join(pd.DataFrame(list(chain.from_iterable(df['data']))))
        .set_index('id'))

print(df)
          age award grade  name
id
Amy:0001   14     0     7   Amy
Carl:0024  12     2     6  Carl
Carl:0024  18     4    12  Carl
Carl:0024  13     7     6  Carl
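To make the reshaping step easier to follow, here is a plain-Python sketch of what the np.repeat and chain.from_iterable calls do to the sample data. It is an illustration only, not a replacement for the pandas version, and the variable names (raw, payloads, repeated_ids, flat, rows) are my own.

```python
from ast import literal_eval
from itertools import chain, repeat

raw = """Amy:0001:[{'name': 'Amy', 'age': '14', 'grade': '7', 'award': '0'}]
Carl:0024:[{'name': 'Carl', 'age': '12', 'grade': '6', 'award': '2'}, {'name': 'Carl', 'age': '18', 'grade': '12', 'award': '4'}, {'name': 'Carl', 'age': '13', 'grade': '6', 'award': '7'}]"""

ids, payloads = [], []
for line in raw.splitlines():
    # Same split point as delimiter='[': id on the left, list body on the right.
    head, _, tail = line.partition("[")
    ids.append(head[:-1])                      # drop the trailing ':'
    payloads.append(literal_eval("[" + tail))  # restore the '[' and parse

# One length per id, playing the role of df['data'].str.len().
lens = [len(p) for p in payloads]

# Repeat each id once per record, as np.repeat does with the id column.
repeated_ids = list(chain.from_iterable(repeat(i, n) for i, n in zip(ids, lens)))

# Flatten the per-id lists into one list of records.
flat = list(chain.from_iterable(payloads))

# Pair every record with its id; this mirrors the positional join.
rows = [{"id": i, **rec} for i, rec in zip(repeated_ids, flat)]
for row in rows:
    print(row)
```

Each dict in rows corresponds to one row of the final frame, with id as the value that ends up in the index.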
answered Jan 4 at 19:07 by jpp
edited Jan 4 at 21:25