Calculating ordinal/nominal, etcetera [on hold]
I recently started a new course but joined the training a little later than the others, which is why I'm struggling a bit with Python. Could any of you give me some pointers for the following assignment?
The program I'm supposed to write has to calculate the following:
- For every categorical (nominal or ordinal) characteristic: the mode.
- For every numerical characteristic:
  - the average and standard deviation (only for continuous data (float))
  - the median and interquartile range (only for discrete data (integer))
I've attached the file we're supposed to import the data from. Any pointers are welcome. I hope the assignment is clear enough, because I had to translate it from Dutch to English. :)
Here is a link to the file I'm supposed to import the data from at the moment. In the end the program is also supposed to work for any other file, so I'm a little confused about how to go about that:
https://www.dropbox.com/s/67dqpvlw1qgt5di/Week7.txt?dl=0
Thanks in advance! :)
python
put on hold as unclear what you're asking by glibdud, usr2564301, pault, Nate, Tomasz Mularczyk 2 days ago
Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
Hello and Welcome to StackOverflow! :-) Please have a look at How do I ask a good question? As of now, your question is really too broad to help you. Split the problem into smaller parts (like open a file, read lines, split lines, various calculations and so on). – Jayjayyy, 2 days ago
Furthermore, please always copy and paste some code to show what you have already tried. :-) If you don't know how to properly format your code, have a look at Editing help which shows how to use Markdown here on StackOverflow. – Jayjayyy, 2 days ago
Lastly, you have tagged your question with both Python versions 2 and 3. Which one is it? – Jayjayyy, 2 days ago
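To make the suggestion of splitting the problem into smaller parts a bit more concrete, here is a minimal sketch using only the standard library. It assumes Python 3, a ';'-separated file with a header row, and decimal commas. The helper names `read_columns` and `to_float` and the tiny sample data are made up for illustration; they are not part of the assignment.

```python
import io
import statistics


def read_columns(handle, delimiter=';'):
    """Read delimited text into {header: [raw string values]}."""
    lines = [line.strip() for line in handle if line.strip()]
    header = lines[0].split(delimiter)
    columns = {name: [] for name in header}
    for line in lines[1:]:
        for name, value in zip(header, line.split(delimiter)):
            columns[name].append(value)
    return columns


def to_float(text):
    """Parse a number written with a decimal comma, e.g. '10,5'."""
    return float(text.replace(',', '.'))


# made-up sample standing in for an opened file such as open('Week7.txt')
handle = io.StringIO(
    'Category;ClosePrice\n'
    'Books;10,5\n'
    'Music;20,5\n'
    'Books;30,5\n')
columns = read_columns(handle)

# one small calculation per step
most_common = statistics.mode(columns['Category'])
prices = [to_float(value) for value in columns['ClosePrice']]
average = statistics.mean(prices)
spread = statistics.stdev(prices)
print(most_common, average, spread)
```

Each step (reading, splitting, converting, calculating) can then be asked about separately if it causes trouble.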
asked 2 days ago by Lisa.P1108 (new contributor); edited 2 days ago by glibdud
1 Answer
This is how I would approach the problem with Python 3 and pandas:
import pandas as pd

# function to compute the interquartile range
def iqr(series):
    '''
    Interquartile range according to Wikipedia
    https://en.wikipedia.org/wiki/Interquartile_range
    '''
    q_1 = series.quantile(0.25)
    q_3 = series.quantile(0.75)
    return q_3 - q_1

# load data from the text file (';' separates fields, ',' is the decimal mark)
data = pd.read_csv(
    'Week7.txt',
    delimiter=';',
    decimal=',')

# define which columns should be treated and what we want to get from the data
treatment = {
    'sellerRating': ['mean', 'std', 'median', iqr],
    'Duration': ['mean', 'std', 'median', iqr],
    'ClosePrice': ['mean', 'std'],
    'OpenPrice': ['mean', 'std']}

# calculations - a single line with pandas
result = data.groupby('Category').agg(treatment)
To see the actual result you may use, for example, print(result.loc[:, 'ClosePrice'])
which gives:
                            mean         std
Category
Antique/Art/Craft      14.854783   28.759996
Automotive            108.915816  196.991289
Books                  36.636818   97.639969
Business/Industrial    16.514000   18.968667
Clothing/Accessories   35.880678   48.256547
Coins/Stamps           30.408182   48.803024
Collectibles           41.546339  121.047183
Computer               73.247143  119.775838
Electronics            74.777273   81.107397
EverythingElse         45.026667   64.859379
Health/Beauty          10.199697    9.922211
Home/Garden            30.356250   78.543460
Jewelry                30.271364  111.316375
Music/Movie/Game        8.363206    9.836130
Photography           447.956667  453.364402
Pottery/Glass          64.574286   61.948767
SportingGoods          74.090282   87.198602
Toys/Hobbies           23.030650   67.269334
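Since the question also asks for something that works for any other file, here is a rough sketch of one way to pick the statistics from each column's dtype instead of listing column names by hand. It is only an illustration under some assumptions: the function name `summarize` and the small sample data are made up, non-numeric columns are treated as categorical, float columns as continuous, and integer columns as discrete. An ordinal characteristic stored as integers would therefore be treated as discrete numerical data, and an integer column with missing values is read as float by pandas, so a real file may need extra handling.

```python
import io

import pandas as pd
from pandas.api import types as ptypes


def iqr(series):
    """Interquartile range: third quartile minus first quartile."""
    return series.quantile(0.75) - series.quantile(0.25)


def summarize(data):
    """Return {column: {statistic: value}}, chosen by each column's dtype."""
    summary = {}
    for name in data.columns:
        column = data[name].dropna()
        if ptypes.is_bool_dtype(column) or not ptypes.is_numeric_dtype(column):
            # categorical (nominal or ordinal): report the mode(s)
            summary[name] = {'mode': column.mode().tolist()}
        elif ptypes.is_float_dtype(column):
            # continuous: average and standard deviation
            summary[name] = {'mean': column.mean(), 'std': column.std()}
        else:
            # discrete (integer): median and interquartile range
            summary[name] = {'median': column.median(), 'iqr': iqr(column)}
    return summary


# tiny made-up sample in the same ';'-separated, ','-decimal layout
sample = io.StringIO(
    'Category;Duration;ClosePrice\n'
    'Books;5;10,5\n'
    'Books;7;20,5\n'
    'Music;7;30,5\n'
    'Books;10;40,5\n')
data = pd.read_csv(sample, delimiter=';', decimal=',')
summary = summarize(data)
print(summary)
```

To run it on a real file, pass the file name to `pd.read_csv` instead of the `sample` object. Unlike the grouped version above, this summarizes whole columns rather than one result per category.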
answered 2 days ago by Poolka