Calculating ordinal/nominal, etcetera [on hold]

-1

I recently started a new course of study but joined the training a little later than the others, which is why I'm struggling a bit with Python. I was wondering if any of you could give me some pointers for the following assignment.

The program I'm supposed to write has to calculate the following:
- For every categorical (nominal or ordinal) characteristic: the mode.
- For every numerical characteristic:
  x The mean and standard deviation (only when it's continuous data (float))
  x The median and interquartile range (only when it's discrete data (integer))

I've attached the file we're supposed to import the data from. Any pointers you have are welcome. I also hope the assignment is clear enough, because I had to translate it from Dutch to English. :)

[Here is a link to the file I'm supposed to import the data from at the moment](https://www.dropbox.com/s/67dqpvlw1qgt5di/Week7.txt?dl=0). In the end it's also supposed to work for any other file, though, so I'm a little confused about how to go about that.

Thanks in advance! :)

edited 2 days ago

glibdud

5,48521730

asked 2 days ago

Lisa.P1108

New contributor

put on hold as unclear what you're asking by glibdud, usr2564301, pault, Nate, Tomasz Mularczyk 2 days ago

Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.

1

Hello and Welcome to StackOverflow! :-) Please have a look at How do I ask a good question? As of now, your question is really too broad to help you. Split the problem into smaller parts (like open a file, read lines, split lines, various calculations and so on).
– Jayjayyy
2 days ago

Furthermore, please always copy and paste some code to show what you have already tried. :-) If you don't know how to properly format your code, have a look at Editing help which shows how to use Markdown here on StackOverflow.
– Jayjayyy
2 days ago

Lastly, you have tagged your question with both Python versions 2 and 3. Which one is it?
– Jayjayyy
2 days ago

python


1 Answer
1


This is how I would approach the problem with Python 3 and pandas:

import pandas as pd


# function to compute the interquartile range
def iqr(series):
    '''
    Interquartile range according to Wikipedia
    https://en.wikipedia.org/wiki/Interquartile_range
    '''
    q_1 = series.quantile(0.25)
    q_3 = series.quantile(0.75)
    return q_3 - q_1


# load data from text file (';' separates fields, ',' is the decimal mark)
data = pd.read_csv(
    'Week7.txt',
    delimiter=';',
    decimal=',')

# define which columns should be treated and what we want to get from the data
treatment = {
    'sellerRating': ['mean', 'std', 'median', iqr],
    'Duration': ['mean', 'std', 'median', iqr],
    'ClosePrice': ['mean', 'std'],
    'OpenPrice': ['mean', 'std']}

# calculations - a single line with pandas
result = data.groupby('Category').agg(treatment)

To see the actual result you may use, for example, print(result.loc[:, 'ClosePrice']) which gives:

                            mean         std
Category                                    
Antique/Art/Craft      14.854783   28.759996
Automotive            108.915816  196.991289
Books                  36.636818   97.639969
Business/Industrial    16.514000   18.968667
Clothing/Accessories   35.880678   48.256547
Coins/Stamps           30.408182   48.803024
Collectibles           41.546339  121.047183
Computer               73.247143  119.775838
Electronics            74.777273   81.107397
EverythingElse         45.026667   64.859379
Health/Beauty          10.199697    9.922211
Home/Garden            30.356250   78.543460
Jewelry                30.271364  111.316375
Music/Movie/Game        8.363206    9.836130
Photography           447.956667  453.364402
Pottery/Glass          64.574286   61.948767
SportingGoods          74.090282   87.198602
Toys/Hobbies           23.030650   67.269334
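
Since the program is eventually supposed to work for any file, one possible variation is to pick the statistics from each column's dtype instead of hard-coding column names. The following is only a minimal sketch under some assumptions: the helper name `summarize` and the tiny sample data are made up for illustration, every non-numeric column is treated as categorical, and the file is assumed to use the same `;` separator and `,` decimal mark as Week7.txt.

```python
import io

import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype


def summarize(frame):
    """Return {column: {statistic: value}}, choosing statistics by dtype.

    Hypothetical helper, not part of pandas.
    """
    summary = {}
    for name, column in frame.items():
        if is_float_dtype(column):
            # continuous data (float): mean and standard deviation
            summary[name] = {'mean': column.mean(), 'std': column.std()}
        elif is_integer_dtype(column):
            # discrete data (integer): median and interquartile range
            q_1 = column.quantile(0.25)
            q_3 = column.quantile(0.75)
            summary[name] = {'median': column.median(), 'iqr': q_3 - q_1}
        else:
            # assumption: anything else is categorical, so report the mode(s)
            summary[name] = {'mode': column.mode().tolist()}
    return summary


# made-up sample in the assumed format; replace with a real file path
sample = io.StringIO(
    'Category;Duration;ClosePrice\n'
    'Books;5;1,5\n'
    'Books;7;2,5\n'
    'Toys;7;3,5\n'
    'Books;10;4,5\n'
)
data = pd.read_csv(sample, delimiter=';', decimal=',')
stats = summarize(data)
print(stats)
```

One caveat with this approach: an integer column that contains missing values is read by pandas as float, so it would get the mean and standard deviation rather than the median and interquartile range. Ordinal data stored as numbers would also be treated as numerical, so such columns may need to be listed by hand.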

answered 2 days ago

Poolka

1,314128

