Calculating ordinal/nominal, etcetera [on hold]












-1














I've recently begun a new study but joined the training a little later than the others, which is why I'm struggling a little with Python. I was wondering if any of you could give me some pointers for the following assignment.



The program I'm supposed to set up has to calculate the following:
- For every categorical (nominal or ordinal) characteristic: the mode.
- For every numerical characteristic:
x The average and standard deviation (only when it's continuous data (float))
x The median and interquartile range (only when it's discrete data (integer))



I've attached the file where we're supposed to import the data from. Any pointers you guys have are welcome. Also I hope the assignment is clear enough because I had to translate it from Dutch to English.. :)



Here is a link to the file I'm supposed to import the data from at the moment:
https://www.dropbox.com/s/67dqpvlw1qgt5di/Week7.txt?dl=0
In the end it's also supposed to work for any other file, though, so I'm a little confused about how to go about that.



Thanks in advance! :)



















New contributor




Lisa.P1108 is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











put on hold as unclear what you're asking by glibdud, usr2564301, pault, Nate, Tomasz Mularczyk 2 days ago


Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.











  • 1




    Hello and Welcome to StackOverflow! :-) Please have a look at How do I ask a good question? As of now, your question is really too broad to help you. Split the problem into smaller parts (like open a file, read lines, split lines, various calculations and so on).
    – Jayjayyy
    2 days ago










  • Furthermore, please always copy and paste some code to show what you have already tried. :-) If you don't know how to properly format your code, have a look at Editing help which shows how to use Markdown here on StackOverflow.
    – Jayjayyy
    2 days ago










  • Lastly, you have tagged your question with both Python versions 2 and 3. Which one is it?
    – Jayjayyy
    2 days ago























python














edited 2 days ago by glibdud

asked 2 days ago by Lisa.P1108
















1 Answer


















0














This is how I would approach the problem with Python 3 and pandas:



import pandas as pd

# function to compute the interquartile range
def iqr(series):
    '''
    Interquartile range according to Wikipedia
    https://en.wikipedia.org/wiki/Interquartile_range
    '''
    q_1 = series.quantile(0.25)
    q_3 = series.quantile(0.75)
    return q_3 - q_1

# load data from text file (semicolon-separated, decimal commas)
data = pd.read_csv(
    'Week7.txt',
    delimiter=';',
    decimal=',')

# define what columns should be treated and what we want to get from data
treatment = {
    'sellerRating': ['mean', 'std', 'median', iqr],
    'Duration': ['mean', 'std', 'median', iqr],
    'ClosePrice': ['mean', 'std'],
    'OpenPrice': ['mean', 'std']}

# calculations - a single line with pandas
result = data.groupby('Category').agg(treatment)


To see the actual result you may use, for example, print(result.loc[:, 'ClosePrice']) which gives:



                            mean         std
Category
Antique/Art/Craft      14.854783   28.759996
Automotive            108.915816  196.991289
Books                  36.636818   97.639969
Business/Industrial    16.514000   18.968667
Clothing/Accessories   35.880678   48.256547
Coins/Stamps           30.408182   48.803024
Collectibles           41.546339  121.047183
Computer               73.247143  119.775838
Electronics            74.777273   81.107397
EverythingElse         45.026667   64.859379
Health/Beauty          10.199697    9.922211
Home/Garden            30.356250   78.543460
Jewelry                30.271364  111.316375
Music/Movie/Game        8.363206    9.836130
Photography           447.956667  453.364402
Pottery/Glass          64.574286   61.948767
SportingGoods          74.090282   87.198602
Toys/Hobbies           23.030650   67.269334
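
The question also asks for something that works on any file and picks the statistic from the kind of data in each column. A minimal sketch of that idea, using only the standard library, might look like the following. It assumes a semicolon-delimited file with a header row and decimal commas (the same assumptions as the read_csv call above), and the helper names parse_value, summarize_column and summarize_file are made up for illustration. Quartile conventions differ between tools, so the interquartile range computed here is not guaranteed to match other software.

```python
import csv
import statistics


def parse_value(text):
    """Convert a cell to int, then float, else keep it as a string.

    Assumption: decimal commas, as in decimal=',' above.
    """
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text.replace(',', '.'))
    except ValueError:
        return text


def summarize_column(values):
    """Choose statistics based on the kind of values in the column.

    Needs at least two values for the numeric branches.
    """
    if all(isinstance(v, int) for v in values):
        # Discrete data: median and interquartile range.
        q1, _, q3 = statistics.quantiles(values, n=4, method='inclusive')
        return {'median': statistics.median(values), 'iqr': q3 - q1}
    if all(isinstance(v, (int, float)) for v in values):
        # Continuous data: mean and sample standard deviation.
        return {'mean': statistics.mean(values),
                'std': statistics.stdev(values)}
    # Anything else is treated as categorical: most frequent value.
    return {'mode': statistics.mode(values)}


def summarize_file(path, delimiter=';'):
    """Summarize every column of a delimited text file with a header row."""
    with open(path, newline='') as handle:
        rows = list(csv.DictReader(handle, delimiter=delimiter))
    return {
        name: summarize_column([parse_value(row[name]) for row in rows])
        for name in rows[0]
    }
```

Calling summarize_file('Week7.txt') would then return one small dictionary per column. Note that a column is only treated as discrete when every value parses as an integer, so a price column that happens to contain only whole numbers would get the median and interquartile range rather than the mean.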















answered 2 days ago by Poolka














