New column based off certain input parameter to select what columns to use - Python





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







0















Have a pandas dataframe that includes multiple columns of monthly finance data. I have an input of period that is specified by the person running the program. It's currently just saved as period like shown below within the code.



#coded into python
period = ?? (user adds this in from input screen)


I need to create another column of data that uses the input period number to perform a calculation of other columns.



enter image description here



So, in the above table I'd like to create a new column 'calculation' that depends on the period input. For example, if a period of 1 was used the following calc1 would be completed (with math actually done). Period = 2 - then calc2. Period = 3 - then calc3. I only need one column calculated depending on the period number but added three examples in below picture for example of how it'd work.



enter image description here



I can do this in SQL using case when. So using the input period then sum what columns I need to.



select  Account #,
'&Period' AS Period,
'&Year' AS YR,
case
When '&Period' = '1' then sum(d_cf+d_1)
when '&Period' = '2' then sum(d_cf+d_1+d_2)
when '&Period' = '3' then sum(d_cf+d_1+d_2+d_3)


I am unsure on how to do this easily in python (newer learner). Yes, I could create a column that does each calculation via new column for every possible period (1-12), and then only select that column but I'd like to learn and do it a more efficient way.



Can you help more or point me in a better direction?










share|improve this question





























    0















    Have a pandas dataframe that includes multiple columns of monthly finance data. I have an input of period that is specified by the person running the program. It's currently just saved as period like shown below within the code.



    #coded into python
    period = ?? (user adds this in from input screen)


    I need to create another column of data that uses the input period number to perform a calculation of other columns.



    enter image description here



    So, in the above table I'd like to create a new column 'calculation' that depends on the period input. For example, if a period of 1 was used the following calc1 would be completed (with math actually done). Period = 2 - then calc2. Period = 3 - then calc3. I only need one column calculated depending on the period number but added three examples in below picture for example of how it'd work.



    enter image description here



    I can do this in SQL using case when. So using the input period then sum what columns I need to.



    select  Account #,
    '&Period' AS Period,
    '&Year' AS YR,
    case
    When '&Period' = '1' then sum(d_cf+d_1)
    when '&Period' = '2' then sum(d_cf+d_1+d_2)
    when '&Period' = '3' then sum(d_cf+d_1+d_2+d_3)


    I am unsure on how to do this easily in python (newer learner). Yes, I could create a column that does each calculation via new column for every possible period (1-12), and then only select that column but I'd like to learn and do it a more efficient way.



    Can you help more or point me in a better direction?










    share|improve this question

























      0












      0








      0








      Have a pandas dataframe that includes multiple columns of monthly finance data. I have an input of period that is specified by the person running the program. It's currently just saved as period like shown below within the code.



      #coded into python
      period = ?? (user adds this in from input screen)


      I need to create another column of data that uses the input period number to perform a calculation of other columns.



      enter image description here



      So, in the above table I'd like to create a new column 'calculation' that depends on the period input. For example, if a period of 1 was used the following calc1 would be completed (with math actually done). Period = 2 - then calc2. Period = 3 - then calc3. I only need one column calculated depending on the period number but added three examples in below picture for example of how it'd work.



      enter image description here



      I can do this in SQL using case when. So using the input period then sum what columns I need to.



      select  Account #,
      '&Period' AS Period,
      '&Year' AS YR,
      case
      When '&Period' = '1' then sum(d_cf+d_1)
      when '&Period' = '2' then sum(d_cf+d_1+d_2)
      when '&Period' = '3' then sum(d_cf+d_1+d_2+d_3)


      I am unsure on how to do this easily in python (newer learner). Yes, I could create a column that does each calculation via new column for every possible period (1-12), and then only select that column but I'd like to learn and do it a more efficient way.



      Can you help more or point me in a better direction?










      share|improve this question














      Have a pandas dataframe that includes multiple columns of monthly finance data. I have an input of period that is specified by the person running the program. It's currently just saved as period like shown below within the code.



      #coded into python
      period = ?? (user adds this in from input screen)


      I need to create another column of data that uses the input period number to perform a calculation of other columns.



      enter image description here



      So, in the above table I'd like to create a new column 'calculation' that depends on the period input. For example, if a period of 1 was used the following calc1 would be completed (with math actually done). Period = 2 - then calc2. Period = 3 - then calc3. I only need one column calculated depending on the period number but added three examples in below picture for example of how it'd work.



      enter image description here



      I can do this in SQL using case when. So using the input period then sum what columns I need to.



      select  Account #,
      '&Period' AS Period,
      '&Year' AS YR,
      case
      When '&Period' = '1' then sum(d_cf+d_1)
      when '&Period' = '2' then sum(d_cf+d_1+d_2)
      when '&Period' = '3' then sum(d_cf+d_1+d_2+d_3)


      I am unsure on how to do this easily in python (newer learner). Yes, I could create a column that does each calculation via new column for every possible period (1-12), and then only select that column but I'd like to learn and do it a more efficient way.



      Can you help more or point me in a better direction?







      python python-3.x pandas






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Jan 4 at 18:40









      rgh_dsargh_dsa

      476




      476
























          3 Answers
          3






          active

          oldest

          votes


















          1














          You could certainly do something like



          df[['d_cf'] + [f'd_{i}' for i in range(1, period+1)]].sum(axis=1)





          share|improve this answer































            1














            You can do this using a simple function in python:



            def get_calculation(df, period=NULL): 

            '''
            df = pandas data frame
            period = integer type
            '''

            if period == 1:
            return df.apply(lambda x: x['d_0'] +x['d_1'], axis=1)

            if period == 2:
            return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'], axis=1)

            if period == 3:
            return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'] + x['d_3'], axis=1)

            new_df = get_calculation(df, period = 1)


            Setup:



            df = pd.DataFrame({'d_0':list(range(1,7)),
            'd_1': list(range(10,70,10)),
            'd_2':list(range(100,700,100)),
            'd_3': list(range(1000,7000,1000))})





            share|improve this answer































              1














              Setup:



              import pandas as pd

              ddict = {
              'Year':['2018','2018','2018','2018','2018',],
              'Account_Num':['1111','1122','1133','1144','1155'],
              'd_cf':['1','2','3','4','5'],
              }

              data = pd.DataFrame(ddict)


              Create value calculator:



              def get_calcs(period):
              # Convert period to integer
              s = str(period)

              # Convert to string value
              n = int(period) + 1

              # This will repeat the period number by the value of the period number
              return ''.join([i * n for i in s])


              Main function copies data frame, iterates through period values, and sets calculated values to the correct spot index-wise for each relevant column:



              def process_data(data_frame=data, period_column='d_cf'):
              # Copy data_frame argument
              df = data_frame.copy(deep=True)

              # Run through each value in our period column
              for i in df[period_column].values.tolist():

              # Create a temporary column
              new_column = 'd_{}'.format(i)

              # Pass the period into our calculator; Capture the result
              calculated_value = get_calcs(i)

              # Create a new column based on our period number
              df[new_column] = ''

              # Use indexing to place the calculated value into our desired location
              df.loc[df[period_column] == i, new_column] = calculated_value

              # Return the result
              return df


              Start:



                 Year Account_Num d_cf
              0 2018 1111 1
              1 2018 1122 2
              2 2018 1133 3
              3 2018 1144 4
              4 2018 1155 5


              Result:



              process_data(data)

              Year Account_Num d_cf d_1 d_2 d_3 d_4 d_5
              0 2018 1111 1 11
              1 2018 1122 2 222
              2 2018 1133 3 3333
              3 2018 1144 4 44444
              4 2018 1155 5 555555





              share|improve this answer


























                Your Answer






                StackExchange.ifUsing("editor", function () {
                StackExchange.using("externalEditor", function () {
                StackExchange.using("snippets", function () {
                StackExchange.snippets.init();
                });
                });
                }, "code-snippets");

                StackExchange.ready(function() {
                var channelOptions = {
                tags: "".split(" "),
                id: "1"
                };
                initTagRenderer("".split(" "), "".split(" "), channelOptions);

                StackExchange.using("externalEditor", function() {
                // Have to fire editor after snippets, if snippets enabled
                if (StackExchange.settings.snippets.snippetsEnabled) {
                StackExchange.using("snippets", function() {
                createEditor();
                });
                }
                else {
                createEditor();
                }
                });

                function createEditor() {
                StackExchange.prepareEditor({
                heartbeatType: 'answer',
                autoActivateHeartbeat: false,
                convertImagesToLinks: true,
                noModals: true,
                showLowRepImageUploadWarning: true,
                reputationToPostImages: 10,
                bindNavPrevention: true,
                postfix: "",
                imageUploader: {
                brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                allowUrls: true
                },
                onDemand: true,
                discardSelector: ".discard-answer"
                ,immediatelyShowMarkdownHelp:true
                });


                }
                });














                draft saved

                draft discarded


















                StackExchange.ready(
                function () {
                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54044392%2fnew-column-based-off-certain-input-parameter-to-select-what-columns-to-use-pyt%23new-answer', 'question_page');
                }
                );

                Post as a guest















                Required, but never shown

























                3 Answers
                3






                active

                oldest

                votes








                3 Answers
                3






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes









                1














                You could certainly do something like



                df[['d_cf'] + [f'd_{i}' for i in range(1, period+1)]].sum(axis=1)





                share|improve this answer




























                  1














                  You could certainly do something like



                  df[['d_cf'] + [f'd_{i}' for i in range(1, period+1)]].sum(axis=1)





                  share|improve this answer


























                    1












                    1








                    1







                    You could certainly do something like



                    df[['d_cf'] + [f'd_{i}' for i in range(1, period+1)]].sum(axis=1)





                    share|improve this answer













                    You could certainly do something like



                    df[['d_cf'] + [f'd_{i}' for i in range(1, period+1)]].sum(axis=1)






                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered Jan 4 at 19:07









                    N.ClarkeN.Clarke

                    1985




                    1985

























                        1














                        You can do this using a simple function in python:



                        def get_calculation(df, period=NULL): 

                        '''
                        df = pandas data frame
                        period = integer type
                        '''

                        if period == 1:
                        return df.apply(lambda x: x['d_0'] +x['d_1'], axis=1)

                        if period == 2:
                        return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'], axis=1)

                        if period == 3:
                        return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'] + x['d_3'], axis=1)

                        new_df = get_calculation(df, period = 1)


                        Setup:



                        df = pd.DataFrame({'d_0':list(range(1,7)),
                        'd_1': list(range(10,70,10)),
                        'd_2':list(range(100,700,100)),
                        'd_3': list(range(1000,7000,1000))})





                        share|improve this answer




























                          1














                          You can do this using a simple function in python:



                          def get_calculation(df, period=NULL): 

                          '''
                          df = pandas data frame
                          period = integer type
                          '''

                          if period == 1:
                          return df.apply(lambda x: x['d_0'] +x['d_1'], axis=1)

                          if period == 2:
                          return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'], axis=1)

                          if period == 3:
                          return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'] + x['d_3'], axis=1)

                          new_df = get_calculation(df, period = 1)


                          Setup:



                          df = pd.DataFrame({'d_0':list(range(1,7)),
                          'd_1': list(range(10,70,10)),
                          'd_2':list(range(100,700,100)),
                          'd_3': list(range(1000,7000,1000))})





                          share|improve this answer


























                            1












                            1








                            1







                            You can do this using a simple function in python:



                            def get_calculation(df, period=NULL): 

                            '''
                            df = pandas data frame
                            period = integer type
                            '''

                            if period == 1:
                            return df.apply(lambda x: x['d_0'] +x['d_1'], axis=1)

                            if period == 2:
                            return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'], axis=1)

                            if period == 3:
                            return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'] + x['d_3'], axis=1)

                            new_df = get_calculation(df, period = 1)


                            Setup:



                            df = pd.DataFrame({'d_0':list(range(1,7)),
                            'd_1': list(range(10,70,10)),
                            'd_2':list(range(100,700,100)),
                            'd_3': list(range(1000,7000,1000))})





                            share|improve this answer













                            You can do this using a simple function in python:



                            def get_calculation(df, period=NULL): 

                            '''
                            df = pandas data frame
                            period = integer type
                            '''

                            if period == 1:
                            return df.apply(lambda x: x['d_0'] +x['d_1'], axis=1)

                            if period == 2:
                            return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'], axis=1)

                            if period == 3:
                            return df.apply(lambda x: x['d_0'] +x['d_1']+ x['d_2'] + x['d_3'], axis=1)

                            new_df = get_calculation(df, period = 1)


                            Setup:



                            df = pd.DataFrame({'d_0':list(range(1,7)),
                            'd_1': list(range(10,70,10)),
                            'd_2':list(range(100,700,100)),
                            'd_3': list(range(1000,7000,1000))})






                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Jan 4 at 19:06









                            YOLOYOLO

                            5,5731525




                            5,5731525























                                1














                                Setup:



                                import pandas as pd

                                ddict = {
                                'Year':['2018','2018','2018','2018','2018',],
                                'Account_Num':['1111','1122','1133','1144','1155'],
                                'd_cf':['1','2','3','4','5'],
                                }

                                data = pd.DataFrame(ddict)


                                Create value calculator:



                                def get_calcs(period):
                                # Convert period to integer
                                s = str(period)

                                # Convert to string value
                                n = int(period) + 1

                                # This will repeat the period number by the value of the period number
                                return ''.join([i * n for i in s])


                                Main function copies data frame, iterates through period values, and sets calculated values to the correct spot index-wise for each relevant column:



                                def process_data(data_frame=data, period_column='d_cf'):
                                # Copy data_frame argument
                                df = data_frame.copy(deep=True)

                                # Run through each value in our period column
                                for i in df[period_column].values.tolist():

                                # Create a temporary column
                                new_column = 'd_{}'.format(i)

                                # Pass the period into our calculator; Capture the result
                                calculated_value = get_calcs(i)

                                # Create a new column based on our period number
                                df[new_column] = ''

                                # Use indexing to place the calculated value into our desired location
                                df.loc[df[period_column] == i, new_column] = calculated_value

                                # Return the result
                                return df


                                Start:



                                   Year Account_Num d_cf
                                0 2018 1111 1
                                1 2018 1122 2
                                2 2018 1133 3
                                3 2018 1144 4
                                4 2018 1155 5


                                Result:



                                process_data(data)

                                Year Account_Num d_cf d_1 d_2 d_3 d_4 d_5
                                0 2018 1111 1 11
                                1 2018 1122 2 222
                                2 2018 1133 3 3333
                                3 2018 1144 4 44444
                                4 2018 1155 5 555555





                                share|improve this answer






























                                  1














                                  Setup:



                                  import pandas as pd

                                  ddict = {
                                  'Year':['2018','2018','2018','2018','2018',],
                                  'Account_Num':['1111','1122','1133','1144','1155'],
                                  'd_cf':['1','2','3','4','5'],
                                  }

                                  data = pd.DataFrame(ddict)


                                  Create value calculator:



                                  def get_calcs(period):
                                  # Convert period to integer
                                  s = str(period)

                                  # Convert to string value
                                  n = int(period) + 1

                                  # This will repeat the period number by the value of the period number
                                  return ''.join([i * n for i in s])


                                  Main function copies data frame, iterates through period values, and sets calculated values to the correct spot index-wise for each relevant column:



                                  def process_data(data_frame=data, period_column='d_cf'):
                                  # Copy data_frame argument
                                  df = data_frame.copy(deep=True)

                                  # Run through each value in our period column
                                  for i in df[period_column].values.tolist():

                                  # Create a temporary column
                                  new_column = 'd_{}'.format(i)

                                  # Pass the period into our calculator; Capture the result
                                  calculated_value = get_calcs(i)

                                  # Create a new column based on our period number
                                  df[new_column] = ''

                                  # Use indexing to place the calculated value into our desired location
                                  df.loc[df[period_column] == i, new_column] = calculated_value

                                  # Return the result
                                  return df


                                  Start:



                                     Year Account_Num d_cf
                                  0 2018 1111 1
                                  1 2018 1122 2
                                  2 2018 1133 3
                                  3 2018 1144 4
                                  4 2018 1155 5


                                  Result:



                                  process_data(data)

                                  Year Account_Num d_cf d_1 d_2 d_3 d_4 d_5
                                  0 2018 1111 1 11
                                  1 2018 1122 2 222
                                  2 2018 1133 3 3333
                                  3 2018 1144 4 44444
                                  4 2018 1155 5 555555





                                  share|improve this answer




























                                    1












                                    1








                                    1







                                    Setup:



                                    import pandas as pd

                                    ddict = {
                                    'Year':['2018','2018','2018','2018','2018',],
                                    'Account_Num':['1111','1122','1133','1144','1155'],
                                    'd_cf':['1','2','3','4','5'],
                                    }

                                    data = pd.DataFrame(ddict)


                                    Create value calculator:



                                    def get_calcs(period):
                                    # Convert period to integer
                                    s = str(period)

                                    # Convert to string value
                                    n = int(period) + 1

                                    # This will repeat the period number by the value of the period number
                                    return ''.join([i * n for i in s])


                                    Main function copies data frame, iterates through period values, and sets calculated values to the correct spot index-wise for each relevant column:



                                    def process_data(data_frame=data, period_column='d_cf'):
                                    # Copy data_frame argument
                                    df = data_frame.copy(deep=True)

                                    # Run through each value in our period column
                                    for i in df[period_column].values.tolist():

                                    # Create a temporary column
                                    new_column = 'd_{}'.format(i)

                                    # Pass the period into our calculator; Capture the result
                                    calculated_value = get_calcs(i)

                                    # Create a new column based on our period number
                                    df[new_column] = ''

                                    # Use indexing to place the calculated value into our desired location
                                    df.loc[df[period_column] == i, new_column] = calculated_value

                                    # Return the result
                                    return df


                                    Start:



                                       Year Account_Num d_cf
                                    0 2018 1111 1
                                    1 2018 1122 2
                                    2 2018 1133 3
                                    3 2018 1144 4
                                    4 2018 1155 5


                                    Result:



                                    process_data(data)

                                    Year Account_Num d_cf d_1 d_2 d_3 d_4 d_5
                                    0 2018 1111 1 11
                                    1 2018 1122 2 222
                                    2 2018 1133 3 3333
                                    3 2018 1144 4 44444
                                    4 2018 1155 5 555555





                                    share|improve this answer















                                    Setup:



                                    import pandas as pd

                                    ddict = {
                                    'Year':['2018','2018','2018','2018','2018',],
                                    'Account_Num':['1111','1122','1133','1144','1155'],
                                    'd_cf':['1','2','3','4','5'],
                                    }

                                    data = pd.DataFrame(ddict)


                                    Create value calculator:



                                    def get_calcs(period):
                                    # Convert period to integer
                                    s = str(period)

                                    # Convert to string value
                                    n = int(period) + 1

                                    # This will repeat the period number by the value of the period number
                                    return ''.join([i * n for i in s])


                                    Main function copies data frame, iterates through period values, and sets calculated values to the correct spot index-wise for each relevant column:



                                    def process_data(data_frame=data, period_column='d_cf'):
                                    # Copy data_frame argument
                                    df = data_frame.copy(deep=True)

                                    # Run through each value in our period column
                                    for i in df[period_column].values.tolist():

                                    # Create a temporary column
                                    new_column = 'd_{}'.format(i)

                                    # Pass the period into our calculator; Capture the result
                                    calculated_value = get_calcs(i)

                                    # Create a new column based on our period number
                                    df[new_column] = ''

                                    # Use indexing to place the calculated value into our desired location
                                    df.loc[df[period_column] == i, new_column] = calculated_value

                                    # Return the result
                                    return df


                                    Start:



                                       Year Account_Num d_cf
                                    0 2018 1111 1
                                    1 2018 1122 2
                                    2 2018 1133 3
                                    3 2018 1144 4
                                    4 2018 1155 5


                                    Result:



                                    process_data(data)

                                    Year Account_Num d_cf d_1 d_2 d_3 d_4 d_5
                                    0 2018 1111 1 11
                                    1 2018 1122 2 222
                                    2 2018 1133 3 3333
                                    3 2018 1144 4 44444
                                    4 2018 1155 5 555555






                                    share|improve this answer














                                    share|improve this answer



                                    share|improve this answer








                                    edited Jan 4 at 19:38

























                                    answered Jan 4 at 19:26









                                    Mark MorettoMark Moretto

                                    38428




                                    38428






























                                        draft saved

                                        draft discarded




















































                                        Thanks for contributing an answer to Stack Overflow!


                                        • Please be sure to answer the question. Provide details and share your research!

                                        But avoid



                                        • Asking for help, clarification, or responding to other answers.

                                        • Making statements based on opinion; back them up with references or personal experience.


                                        To learn more, see our tips on writing great answers.




                                        draft saved


                                        draft discarded














                                        StackExchange.ready(
                                        function () {
                                        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54044392%2fnew-column-based-off-certain-input-parameter-to-select-what-columns-to-use-pyt%23new-answer', 'question_page');
                                        }
                                        );

                                        Post as a guest















                                        Required, but never shown





















































                                        Required, but never shown














                                        Required, but never shown












                                        Required, but never shown







                                        Required, but never shown

































                                        Required, but never shown














                                        Required, but never shown












                                        Required, but never shown







                                        Required, but never shown







                                        Popular posts from this blog

                                        Monofisismo

                                        Angular Downloading a file using contenturl with Basic Authentication

                                        Olmecas