Why does the normal equation give wrong results with a certain set of features?



























My normal equation implementation gives very accurate results for one-dimensional problems and for some multidimensional configurations, high degree included, but it only works when features are not multiplied by each other. As soon as I include products of features, the results are wrong.



The number of features is inferred automatically from the input. The dimension is set manually. When the dimension is set higher than that of the function being estimated, the redundant weights become very small, so they do not affect the final solution.



Examples:

F1. x1 + x1^2 + x1^3 - solved properly
F2. x1 + x2 + x3 - solved properly
F3. x1 + x2 + x3 + x1^2 + x2^2 + x3^2 - solved properly
F4. x1 + x2 + x3 + x1^2 + x2^2 + x3^2 + x1*x2 + x2*x3 + x1*x3 - wrong results
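The question does not show how the feature columns are built, so the following is only a minimal sketch of one way the configurations above could be expanded from a raw input row. The helper `expand_features` and its parameters `degree` and `cross_terms` are hypothetical names chosen for illustration, not taken from the original implementation.

```python
from itertools import combinations

def expand_features(row, degree=1, cross_terms=False):
    """Build one design-matrix row: threshold, powers, optional pairwise products."""
    features = [1.0]  # threshold column, not tied to any feature
    for power in range(1, degree + 1):
        features.extend(x ** power for x in row)
    if cross_terms:
        # Pairwise products in the order x1*x2, x1*x3, x2*x3
        features.extend(a * b for a, b in combinations(row, 2))
    return features

row = [1, 2, 2]
f2 = expand_features(row)                              # F2: threshold + 3 columns
f3 = expand_features(row, degree=2)                    # F3: threshold + 6 columns
f4 = expand_features(row, degree=2, cross_terms=True)  # F4: threshold + 9 columns
```

Note that the cross terms come out in a slightly different order than in the F4 list above; the order of the columns only changes the order of the resulting weights.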



Important: if I solve F2 with the normal equation configured for higher dimensions, as in F3, the solution is still accurate, because the unnecessary weights are very small, about 10^-12.

But when I use the F4 configuration on the F3 problem, the redundant weights are not minimized as I would expect. All the weights are very large and do not match the solution.



My first guess is that the columns then depend on each other and that this somehow disrupts the normal equation. Unfortunately, I do not see any other way to fit such functions.
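One way to test this guess is to compare the rank of the design matrix with the number of weights. The sketch below assumes that the four sample rows listed in this question are the entire training set, which the question does not state. `f4_row` and `matrix_rank` are hypothetical helpers written for illustration only.

```python
from itertools import combinations

def f4_row(x):
    """F4 design-matrix row: threshold, x_i, x_i^2 and pairwise products."""
    return ([1.0] + list(x) + [v ** 2 for v in x]
            + [a * b for a, b in combinations(x, 2)])

def matrix_rank(rows, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in r] for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

inputs = [[1, 2, 2], [4, 2, 3], [5, 3, 3], [6, 4, 6]]
X = [f4_row(x) for x in inputs]
n_weights = len(X[0])
rank = matrix_rank(X)
```

Under that assumption the F4 design matrix has 10 columns but only rank 4. Since X^T X has the same rank as X, the 10x10 matrix passed to the inverse would be singular, and any inverse computed for it would be dominated by rounding error. With more rows than weights and sufficiently varied inputs the outcome may be different.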



The weights are obtained with the following equation. I know I could use numpy, but I needed to do it another way. In any case, that should not matter, as the solutions found for the first three examples are perfect.



weights = matmult(getMatrixInverse(matmult(transposed_x, train_set_x)),
                  matmult(transposed_x, expected_outputs))
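The helpers `matmult` and `getMatrixInverse` are not shown in the question. As a rough, numpy-free sketch of the same computation, the version below solves the system (X^T X) w = X^T y by Gaussian elimination with partial pivoting instead of forming an explicit inverse. That is a swapped-in technique, not the original code, and `transpose`, `matmul`, `solve` and `normal_equation` are hypothetical names.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ w = b (b is a column vector) by Gaussian elimination."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i][0])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return [[v] for v in w]

def normal_equation(x, y):
    xt = transpose(x)
    return solve(matmul(xt, x), matmul(xt, y))

# Linear configuration (threshold, x1, x2, x3) on the sample rows of this question
X = [[1, 1, 2, 2], [1, 4, 2, 3], [1, 5, 3, 3], [1, 6, 4, 6]]
y = [[22.28], [37.97], [49.17], [66.94]]
weights = normal_equation(X, y)
```

For the linear configuration this should recover weights close to 0, 4.5, 6.7 and 2.19. The explicit singularity check is a design choice: it raises an error instead of silently returning very large weights when X^T X cannot be inverted reliably.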


I have prepared some example data:

x1  x2  x3  output
1   2   2   22.28
4   2   3   37.97
5   3   3   49.17
6   4   6   66.94

The last number in each row is the correct output: the sum of the products of the inputs and the weights. The data was generated with the following parameters (the first value is the threshold, which is not associated with any feature; it is 0 in this example):
0 4.5 6.7 2.19
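As a quick sanity check, the listed outputs can be reproduced from the parameters with a few lines of Python. The name `predict` is a hypothetical helper used only for this check.

```python
weights = [0.0, 4.5, 6.7, 2.19]  # threshold, then one weight per input
data = [
    ([1, 2, 2], 22.28),
    ([4, 2, 3], 37.97),
    ([5, 3, 3], 49.17),
    ([6, 4, 6], 66.94),
]

def predict(weights, xs):
    """Threshold plus the sum of products of inputs and weights."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], xs))

# Absolute difference between the recomputed and the listed output, per row
errors = [abs(predict(weights, xs) - target) for xs, target in data]
```

Every entry of `errors` stays at the level of floating-point rounding, so the four rows are consistent with a purely linear function of x1, x2 and x3.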



In the F2 configuration (threshold plus x1, x2, x3), where the fit is found properly, the weights are approximately:
[[0.0], [4.499999999999982], [6.699999999999999], [2.190000000000026]]

The first value is the threshold, which is not associated with any feature. The next three values are the proper weights. The sum of products always gives the correct output.

In the F3 configuration (with the squared terms added), where the fit is also found properly, the weights are approximately:
[[-9.094947017729282e-13], [4.500000000000059], [6.70000000000141], [2.1899999999996], [-1.8207657603852567e-14], [-3.410605131648481e-13], [-5.684341886080802e-14]]

Again the first value is the threshold. Because it is not used in the function, it is very small. The next three values are the proper weights, and the rest are redundant weights, minimized as expected. The sum of products always gives the correct output.

In the F4 configuration (with the products of features added):

[[5924.943766734235], [-1322.797113801977], [-953.6679353156687], [-1461.4574242834683], [50.30783678925443], [36.12622388603999], [113.07309358853317], [65.41037286567017], [107.23349570429868], [105.6503964716471]]

Here even the threshold is huge, not to mention the redundant weights. The necessary weights are also far too large, and the sum of products is not even close to the correct output.



What could cause this problem? How can I use the normal equation to approximate functions that contain products of features?
























































      python machine-learning linear-regression
















      asked Jan 3 at 21:18









      Manaslu












































