Why does the normal equation give wrong results with a certain set of features?

My normal equation implementation gives very accurate results for one-dimensional problems and for some multidimensional configurations, high degrees included, but it only works when features are not multiplied by each other. As soon as I add products of features, the results are wrong.

The number of features is inferred automatically from the input. The dimension is set manually. When the dimension is set higher than that of the estimated function, the redundant weights become very small, so they do not affect the final solution.

Example:

F1. x1 + x1^2 + x1^3 - solved properly

F2. x1 + x2 + x3 - solved properly

F3. x1 + x2 + x3 + x1^2 + x2^2 + x3^2 - solved properly

F4. x1 + x2 + x3 + x1^2 + x2^2 + x3^2 + x1*x2 + x2*x3 + x1*x3 - wrong results
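To make the four configurations concrete, here is a minimal sketch of how one input row might be expanded into a row of the design matrix. The helper name `expand_features` and its parameters are hypothetical; the code that actually builds `train_set_x` is not part of this post, so the column order (a leading 1 for the threshold, then powers, then pairwise products) is an assumption.

```python
from itertools import combinations


def expand_features(row, degree=1, cross_terms=False):
    """Expand one input row into a design-matrix row (hypothetical helper).

    Assumed column order: 1 for the threshold, then x_i, x_i^2, ... up to
    `degree`, then the pairwise products x_i * x_j if `cross_terms` is set.
    """
    out = [1.0]
    for power in range(1, degree + 1):
        out.extend(float(x) ** power for x in row)
    if cross_terms:
        out.extend(float(a) * float(b) for a, b in combinations(row, 2))
    return out


row = [1, 2, 2]
f2_row = expand_features(row)                              # threshold + 3 linear terms
f3_row = expand_features(row, degree=2)                    # ... + 3 squared terms
f4_row = expand_features(row, degree=2, cross_terms=True)  # ... + 3 products
print(len(f2_row), len(f3_row), len(f4_row))
```

F1 would correspond to `expand_features([x1], degree=3)` in this sketch. Note that `combinations` yields the products in the order x1*x2, x1*x3, x2*x3, which differs slightly from the order written in F4.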

Important:

If I solve F2 with the normal equation configured for a higher dimension, as in F3, the solution is still accurate, because the unnecessary weights come out very small, about 10^-12.

But when I use the F4 configuration on the F3 problem, the redundant weights are not driven toward zero as I would expect. All the weights are very large and do not match the solution.

My first guess is that the columns then depend on each other and that this somehow disrupts the normal equation. Unfortunately, I do not see any other way to solve for such functions.

The weights are obtained with the following equation. I know I could use numpy, but I needed to do it another way. It should not matter anyway, since the solutions found for the first three examples are perfect.

weights = matmult(getMatrixInverse(matmult(transposed_x, train_set_x)),
                  matmult(transposed_x, expected_outputs))
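For reference, a self-contained pure-Python sketch of this computation could look as follows. The bodies of `transpose`, `matmult` and `getMatrixInverse` here are stand-ins written for illustration (a Gauss-Jordan inverse with partial pivoting), not the original implementations, which are not shown. The data are the four sample rows and outputs listed further down in this post, laid out for the linear (F2) configuration with a leading 1 for the threshold.

```python
def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmult(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]


def getMatrixInverse(m, tol=1e-12):
    """Gauss-Jordan inverse with partial pivoting (illustrative stand-in).

    Raises ValueError when a pivot is smaller than `tol`, i.e. when the
    matrix is singular to working precision.
    """
    n = len(m)
    # Augment the matrix with the identity: [m | I].
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tol:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Sample rows from this post, with a leading 1 for the threshold (F2 layout).
train_set_x = [[1, 1, 2, 2], [1, 4, 2, 3], [1, 5, 3, 3], [1, 6, 4, 6]]
expected_outputs = [[22.28], [37.97], [49.17], [66.94]]

transposed_x = transpose(train_set_x)
weights = matmult(getMatrixInverse(matmult(transposed_x, train_set_x)),
                  matmult(transposed_x, expected_outputs))
print(weights)  # close to [[0], [4.5], [6.7], [2.19]], up to rounding error
```

Raising on a tiny pivot is a design choice of this sketch: an inverse routine that silently divides by a near-zero pivot returns huge, meaningless numbers instead of signalling that the matrix cannot be inverted.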

I have prepared some example data:

x1 x2 x3 output
1 2 2 22.28
4 2 3 37.97
5 3 3 49.17
6 4 6 66.94

The last number in each row is the correct output: the sum of the inputs multiplied by their weights. The data were generated with the following parameters (the first value is the threshold, which is not associated with any feature; it is 0 in this example):
0 4.5 6.7 2.19
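As a sanity check, the listed outputs can be recomputed from these parameters. This is only a small sketch; the variable names `params` and `samples` are introduced here for illustration.

```python
# Parameters from the post: threshold, then the weights for x1, x2, x3.
params = [0, 4.5, 6.7, 2.19]

# (inputs, listed output) pairs copied from the sample data above.
samples = [
    ([1, 2, 2], 22.28),
    ([4, 2, 3], 37.97),
    ([5, 3, 3], 49.17),
    ([6, 4, 6], 66.94),
]

predictions = []
for inputs, listed in samples:
    # Threshold plus the weighted sum of the inputs.
    predicted = params[0] + sum(w * x for w, x in zip(params[1:], inputs))
    predictions.append(predicted)
    print(inputs, round(predicted, 2), listed)
```

Each recomputed value agrees with the listed output, so the sample data really are a purely linear function of x1, x2 and x3.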

In the F2 configuration (threshold plus three linear weights), where the fit is correct, the weights are approximately:
[[0.0], [4.499999999999982], [6.699999999999999], [2.190000000000026]]

The first value is the threshold, which is not associated with any feature. The next three values are the feature weights. The weighted sum always gives the correct output.

In the F3 configuration (seven weights), where the fit is also correct, the weights are approximately:
[[-9.094947017729282e-13], [4.500000000000059], [6.70000000000141], [2.1899999999996], [-1.8207657603852567e-14], [-3.410605131648481e-13], [-5.684341886080802e-14]]

Again the first value is the threshold; since the function does not use it, it comes out very small. The next three values are the correct weights, and the remaining three are redundant weights, close to zero as expected. The weighted sum always gives the correct output.

In the F4 configuration (ten weights):

[[5924.943766734235], [-1322.797113801977], [-953.6679353156687], [-1461.4574242834683], [50.30783678925443], [36.12622388603999], [113.07309358853317], [65.41037286567017], [107.23349570429868], [105.6503964716471]]

Here even the threshold is huge, as are the redundant weights. The necessary weights are far too large as well, and the weighted sum is not even close to the correct output.
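One way to test the column-dependence idea is to compare the rank of the design matrix with its number of columns. The sketch below does this in pure Python for the four sample rows only; the full training set is not shown in this post and may be larger, so this is an illustration of the check rather than a diagnosis. The helpers `f4_row` and `matrix_rank` are hypothetical names, and the column order is an assumption.

```python
from itertools import combinations


def f4_row(x):
    """Threshold, linear, squared and pairwise-product columns (assumed order)."""
    return ([1.0] + [float(v) for v in x] + [float(v) ** 2 for v in x]
            + [float(a) * float(b) for a, b in combinations(x, 2)])


def matrix_rank(matrix, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting (illustrative)."""
    m = [list(map(float, r)) for r in matrix]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


inputs = [[1, 2, 2], [4, 2, 3], [5, 3, 3], [6, 4, 6]]
design = [f4_row(x) for x in inputs]
n_rows, n_cols = len(design), len(design[0])
rank = matrix_rank(design)
print(f"rows={n_rows} cols={n_cols} rank={rank}")
if rank < n_cols:
    print("X^T X is singular: the normal equation has no unique solution")
```

The rank can never exceed the number of rows, so with four rows and ten columns the product of the transposed matrix with the matrix itself cannot be inverted exactly. Whether that is what happens with the real training set depends on how many independent rows it contains.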

What may cause this problem? How can I use the normal equation to approximate functions that contain products of features?

asked Jan 3 at 21:18

Manaslu

247

python machine-learning linear-regression
