Why does the normal equation give wrong results with a certain set of features?
My normal equation implementation gives very accurate results for one-dimensional problems and for some multidimensional configurations, high degree included, but it only solves problems in which features are not multiplied by each other. As soon as I add products of features, the results are wrong.
The number of features is inferred automatically from the input. The dimension is set manually. When the dimension is set higher than in the function being estimated, the redundant weights become very small, so they do not affect the final solution.
Example:
F1. x1 + x1^2 + x1^3 - solved properly
F2. x1 + x2 + x3 - solved properly
F3. x1 + x2 + x3 + x1^2 + x2^2 + x3^2 - solved properly
F4. x1 + x2 + x3 + x1^2 + x2^2 + x3^2 + x1*x2 + x2*x3 + x1*x3 - wrong results
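For reference, the feature sets F2, F3 and F4 above can be sketched as rows of a design matrix. This is only an illustration: `expand_features` and its flags are hypothetical names, not the code used in the question, and the pairwise products come out in the order x1*x2, x1*x3, x2*x3, which differs slightly from the order written in F4.

```python
from itertools import combinations

def expand_features(row, squares=False, products=False):
    """Build one design-matrix row: [1, x1..xn, (squares), (pairwise products)].

    Hypothetical helper for illustration only.
    """
    out = [1.0] + list(row)  # leading 1.0 is the threshold (bias) column
    if squares:
        out += [x * x for x in row]
    if products:
        out += [a * b for a, b in combinations(row, 2)]
    return out

sample = [1, 2, 2]
f2_row = expand_features(sample)                               # x1, x2, x3
f3_row = expand_features(sample, squares=True)                 # plus squares
f4_row = expand_features(sample, squares=True, products=True)  # plus products
print(len(f2_row), len(f3_row), len(f4_row))  # 4 7 10
```

The column counts (4, 7 and 10, threshold included) match the lengths of the weight vectors reported further down.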
Important:
If I try to solve F2 with the normal equation configured for a higher dimension, as in F3, the solution is still accurate, because the unnecessary weights are very small, about 10^-12.
BUT
When I use the F4 configuration on an F3 problem, the redundant weights are not driven towards zero as I would expect. All the weights are very large and do not match the solution.
My first guess is that the columns then become dependent on each other and this somehow breaks the normal equation. Unfortunately, I do not see any other way to fit such functions.
The weights are obtained with the following equation. I know I could use numpy, but I needed to do it another way. It should not matter anyway, since the solutions found for the first three examples are perfect.
weights = matmult(getMatrixInverse(matmult(transposed_x, train_set_x)),
matmult(transposed_x, expected_outputs))
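A possible numpy-free cross-check of the expression above is sketched below. It is not the implementation from the question: `transpose`, `matmul` and `solve` are hypothetical stand-ins, and instead of forming the inverse explicitly it solves (X^T X) w = X^T y by Gaussian elimination with partial pivoting. The singularity threshold `1e-12` is an arbitrary assumption. The data are the four example rows from this question with only the threshold column and x1, x2, x3.

```python
def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ w = b (b is a column vector) by Gaussian elimination
    with partial pivoting, then back substitution."""
    n = len(a)
    aug = [list(a[i]) + [b[i][0]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # arbitrary threshold (assumption)
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return [[v] for v in w]

# Threshold column plus x1, x2, x3 for the four example rows.
X = [[1.0, 1, 2, 2], [1.0, 4, 2, 3], [1.0, 5, 3, 3], [1.0, 6, 4, 6]]
y = [[22.28], [37.97], [49.17], [66.94]]

Xt = transpose(X)
weights = solve(matmul(Xt, X), matmul(Xt, y))
print(weights)  # should be close to [[0], [4.5], [6.7], [2.19]]
```

Raising an error on a near-zero pivot makes a singular X^T X visible instead of silently returning huge weights.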
I have prepared some example data:
1 2 2 22.28
4 2 3 37.97
5 3 3 49.17
6 4 6 66.94
where the last number in each row is the correct output, i.e. the sum of the products of inputs and weights. The rows were generated with these parameters (the first value is the threshold, which is not associated with any feature; it is 0 in this example):
0 4.5 6.7 2.19
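The relationship between the example rows and these parameters can be checked with a few lines of Python. The variable names below are illustrative only.

```python
# Parameters: threshold first, then one weight per input.
params = [0.0, 4.5, 6.7, 2.19]

# (inputs, expected output) pairs copied from the example data.
rows = [
    ([1, 2, 2], 22.28),
    ([4, 2, 3], 37.97),
    ([5, 3, 3], 49.17),
    ([6, 4, 6], 66.94),
]

predictions = []
for inputs, expected in rows:
    # threshold + sum of input * weight
    predicted = params[0] + sum(w * x for w, x in zip(params[1:], inputs))
    predictions.append(predicted)
    print(inputs, round(predicted, 2), expected)
```

Each computed value agrees with the listed output up to floating-point rounding.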
In the F2 configuration (threshold plus x1, x2, x3), where the fit is correct, the weights are approximately:
[[0.0], [4.499999999999982], [6.699999999999999], [2.190000000000026]]
where the first value is the threshold, a value not associated with any feature.
The next 3 values are the proper weights. The sum of products always gives the correct output.
In the F3 configuration (squares added), where the fit is also correct, the weights are approximately:
[[-9.094947017729282e-13], [4.500000000000059], [6.70000000000141], [2.1899999999996], [-1.8207657603852567e-14], [-3.410605131648481e-13], [-5.684341886080802e-14]]
where the first value is again the threshold. As it is not used in the function, it is very small. The next 3 values are the proper weights, and the rest are redundant weights, close to zero as expected. The sum of products always gives the correct output.
In the F4 configuration (squares and pairwise products added):
[[5924.943766734235], [-1322.797113801977], [-953.6679353156687], [-1461.4574242834683], [50.30783678925443], [36.12622388603999], [113.07309358853317], [65.41037286567017], [107.23349570429868], [105.6503964716471]]
Here even the threshold is huge, not to mention the redundant weights. The weights that are actually needed are also far too large. The sum of products is not even close to the correct output.
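One way to test the dependent-columns guess is to compute the rank of the design matrix exactly. The sketch below assumes that only the four example rows shown above are used for training (the real training set may be larger) and builds the F4 feature set with ten columns. `f4_row` and `exact_rank` are hypothetical helpers; `Fraction` is used so that the rank is not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations

def f4_row(inputs):
    """[1, x1..xn, squares, pairwise products] for one sample (F4 feature set)."""
    return ([1] + list(inputs)
            + [x * x for x in inputs]
            + [a * b for a, b in combinations(inputs, 2)])

def exact_rank(rows):
    """Rank via Gaussian elimination over exact rationals."""
    m = [[Fraction(v) for v in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

samples = [[1, 2, 2], [4, 2, 3], [5, 3, 3], [6, 4, 6]]
design = [f4_row(s) for s in samples]
rank = exact_rank(design)
n_columns = len(design[0])
print(rank, n_columns)  # 4 10
```

The rank of a matrix cannot exceed its number of rows, so with four samples and ten columns the 10x10 product X^T X has rank at most 4 and has no inverse. Under that assumption, whatever an inversion routine returns for it is not meaningful.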
What could cause this problem? How can I use the normal equation to approximate functions that contain products of features?
python machine-learning linear-regression
asked Jan 3 at 21:18 by Manaslu (reputation 247)