How to plot a line on a scatter graph based on theta from regression in Python?
I am calculating the theta for my model like so:

theta = opt.fmin_cg(cost, initial_theta, gradient, (newX, y))

This works and gives me the following output:

Optimization terminated successfully.
         Current function value: 0.684355
         Iterations: 6
         Function evaluations: 15
         Gradient evaluations: 15

When I print theta, I get:

[ 0.         -0.28132729  0.158859  ]

I now want to plot this as a line on my scatter graph. My expected output looks like this: [expected output plot]

But when I try to draw the line using the model

weights * features = weight0 + weight1 * feature1 + weight2 * feature2

like so:

x_axis = np.array([min(newX[:, 1]), max(newX[:, 1])])
y_axis = x_axis * theta[1:]
ax.plot(x_axis, y_axis, linewidth=2)
plt.show()

the output looks like this: [actual output plot]

What should y_axis = x_axis * theta[1:] be to match the model?

Update: newX derives from my training data frame and is created like this:

newX = np.zeros(shape=(x.shape[0], x.shape[1] + 1))
newX[:, 1:] = x.values

It now looks like this (the idea is that column 0 is the free weight):

[[0. 8. 2.]
 [0. 0. 0.]
 [0. 0. 0.]
 ...
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 1. 1.]]

Tags: python, numpy, logistic-regression
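As an aside, the usual way to build such a design matrix is to prepend a column of ones rather than zeros, so that the first weight can act as an intercept; with an all-zero column, theta[0] has no effect on the model (which may be why it stays at 0 here). A minimal sketch of the conventional construction, using made-up data in place of the real training frame:

```python
import numpy as np

# Hypothetical 2-feature training data, standing in for x.values
x = np.array([[8., 2.],
              [0., 0.],
              [1., 1.]])

# Prepend a bias column of ones so theta[0] acts as the intercept.
newX = np.hstack([np.ones((x.shape[0], 1)), x])

print(newX)
# First column is all ones; remaining columns are the original features.
```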
Before you figure out y_axis, what is or how are you getting newX? x_axis doesn't seem to match what you want. – busybear, Jan 3 at 23:02

Updated the question for you @busybear – Danny_P, Jan 3 at 23:24
asked Jan 3 at 22:50 by Danny_P, edited Jan 3 at 23:23
1 Answer
IIUC, you are trying to plot the decision boundary of your logistic regression. This is not simply a y = mx + b problem; or rather it is, but you first need to determine where your decision boundary lies, which is typically at a probability of 0.5. I assume your model looks like h(x) = g(theta_0*x_0 + theta_1*x_1 + theta_2*x_2), where g(x) = 1 / (1 + e^-x) and x_1 and x_2 are the two features you are plotting, i.e. your x and y axes (I don't know which is which, since I don't know your data). For probability 0.5 you want to solve h(x) = 0.5, i.e. theta_0*x_0 + theta_1*x_1 + theta_2*x_2 = 0.

So the line you want to plot is 0 = theta_0*x_0 + theta_1*x_1 + theta_2*x_2. Let's say you have x_1 on your x axis and x_2 on your y axis. (x_0 is just 1, corresponding to theta_0, your intercept.)

You'll need to pick (somewhat arbitrarily) x_1 values that give a good illustration of the boundary line; the min/max of your dataset works, which you've done. Then solve for x_2 given the formula above: x_2 = -(theta_0 + theta_1*x_1) / theta_2 (note the minus sign). You can define a function for this: lambda x_1: -(theta[0] + theta[1] * x_1) / theta[2]. I'm assuming your theta variable corresponds to [intercept, coeff for x_1, coeff for x_2]. So you'll end up with something like:

theta = [0., -0.28132729, 0.158859]
x = np.array([0, 10])
f = lambda x_1: -(theta[0] + theta[1] * x_1) / theta[2]
y = f(x)
plt.plot(x, y)
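Putting it together, here is a self-contained sketch of drawing the boundary over a scatter plot. The scatter points are made up for illustration, and note the minus sign when solving the boundary equation for x_2:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

theta = np.array([0., -0.28132729, 0.158859])  # [intercept, coeff x1, coeff x2]

# Solve theta0 + theta1*x1 + theta2*x2 = 0 for x2 (the 0.5-probability boundary).
def boundary(x1):
    return -(theta[0] + theta[1] * x1) / theta[2]

x1 = np.array([0., 10.])   # min/max of the plotted feature
x2 = boundary(x1)

fig, ax = plt.subplots()
ax.scatter([8, 0, 1, 3], [2, 0, 1, 4])  # hypothetical feature points
ax.plot(x1, x2, linewidth=2)            # decision boundary
fig.savefig("boundary.png")
```

With this theta the slope is positive (-theta[1]/theta[2] is about 1.77), so the line rises from the origin rather than falling.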
Thank you for the clarity, this has helped a lot! – Danny_P, Jan 7 at 22:24

I think x contains two features; newX has added a new column with 0 values to act as a free weight, and he is trying to show the line based on the prediction. So his line should go from 3.5 (x_axis) to 3.5 (y_axis) based on his data, since the regression should see that from 3.5 (ish) the survival rate drops to 0 on average. I got the data from his previous questions. – Jaquarh, Jan 7 at 23:06
answered Jan 4 at 1:40 by busybear, edited Jan 4 at 6:24