How to plot a line on a scatter graph based on theta from regression in Python?





I am calculating the theta for my AI like so:



theta = opt.fmin_cg(cost, initial_theta, gradient, (newX, y))


Which works great and gives me this output:



Optimization terminated successfully.
Current function value: 0.684355
Iterations: 6
Function evaluations: 15
Gradient evaluations: 15


When I print theta, I get this:



[ 0.         -0.28132729  0.158859  ]


I now want to plot this on my scatter graph as a line. My expected output looks like this:



Expected Output



But when I try to perform this on my graph with the algorithm:



weights * features = weight0 + weight1 * feature1 + weight2 * feature2


Like so:



x_axis = np.array([min(newX[:, 1]), max(newX[:, 1])])
y_axis = x_axis * theta[1:]
ax.plot(x_axis, y_axis, linewidth=2)
plt.show()


The output looks like this:



Actual Output



What should y_axis = x_axis * theta[1:] be to match the algorithm?




Update:




newX derives from my training data frame and is created like this:



newX = np.zeros(shape=(x.shape[0], x.shape[1] + 1))
newX[:, 1:] = x.values


It now looks like this; the idea is that column 0 is the free weight:



[[0. 8. 2.]
 [0. 0. 0.]
 [0. 0. 0.]
 ...
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 1. 1.]]









  • Before you figure out y_axis, what is newX, or how are you getting it? x_axis doesn't seem to match what you want.

    – busybear
    Jan 3 at 23:02











  • Updated the question for you @busybear

    – Danny_P
    Jan 3 at 23:24


















python numpy logistic-regression






edited Jan 3 at 23:23







Danny_P

















asked Jan 3 at 22:50









1 Answer
IIUC, you are trying to plot the decision boundary for your logistic regression. This is not simply a y = mx + b problem: you first need to determine where the decision boundary is, which is typically at a probability of 0.5. I assume your model looks like h(x) = g(theta_0*x_0 + theta_1*x_1 + theta_2*x_2), where g(x) = 1 / (1 + e^-x), and x_1 and x_2 are the two features you are plotting, i.e. your x and y axes (I don't know which is which, since I don't know your data). So for a probability of 0.5, you want to solve h(x) = 0.5, i.e. theta_0*x_0 + theta_1*x_1 + theta_2*x_2 = 0.



So what you want to plot is the line 0 = theta_0*x_0 + theta_1*x_1 + theta_2*x_2. Let's just say you have x_1 on your x axis and x_2 on your y axis. (x_0 is just 1, corresponding to theta_0, your intercept.)
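As a quick sanity check on this derivation, any point on that line should come out of the sigmoid at exactly probability 0.5. A minimal sketch using the theta printed in the question (the sigmoid here is the standard g(z) = 1/(1+e^-z), written out by hand since the asker's cost code isn't shown):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([0.0, -0.28132729, 0.158859])  # [intercept, coeff x_1, coeff x_2]

# Pick any x_1, then solve theta_0 + theta_1*x_1 + theta_2*x_2 = 0 for x_2
x_1 = 5.0
x_2 = -(theta[0] + theta[1] * x_1) / theta[2]

# The model's output at this point should be 0.5
z = theta[0] * 1.0 + theta[1] * x_1 + theta[2] * x_2
print(round(sigmoid(z), 6))  # → 0.5
```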



So you'll need to pick (somewhat arbitrarily) x_1 values that give a good illustration of the boundary line; the min/max of your dataset works, which you've done. Then solve for x_2 using the formula above: x_2 = -(theta_0 + theta_1*x_1) / theta_2 (note the minus sign from moving the x_2 term across the equals sign). You can define a function for this: lambda x_1: -(theta[0] + theta[1] * x_1) / theta[2]. I'm assuming your theta variable corresponds to [intercept, coeff for x_1, coeff for x_2]. So you'll end up with something like:



import numpy as np
import matplotlib.pyplot as plt

theta = [0., -0.28132729, 0.158859]
x = np.array([0, 10])
f = lambda x_1: -(theta[0] + theta[1] * x_1) / theta[2]  # x_2 on the boundary
y = f(x)
plt.plot(x, y)
plt.show()
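To connect this back to the scatter graph the question describes, here is a self-contained sketch that overlays the boundary line on a scatter plot. The feature data here is randomly generated as a stand-in for newX[:, 1] and newX[:, 2], since the actual data frame isn't shown, and the axis meanings are assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

theta = [0.0, -0.28132729, 0.158859]  # [intercept, coeff x_1, coeff x_2]

# Hypothetical feature data standing in for newX[:, 1] and newX[:, 2]
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, size=50)
x2 = rng.uniform(0, 10, size=50)

# Colour each point by which side of the boundary the model puts it on
z = theta[0] + theta[1] * x1 + theta[2] * x2
labels = (z >= 0).astype(int)

fig, ax = plt.subplots()
ax.scatter(x1, x2, c=labels, cmap="bwr")

# Decision boundary: x_2 = -(theta_0 + theta_1 * x_1) / theta_2
xs = np.array([x1.min(), x1.max()])
ax.plot(xs, -(theta[0] + theta[1] * xs) / theta[2], linewidth=2)
ax.set_xlabel("feature 1")
ax.set_ylabel("feature 2")
plt.show()
```

The key difference from the question's attempt is that the line's y values come from solving the boundary equation for x_2, rather than from multiplying x by the coefficient slice directly.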





  • Thank you for the clarity, this has helped a lot!

    – Danny_P
    Jan 7 at 22:24











  • I think x contains two features; newX adds a new column of 0 values to act as a free weight, and he is trying to show the line based on his prediction. So his line should go from 3.5 (x axis) to 3.5 (y axis) based on his data, since the regression should see that from roughly 3.5 the survival rate drops to 0 on average. I got the data from his previous question here and question here

    – Jaquarh
    Jan 7 at 23:06














edited Jan 4 at 6:24

























answered Jan 4 at 1:40









busybear
