Distance between two movie arrays
I want to create a function that calculates the distance between two movie arrays.
This is my DataFrame:
movie_title  movieId  Action  Adventure  Fantasy  Sci-Fi  Thriller
Avatar       1        1.0     1.0        1.0      1.0     0.0
Spectre      2        1.0     1.0        0.0      0.0     1.0
John Carter  3        1.0     1.0        0.0      1.0     0.0
Then I represent the movies as an array:
df_array = userGenreTable.as_matrix(columns=userGenreTable.columns[2:])
Output:
array([[1., 1., 1., ..., 0., 0., 0.],
       [1., 1., 1., ..., 0., 0., 0.],
       [1., 1., 0., ..., 0., 0., 0.]])
I represented the dataset as a dictionary:
df_2_dict = userGenreTable_2.to_dict('records')
So my question is: how can I calculate the distance between two movie arrays?
python pandas
What metric defines "distance"? And "array movie" does not make sense, I assume you mean "movie array" in that it is an array of movie details. If that's the case, the fact that it focuses on movies is pretty much irrelevant since you just have an array of features that are either 1 or 0.
– roganjosh
Dec 28 '18 at 23:42
@roganjosh, yes, I meant to say "movie array".
– G.M
Dec 28 '18 at 23:46
To compare two movies, you can use the Euclidean distance, that is, the square root of the sum of the squared differences. You can even skip the square root to be faster. In DataFrame computations, that would be (row1 - row2)^2 and then sum.
– MrE
Dec 29 '18 at 6:56
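A minimal sketch of the Euclidean distance described in the comment above, using only the standard library. The key point is that the rows are subtracted first and the differences are then squared and summed. The function name `movie_distance` and the `squared` flag are illustrative and do not come from the thread; the three genre rows are copied from the question's table.

```python
import math


def movie_distance(a, b, squared=False):
    """Euclidean distance between two equal-length genre vectors.

    `movie_distance` and `squared` are hypothetical names chosen for this sketch.
    """
    if len(a) != len(b):
        raise ValueError("both movies must have the same number of features")
    # Subtract element-wise first, then square and sum the differences.
    sum_sq = sum((x - y) ** 2 for x, y in zip(a, b))
    # Skipping the square root preserves the ordering of distances.
    return sum_sq if squared else math.sqrt(sum_sq)


# Genre rows from the question: Action, Adventure, Fantasy, Sci-Fi, Thriller
avatar = [1.0, 1.0, 1.0, 1.0, 0.0]
spectre = [1.0, 1.0, 0.0, 0.0, 1.0]
john_carter = [1.0, 1.0, 0.0, 1.0, 0.0]

print(movie_distance(avatar, john_carter))            # 1.0
print(movie_distance(avatar, spectre, squared=True))  # 3.0
```

Avatar and John Carter differ in one genre (Fantasy), so their distance is 1.0; Avatar and Spectre differ in three genres, so the squared distance is 3.0.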
asked Dec 28 '18 at 23:39
G.M
1 Answer
To obtain the distances between all possible pairs of rows in df_array, you need to calculate a distance matrix. Using scipy.spatial:
from scipy.spatial import distance_matrix
# p = 2 for Euclidean distances
distance_matrix(df_array, df_array, p=2)
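As a hedged illustration of how the result can be read, the sketch below rebuilds a small `df_array` by hand from the three rows shown in the question (only the five genre columns visible there, which is an assumption, since the real array has more columns) and looks up individual pairs. The `titles` list and the nearest-movie lookup are additions for illustration only. Note also that `DataFrame.as_matrix` was removed in pandas 1.0; on current pandas the array would be built with `DataFrame.to_numpy()` instead.

```python
import numpy as np
from scipy.spatial import distance_matrix

# Assumed stand-in for the question's data:
# columns are Action, Adventure, Fantasy, Sci-Fi, Thriller.
titles = ["Avatar", "Spectre", "John Carter"]
df_array = np.array([
    [1.0, 1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 1.0, 0.0],
])

# dist[i, j] is the Euclidean distance between movie i and movie j.
dist = distance_matrix(df_array, df_array, p=2)

i, j = titles.index("Avatar"), titles.index("John Carter")
print(dist[i, j])  # 1.0

# Nearest other movie for each row: hide the zero diagonal first.
masked = dist.copy()
np.fill_diagonal(masked, np.inf)
nearest = masked.argmin(axis=1)
for title, n in zip(titles, nearest):
    print(title, "->", titles[n])
```

The matrix is symmetric with zeros on the diagonal, so only one triangle carries information; the distance between any two specific movies is a single lookup by row and column index.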
Thank you, that's what I was looking for.
– G.M
Dec 29 '18 at 8:01
edited Dec 29 '18 at 19:13
answered Dec 29 '18 at 0:01
Mankind_008