PRAW 6: Get all submissions of a subreddit
I'm trying to iterate over submissions of a certain subreddit from the newest to the oldest using PRAW. I used to do it like this:
subreddit = reddit.subreddit('LandscapePhotography')
for submission in subreddit.submissions(None, time.time()):
    print("Submission Title: {}".format(submission.title))
However, when I try to do it now I get the following error:
AttributeError: 'Subreddit' object has no attribute 'submissions'
From looking at the docs I can't seem to figure out how to do this. The best I can do is:
for submission in subreddit.new(limit=None):
    print("Submission Title: {}".format(submission.title))
However, this is limited to the first 1000 submissions only.
Is there a way to do this with all submissions and not just the first 1000?
python reddit praw
asked Dec 31 '18 at 14:37 by Curtwagner1984
Have you tried playing with the times? Like changing it to current time minus the last search time, and getting the results before those
– Shlomi Bazel
Dec 31 '18 at 14:50
@ShlomiBazel Can you elaborate? If I understand correctly, this is what I was doing in the first example. I was saying 'give me all submissions between None and the current time'. Right now I can't find a search based on time values.
– Curtwagner1984
Dec 31 '18 at 15:07
1 Answer
Unfortunately, Reddit removed this function from their API.
Check out the PRAW changelog. One of the changes in version 6.0.0 is:
Removed Subreddit.submissions as the API endpoint backing the method is no more. See https://www.reddit.com/r/changelog/comments/7tus5f/update_to_search_api/.
The linked post says that Reddit is disabling Cloudsearch for all users:
Starting March 15, 2018 we’ll begin to gradually move API users over to the new search system. By end of March we expect to have moved everyone off and finally turn down the old system.
PRAW's Subreddit.submissions() used Cloudsearch to search for posts between the given timestamps. Since Cloudsearch has been removed and the search that replaced it doesn't support timestamp search, it is no longer possible to perform a search based on timestamp with PRAW or any other Reddit API client. This includes trying to get all posts from a subreddit.
For more information, see this thread from /r/redditdev posted by the maintainer of PRAW.
Alternatives
Since Reddit limits all listings to ~1000 entries, it is currently impossible to get all posts in a subreddit using their API. However, third-party datasets with APIs exist, such as pushshift.io. As /u/kungming2 said on Reddit:
You can use Pushshift.io to still return data from defined time periods by using their API:
https://api.pushshift.io/reddit/submission/search/?after=1334426439&before=1339696839&sort_type=score&sort=desc&subreddit=translator
This, for example, allows you to parse submissions to r/translator between 2012-04-14 and 2012-06-14.
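As a rough illustration of the quoted approach, here is a minimal Python sketch that builds such a URL and walks backwards through a time range one page at a time. The parameter names subreddit, after, before, sort_type and sort come from the URL above. Everything else is an assumption made for illustration: the value created_utc for sort_type, the idea that before is exclusive, the helper names, and the fetch_page callable. None of it is documented Pushshift or PRAW behaviour.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint copied from the URL quoted above.
PUSHSHIFT_URL = "https://api.pushshift.io/reddit/submission/search/"


def to_epoch(date_str):
    """Convert 'YYYY-MM-DD' (taken as midnight UTC) to a Unix timestamp."""
    day = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(day.timestamp())


def build_search_url(subreddit, after, before, **extra):
    """Build a query URL from the parameters seen in the quoted example."""
    params = {"subreddit": subreddit, "after": after, "before": before}
    params.update(extra)
    return PUSHSHIFT_URL + "?" + urlencode(params)


def iter_submissions(subreddit, after, before, fetch_page):
    """Yield submissions newest-first by repeatedly lowering `before`.

    `fetch_page` is a hypothetical callable supplied by the caller. It takes
    a URL and returns a list of dicts that each have a 'created_utc' key.
    """
    while True:
        url = build_search_url(
            subreddit, after, before,
            sort_type="created_utc",  # assumed value, not confirmed by the source
            sort="desc",
        )
        page = fetch_page(url)
        if not page:
            return
        yield from page
        oldest = min(item["created_utc"] for item in page)
        if oldest >= before:
            # No progress was made; stop rather than loop forever.
            return
        # Next request: everything older than the oldest item seen so far.
        before = oldest
```

fetch_page is left to the caller so the loop can be tried without network access. In practice it might be a small function that downloads the URL and returns the list of submissions from the JSON response. Note that posts sharing a timestamp exactly at a page boundary can be missed with this simple scheme.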
answered Jan 4 at 21:29 by jarhill0