Need help figuring out how to loop through indices
I am working on a project that revolves around scraping a lot of data. I am working on a rather long script right now, but I am running into a problem with my for loop.
I am trying to scrape information out of a 9-row table. I have tried to set up a for loop so that it scrapes the same information from each row. To access the rows, I split the table into a list. The first row I need is at index 3.
When I run the script, I get an AttributeError on the line that assigns Aa. The error reads: 'NoneType' object has no attribute 'text'.
This doesn't happen when I feed that line of code into the console on its own; there I get the desired text. And when I take out the for loop, I am able to scrape the first indaplaybox.
Here is my code:
    from urllib.request import urlopen as uReq
    from bs4 import BeautifulSoup as soup

    my_url = 'Myurl/=' + page
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()

    page_soup = soup(page_html, "html.parser")
    boxes = page_soup.findAll("table", {"class": "TableTable tableBody"})
    box = boxes[0]
    playboxes = box.find_all('tr')
    indaplaybox = playboxes[3]

    filename = "QBS.csv"
    f = open(filename, "a")
    headers = "Aa, Ab, Ac, Ad\n"
    f.write(headers)

    for indaplaybox in playboxes:
        Aa = indaplaybox.find('td', attrs={'style': 'font-weight: bold;'}).text
        c = indaplaybox.find('td', attrs={'class': 'tablePlayName'})
        cl = c.text.split()
        Ab = cl[0] + " " + cl[1]
        Ac = cl[2]
        Ad = indaplaybox.div.a.text
        print("Aa:" + Aa)
        print("Ab:" + Ab)
        print("Ac:" + Ac)
        print("Ad:" + Ad)
        with open(filename, "a") as myfile:
            myfile.write(Aa + "," + Ab + "," + Ac.replace(",", "|") + "," + Ad + "\n")
    f.close()
I want to loop through the playbox indices 3-11.
I am not well versed in indices, so I tried something like:

    p = [str(i) for i in range(3, 12)]
    indaplaybox = playboxes[p]
    for indaplaybox in playboxes:
        ...  # rest of code

But that doesn't work because, as is probably obvious to most, list indices must be integers.
I could really use some help thinking through how to get this for loop running smoothly. Thanks!
python for-loop web-scraping indices
It's probably because bsoup didn't find anything for your requested query. I think it returns `None` if it can't find anything.
– connectyourcharger
Jan 3 at 22:13
First instinct is that `indaplaybox.find('td', attrs = {'style' : 'font-weight: bold;'})` is returning `None`, and you can't call `.text` on that. I see you may have duplicated some variable names; `indaplaybox = playboxes[3]`. If it were me, I'd remove or change that line to make sure that's not the issue, then `print(indaplaybox.find('td', attrs = {'style' : 'font-weight: bold;'}))` without calling `.text` and see what is returned.
– G. Anderson
Jan 3 at 22:13
Thanks for the help y'all!
– Jordan Freundlich
Jan 3 at 22:31
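To illustrate the point made in the comments above, here is a minimal sketch of a lookup that returns `None` for some rows. It is not the asker's page: the table is made up, and it uses the standard library's `xml.etree.ElementTree` as a stand-in for BeautifulSoup, only because its `find()` also returns `None` when nothing matches and it needs no install. The assumption (not confirmed in the thread) is that the first rows of the real table are header rows without the bold cell.

```python
import xml.etree.ElementTree as ET

# Made-up table: the first row is a header row with no bold <td>.
TABLE = """
<table>
  <tr><th>Rank</th><th>Player</th></tr>
  <tr><td style="font-weight: bold;">1</td><td class="tablePlayName">Pat Doe QB</td></tr>
  <tr><td style="font-weight: bold;">2</td><td class="tablePlayName">Sam Roe QB</td></tr>
</table>
""".strip()

rows = ET.fromstring(TABLE).findall("tr")

ranks = []
for row in rows:
    cell = row.find("td[@style='font-weight: bold;']")
    if cell is None:
        # Nothing matched in this row; calling cell.text here would raise
        # AttributeError: 'NoneType' object has no attribute 'text'
        continue
    ranks.append(cell.text)

print(ranks)
```

With BeautifulSoup the same `is None` check would apply to the result of `indaplaybox.find(...)` before `.text` is read.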
asked Jan 3 at 22:07
Jordan Freundlich
2 Answers
You can do:

Method 1:

    # p has all the values from playboxes at these indexes
    p = [playboxes[i] for i in range(3, 12)]
    # now simple loop
    for indaplaybox in p:
        ...

Method 2:

    for indaplaybox in playboxes[3:12]:
        ...
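A quick way to convince yourself that the two methods pick the same rows is the sketch below. The `playboxes` list here is a made-up stand-in of 14 strings, not the real result of `box.find_all('tr')`.

```python
# Made-up stand-in for the list of table rows.
playboxes = [f"row {i}" for i in range(14)]

by_index = [playboxes[i] for i in range(3, 12)]  # Method 1
by_slice = playboxes[3:12]                       # Method 2

# Both hold the 9 rows at indexes 3 through 11 (the stop value 12 is excluded).
print(by_index == by_slice, len(by_slice), by_slice[0], by_slice[-1])

# One difference: a slice stops quietly at the end of a shorter list,
# while playboxes[i] would raise IndexError for a missing index.
short = playboxes[:6]
tail = short[3:12]
print(tail)
```

The slice is usually the simpler choice, since it needs no intermediate list of indexes.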
Correct me if I'm wrong, but aren't you just taking values out of a list, adding them into a new list based on their indices in order, then iterating over the new list? You might as well just do `p=playboxes.copy()` and `for indaplaybox in p[3:12]` ...
– G. Anderson
Jan 3 at 22:20
@G.Anderson yes, you are right, that's easier in fact. I'll add it.
– YOLO
Jan 3 at 22:22
Y'all are freaking great. Thanks so much for the help!
– Jordan Freundlich
Jan 3 at 22:30
    # list indices must be integers, so loop over range() directly (no str())
    for i in range(3, 12):
        indaplaybox = playboxes[i]
        ...  # rest of the code
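As the question already notes, list indices must be integers. The sketch below checks that on a made-up list standing in for `playboxes`: indexing with a string raises `TypeError`, while integers from `range(3, 12)` work.

```python
# Made-up stand-in for the list of table rows.
playboxes = [f"row {i}" for i in range(14)]

message = None
try:
    playboxes["3"]  # a string index, as produced by str(i)
except TypeError as exc:
    message = str(exc)
print(message)

# Integer indexes 3..11 work as intended.
picked = []
for i in range(3, 12):
    picked.append(playboxes[i])
print(len(picked), picked[0], picked[-1])
```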
Got it! Thanks for the help :)
– Jordan Freundlich
Jan 3 at 22:30
edited Jan 3 at 22:23
answered Jan 3 at 22:16
YOLO
answered Jan 3 at 22:12
mamun