2017-11-10 18:19:24 +08:00
# Chapter 11 Web Scraping
> Q: 1. Briefly describe the differences between the webbrowser, requests, BeautifulSoup, and selenium modules.
`webbrowser` launches a web browser at a specific URL via its `open()` function.
`requests` downloads files and pages from the Web.
`bs4` (Beautiful Soup) parses HTML.
`selenium` launches and controls a browser.
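
As a small illustration of the first of these modules, the sketch below builds a URL and hands it to `webbrowser`. The helper names `map_url` and `open_map` and the Google Maps base URL are assumptions made for this example; any URL string can be passed the same way.

```python
import webbrowser
from urllib.parse import quote_plus

# Assumed base URL, used only for illustration.
MAPS_BASE = "https://www.google.com/maps/place/"


def map_url(address):
    """Build a map URL for a street address (hypothetical helper)."""
    # quote_plus() escapes spaces and punctuation so the address is URL-safe.
    return MAPS_BASE + quote_plus(address)


def open_map(address):
    """Open the map for `address` in a new tab of the default browser."""
    return webbrowser.open_new_tab(map_url(address))
```

Calling `open_map('870 Valencia St')` should open a browser tab; `webbrowser` does nothing beyond that, which is why the other three modules are needed for downloading, parsing, and interaction.
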
> Q: 2. What type of object is returned by requests.get()? How can you access the downloaded content as a string value?
A `Response` object. Its `text` attribute holds the downloaded content as a string.
> Q: 3. What Requests method checks that the download worked?
`raise_for_status()`
> Q: 4. How can you get the HTTP status code of a Requests response?
`status_code`
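
The answers to questions 2 to 4 can be combined into a minimal sketch. The helper names `fetch_text` and `status_of` and the `timeout` default are assumptions for this example, not part of the `requests` API.

```python
import requests


def fetch_text(url, timeout=10):
    """Download `url` and return its body as a string (hypothetical helper)."""
    res = requests.get(url, timeout=timeout)  # returns a requests.Response
    res.raise_for_status()                    # raises requests.HTTPError on 4xx/5xx
    return res.text                           # the body decoded to a str


def status_of(url, timeout=10):
    """Return the HTTP status code without raising on error responses."""
    return requests.get(url, timeout=timeout).status_code
```

For a page that exists, `status_of(url)` should give `200` and `fetch_text(url)` the page source; for a missing page, `status_of(url)` gives `404` while `fetch_text(url)` raises an exception instead of returning an error page.
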
> Q: 5. How do you save a Requests response to a file?
```py
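# `res` is the Response object returned by requests.get().
with open('SaveFile', 'wb') as saveFile:
    # Write the download in chunks of up to 100,000 bytes.
    for chunk in res.iter_content(100000):
        saveFile.write(chunk)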
```
> Q: 6. What is the keyboard shortcut for opening a browser's developer tools?
F12 in Chrome and Firefox on Windows and Linux; ⌘-Option-I on macOS.
> Q: 7. How can you view (in the developer tools) the HTML of a specific element on a web page?
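Right-click the element on the page and select Inspect Element (labelled Inspect in current browsers) from the context menu.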
> Q: 8. What is the CSS selector string that would find the element with an id attribute of main?
`#main`
> Q: 9. What is the CSS selector string that would find the elements with a CSS class of highlight?
`.highlight`
> Q: 10. What is the CSS selector string that would find all the `<div>` elements inside another `<div>` element?
`div div`
> Q: 11. What is the CSS selector string that would find the `<button>` element with a value attribute set to favorite?
`button[value='favorite']`
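
The four selectors from questions 8 to 11 can be tried against a small document with Beautiful Soup's `select()` method. The HTML below is made up for this sketch, and `html.parser` is chosen only because it ships with Python.

```python
from bs4 import BeautifulSoup

# Made-up sample page for illustration.
HTML = """
<div id="main">
  <p class="highlight">first</p>
  <div>
    <p class="highlight">second</p>
    <button value="favorite">Fav</button>
    <button value="other">Other</button>
  </div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

main = soup.select("#main")                            # element with id="main"
highlighted = soup.select(".highlight")                # elements with class "highlight"
nested_divs = soup.select("div div")                   # <div> elements inside another <div>
fav_buttons = soup.select('button[value="favorite"]')  # <button value="favorite">

# select() always returns a list of Tag objects, even for a single match.
print(len(main), len(highlighted), len(nested_divs), len(fav_buttons))
```
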
> Q: 12. Say you have a Beautiful Soup Tag object stored in the variable spam for the element `<div>Hello world!</div>`. How could you get the string 'Hello world!' from the Tag object?
`spam.getText()`
> Q: 13. How would you store all the attributes of a Beautiful Soup Tag object in a variable named linkElem?
`linkElem.attrs`
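
A short sketch of both Tag operations from questions 12 and 13. The `<a>` element and its attributes are invented for the example; the variable names `spam` and `linkElem` follow the questions.

```python
from bs4 import BeautifulSoup

# Made-up markup: the <div> from question 12 plus a link with two attributes.
html = '<div>Hello world!</div><a id="link" href="https://example.com">site</a>'
soup = BeautifulSoup(html, "html.parser")

spam = soup.select("div")[0]
text = spam.getText()          # the text between the opening and closing tags

linkElem = soup.select("a")[0]
attributes = linkElem.attrs    # a dict mapping attribute names to values

print(text)
print(attributes)
```

Because `attrs` is an ordinary dictionary, a single attribute can also be read with `linkElem.get('href')`.
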
> Q: 14. Running import selenium doesn't work. How do you properly import the selenium module?
`from selenium import webdriver`
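> Q: 15. What's the difference between the find_element_* and find_elements_* methods?
The `find_element_*` methods return the first matching element as a single `WebElement` object; the `find_elements_*` methods return a list of all matching `WebElement` objects. (Selenium 4 replaces both families with `find_element()` and `find_elements()`, which take a `By` locator and a value.)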
> Q: 16. What methods do Selenium's WebElement objects have for simulating mouse clicks and keyboard keys?
`click()` and `send_keys()`
> Q: 17. You could call send_keys(Keys.ENTER) on the Submit button's WebElement object, but what is an easier way to submit a form with Selenium?
Call `submit()` on any element inside the form.
> Q: 18. How can you simulate clicking a browser's Forward, Back, and Refresh buttons with Selenium?
Call the `forward()`, `back()`, and `refresh()` methods of the WebDriver (browser) object.
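
A minimal sketch of those navigation calls. The function name `replay_history` is hypothetical, and `driver` is assumed to be an already-started Selenium WebDriver, for example the object returned by `webdriver.Firefox()` after `from selenium import webdriver`.

```python
def replay_history(driver):
    """Simulate the Back, Forward, and Refresh buttons (hypothetical helper).

    `driver` is assumed to be a Selenium WebDriver that has already
    visited at least two pages, so that Back has somewhere to go.
    """
    driver.back()      # same as clicking the Back button
    driver.forward()   # same as clicking the Forward button
    driver.refresh()   # same as clicking Refresh/Reload
```

The three methods live on the browser object itself, not on any `WebElement`, so no element lookup is needed before calling them.
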