get_hypertext
- pyrcs.parser.get_hypertext(hypertext_tag, hyperlink_tag_name='a', md_style=True)
Gets hyperlinked text from a specified HTML tag.
This function extracts the text of the tag together with its hyperlinks and, when md_style is True, formats the links in Markdown style.
- Parameters:
hypertext_tag (bs4.element.Tag | bs4.element.PageElement) – The tag containing hyperlinked text.
hyperlink_tag_name (str) – The tag name of the hyperlink within the hypertext; defaults to 'a'.
md_style (bool) – Whether to return the hypertext in Markdown style; defaults to True.
- Returns:
The hypertext.
- Return type:
str
Examples:
>>> from pyrcs.parser import get_hypertext
>>> from pyrcs.line_data import Electrification
>>> import bs4
>>> import requests
>>> elec = Electrification()
>>> url = elec.catalogue[elec.KEY_TO_INDEPENDENT_LINES]
>>> source = requests.get(url)
>>> soup = bs4.BeautifulSoup(source.content, 'html.parser')
>>> h3 = soup.find('h3')
>>> p = h3.find_all_next('p')[8]
>>> p
<p>Croydon Tramlink mast references can be found on the <a href="http://www.croydon-traml...
>>> hyper_txt = get_hypertext(hypertext_tag=p, md_style=True)
>>> hyper_txt
'Croydon Tramlink mast references can be found on the [Croydon Tramlink Unofficial Site](...
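
The example above needs network access and a live page. To illustrate the Markdown-style conversion offline, here is a minimal sketch that rewrites hyperlinks in an HTML fragment as [text](href). It is not the pyrcs implementation: it uses only the standard-library html.parser rather than bs4, the names links_to_markdown and _LinkToMarkdown are hypothetical, and the sample HTML and URL are made up for illustration.

```python
from html.parser import HTMLParser


class _LinkToMarkdown(HTMLParser):
    """Hypothetical helper: collect text, rewriting hyperlinks as [text](href)."""

    def __init__(self, hyperlink_tag_name='a'):
        super().__init__()
        self.hyperlink_tag_name = hyperlink_tag_name
        self.parts = []         # pieces of the output string, in document order
        self._href = None       # href of the link currently open, if any
        self._link_text = []    # text collected inside the open link

    def handle_starttag(self, tag, attrs):
        if tag == self.hyperlink_tag_name:
            # A missing or valueless href becomes an empty string.
            self._href = dict(attrs).get('href') or ''
            self._link_text = []

    def handle_data(self, data):
        if self._href is not None:
            self._link_text.append(data)
        else:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == self.hyperlink_tag_name and self._href is not None:
            text = ''.join(self._link_text)
            self.parts.append(f'[{text}]({self._href})')
            self._href = None


def links_to_markdown(html, hyperlink_tag_name='a'):
    """Return the text of an HTML fragment with links in Markdown style (a sketch)."""
    parser = _LinkToMarkdown(hyperlink_tag_name)
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return ''.join(parser.parts)


# Made-up fragment; not taken from the page scraped in the example above.
sample = '<p>See the <a href="https://example.com/docs">project docs</a> for details.</p>'
print(links_to_markdown(sample))
```

The sketch covers only the Markdown case; what get_hypertext returns when md_style=False is not shown here.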