parse_table
- pyrcs.parser.parse_table(source, parser='html.parser', as_dataframe=False)
Parses HTML <tr> elements to create a table from the given source.

This function extracts data from the <thead> and <tbody> elements of an HTML table and processes it into a list of lists (rows of the table) or a dataframe.

- Parameters:
source (requests.Response) – The response object containing the HTML table from a requested URL.
parser (str) – The parser to use for processing the HTML; options are 'html.parser' (default), 'html5lib' or 'lxml'.
as_dataframe (bool) – If True, the parsed data is returned as a dataframe. If False, it returns a list of lists and column names; defaults to False.
- Returns:
A tuple containing a list of column names and a list of lists representing rows of the table; if as_dataframe=True, returns a dataframe.
- Return type:
tuple[list, list] | pandas.DataFrame | list
Examples:
>>> from pyrcs.parser import parse_table
>>> import requests
>>> source_dat = requests.get(url='http://www.railwaycodes.org.uk/elrs/elra.shtm')
>>> columns_dat, records_dat = parse_table(source_dat)
>>> columns_dat
['ELR', 'Line name', 'Mileages', 'Datum', 'Notes']
>>> type(records_dat)
list
>>> len(records_dat) // 100
1
>>> records_dat[0]
['AAL', 'Ashendon and Aynho Line', '0.00 - 18.29', 'Ashendon Junction', 'Now NAJ3']
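
To illustrate the kind of extraction described above (column names from <thead>, rows from <tbody>), here is a minimal offline sketch. It is not the pyrcs implementation: it uses only the standard library's html.parser module, and the names TableSketchParser, parse_table_sketch and sample_html are hypothetical, introduced here for illustration. The second sample row is made-up data.

```python
from html.parser import HTMLParser


class TableSketchParser(HTMLParser):
    """Hypothetical helper: collect <thead> cells as column names
    and each <tr> inside <tbody> as one record."""

    def __init__(self):
        super().__init__()
        self.columns = []      # cell texts of the header row
        self.records = []      # one list of cell texts per body row
        self._section = None   # 'thead', 'tbody', or None
        self._row = None       # cells of the <tr> currently being read
        self._cell = None      # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag in ('thead', 'tbody'):
            self._section = tag
        elif tag == 'tr' and self._section:
            self._row = []
        elif tag in ('th', 'td') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('th', 'td') and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            if self._section == 'thead':
                self.columns = self._row
            else:
                self.records.append(self._row)
            self._row = None
            self._cell = None
        elif tag in ('thead', 'tbody'):
            self._section = None


def parse_table_sketch(html_text):
    """Return (columns, records) for a single HTML table given as a string."""
    parser = TableSketchParser()
    parser.feed(html_text)
    parser.close()
    return parser.columns, parser.records


# Small inline table; the second row is invented for the example.
sample_html = """
<table>
  <thead><tr><th>ELR</th><th>Line name</th></tr></thead>
  <tbody>
    <tr><td>AAL</td><td>Ashendon and Aynho Line</td></tr>
    <tr><td>XYZ</td><td>Hypothetical line</td></tr>
  </tbody>
</table>
"""

columns, records = parse_table_sketch(sample_html)
print(columns)
print(records)
```

The sketch returns the same (column names, rows) shape that parse_table is documented to return when as_dataframe=False, but it takes a plain string rather than a requests.Response and makes no attempt to handle nested tables or merged cells.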