parse_table

pyrcs.parser.parse_table(source, parser='html.parser', as_dataframe=False)

Parse the <tr> elements of an HTML table into column names and row records, from which a data frame can be created.

Parameters
  • source (requests.Response) – response object returned by requesting the URL of the web page that contains the table

  • parser (str) – 'html.parser' (default), 'html5lib' or 'lxml'

  • as_dataframe (bool) – whether to return the parsed data as a pandas.DataFrame, defaults to False

Returns

a list of column names of the requested table and a list of lists, each comprising a row of the table (see also pyrcs.utils.parse_tr()), in the order used in the example below; or the same data as a data frame if as_dataframe=True

Return type

tuple[list, list] or pandas.DataFrame or list

Examples:

>>> from pyrcs.parser import parse_table
>>> import requests

>>> source_dat = requests.get(url='http://www.railwaycodes.org.uk/elrs/elra.shtm')

>>> columns_dat, records_dat = parse_table(source_dat)

>>> columns_dat
['ELR', 'Line name', 'Mileages', 'Datum', 'Notes']
>>> type(records_dat)
list
>>> len(records_dat) // 100
1
>>> records_dat[0]
['AAL',
 'Ashendon and Aynho Line',
 '0.00 - 18.29',
 'Ashendon Junction',
 'Now NAJ3']
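
The return shape shown above (column names plus a list of row records) can be reproduced offline with a minimal sketch. This is not the pyrcs implementation: it uses only the standard-library html.parser module, and the names TableRowParser, parse_rows and sample_html are hypothetical, introduced purely for illustration. The sample markup is an assumed, simplified table built from the first record in the example above.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Hypothetical helper: collect <th> texts and <td> rows from <tr> elements."""

    def __init__(self):
        super().__init__()
        self.columns = []   # texts of header cells (<th>)
        self.records = []   # one list of <td> texts per <tr>
        self._row = None    # data cells of the <tr> currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('th', 'td'):
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a <th> or <td> cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('th', 'td') and self._cell is not None:
            text = ''.join(self._cell).strip()
            if tag == 'th':
                self.columns.append(text)
            elif self._row is not None:
                self._row.append(text)
            self._cell = None
        elif tag == 'tr':
            if self._row:  # rows holding only header cells are skipped
                self.records.append(self._row)
            self._row = None


def parse_rows(html_text):
    """Return (column names, records) parsed from a string of HTML."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.columns, parser.records


# Assumed, simplified markup; the real page layout may differ.
sample_html = """
<table>
  <tr><th>ELR</th><th>Line name</th><th>Mileages</th><th>Datum</th><th>Notes</th></tr>
  <tr><td>AAL</td><td>Ashendon and Aynho Line</td><td>0.00 - 18.29</td>
      <td>Ashendon Junction</td><td>Now NAJ3</td></tr>
</table>
"""

columns, records = parse_rows(sample_html)
```

Here columns holds the five header texts and records holds a single five-item row, mirroring the (columns_dat, records_dat) pair unpacked in the example above.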