utils
A module of helper functions.
Source homepage
homepage_url() | Specify the homepage URL of the data source.

pyrcs.utils.homepage_url()
Specify the homepage URL of the data source.
Returns: URL of the data source homepage
Return type: str
Directory
cd_dat(*sub_dir[, dat_dir, mkdir]) | Change directory to dat_dir/ and sub-directories within a package.

pyrcs.utils.cd_dat(*sub_dir, dat_dir='dat', mkdir=False, **kwargs)
Change directory to dat_dir/ and sub-directories within a package.
Parameters:
- sub_dir (str) – name of a directory; names of directories (and/or a filename)
- dat_dir (str) – name of a directory to store data, defaults to "dat"
- mkdir (bool) – whether to create a directory, defaults to False
- kwargs – optional parameters of os.makedirs, e.g. mode=0o777
Returns: a full path to a directory (or a file) under dat_dir
Return type: str
Example:
    from pyrcs.utils import cd_dat

    dat_dir = "dat"
    mkdir = False

    cd_dat("line-data", dat_dir=dat_dir, mkdir=mkdir)  # "\dat\line-data"
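The path-building behaviour described for cd_dat can be illustrated with a minimal sketch. It is not the library's implementation: the function name cd_under and the explicit base_dir argument are assumptions made for illustration (cd_dat itself resolves paths relative to the package), and a temporary directory is used so that nothing is written into a real project.

```python
import os
import tempfile


def cd_under(base_dir, *sub_dir, dat_dir="dat", mkdir=False, **kwargs):
    """Build base_dir/dat_dir/<sub_dir...>; optionally create the directory.

    Hypothetical helper for illustration only; `base_dir` is an assumption.
    """
    path = os.path.join(base_dir, dat_dir, *sub_dir)
    if mkdir:
        # If the last component has a file extension, create only its parent
        _, ext = os.path.splitext(path)
        dir_to_make = os.path.dirname(path) if ext else path
        os.makedirs(dir_to_make, exist_ok=True, **kwargs)
    return path


base = tempfile.mkdtemp()  # throwaway location standing in for the package root
line_data_dir = cd_under(base, "line-data", mkdir=True)
print(os.path.isdir(line_data_dir))  # True
```

With mkdir=False the function only returns the path, which mirrors the documented default of not creating anything on disk.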
Converters
mile_chain_to_nr_mileage(miles_chains) | Convert mileage data in the form ‘<miles>.<chains>’ to Network Rail mileage.
nr_mileage_to_mile_chain(str_mileage) | Convert Network Rail mileage to the form ‘<miles>.<chains>’.
nr_mileage_str_to_num(str_mileage) | Convert string-type Network Rail mileage to numerical-type one.
nr_mileage_num_to_str(num_mileage) | Convert numerical-type Network Rail mileage to string-type one.
nr_mileage_to_yards(nr_mileage) | Convert Network Rail mileages to yards.
yards_to_nr_mileage(yards) | Convert yards to Network Rail mileages.
shift_num_nr_mileage(nr_mileage, shift_yards) | Shift Network Rail mileage by given yards.
year_to_financial_year(date) | Convert calendar year of a given date to Network Rail financial year.

pyrcs.utils.mile_chain_to_nr_mileage(miles_chains)
Convert mileage data in the form ‘<miles>.<chains>’ to Network Rail mileage.
Parameters: miles_chains (str, numpy.nan, None) – mileage data presented in the form ‘<miles>.<chains>’
Returns: Network Rail mileage in the form ‘<miles>.<yards>’
Return type: str
Examples:
    from pyrcs.utils import mile_chain_to_nr_mileage

    miles_chains = '0.18'  # AAM 0.18 Tewkesbury Junction with ANZ (84.62)
    mile_chain_to_nr_mileage(miles_chains)  # '0.0396'

    miles_chains = None  # or np.nan, or ''
    mile_chain_to_nr_mileage(miles_chains)  # ''

pyrcs.utils.nr_mileage_to_mile_chain(str_mileage)
Convert Network Rail mileage to the form ‘<miles>.<chains>’.
Parameters: str_mileage (str, numpy.nan, None) – Network Rail mileage data presented in the form ‘<miles>.<yards>’
Returns: mileage in the form ‘<miles>.<chains>’
Return type: str
Examples:
    from pyrcs.utils import nr_mileage_to_mile_chain

    str_mileage = '0.0396'
    nr_mileage_to_mile_chain(str_mileage)  # '0.18'

    str_mileage = None  # or np.nan, or ''
    nr_mileage_to_mile_chain(str_mileage)  # ''
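The arithmetic behind the two conversions above can be sketched as follows. This is an illustrative re-implementation, not the library's code: the function names are hypothetical, and it assumes that the digits after the point are whole chains (1 chain = 22 yards) in one form and whole yards, zero-padded to four digits, in the other.

```python
YARDS_PER_CHAIN = 22  # 1 chain = 22 yards (80 chains = 1 mile)


def _is_missing(value):
    """True for None, an empty string or a float NaN."""
    return value is None or value == "" or (isinstance(value, float) and value != value)


def mile_chain_to_mile_yard(miles_chains):
    """'<miles>.<chains>' -> '<miles>.<yards>' (hypothetical helper)."""
    if _is_missing(miles_chains):
        return ""
    miles, chains = str(miles_chains).split(".")
    yards = int(chains) * YARDS_PER_CHAIN
    return f"{int(miles)}.{yards:04d}"


def mile_yard_to_mile_chain(str_mileage):
    """'<miles>.<yards>' -> '<miles>.<chains>' (hypothetical helper)."""
    if _is_missing(str_mileage):
        return ""
    miles, yards = str(str_mileage).split(".")
    chains = round(int(yards) / YARDS_PER_CHAIN)
    return f"{int(miles)}.{chains:02d}"


print(mile_chain_to_mile_yard("0.18"))    # 0.0396  (18 chains * 22 = 396 yards)
print(mile_yard_to_mile_chain("0.0396"))  # 0.18
```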

pyrcs.utils.nr_mileage_str_to_num(str_mileage)
Convert string-type Network Rail mileage to numerical-type one.
Parameters: str_mileage (str) – string-type Network Rail mileage in the form ‘<miles>.<yards>’
Returns: numerical-type Network Rail mileage
Return type: float
Examples:
    from pyrcs.utils import nr_mileage_str_to_num

    str_mileage = '0.0396'
    nr_mileage_str_to_num(str_mileage)  # 0.0396

    str_mileage = ''
    nr_mileage_str_to_num(str_mileage)  # nan

pyrcs.utils.nr_mileage_num_to_str(num_mileage)
Convert numerical-type Network Rail mileage to string-type one.
Parameters: num_mileage (float) – numerical-type Network Rail mileage
Returns: string-type Network Rail mileage in the form ‘<miles>.<yards>’
Return type: str
Examples:
    import numpy as np
    from pyrcs.utils import nr_mileage_num_to_str

    num_mileage = 0.0396
    nr_mileage_num_to_str(num_mileage)  # '0.0396'

    num_mileage = np.nan
    nr_mileage_num_to_str(num_mileage)  # ''

pyrcs.utils.nr_mileage_to_yards(nr_mileage)
Convert Network Rail mileages to yards.
Parameters: nr_mileage (float, str) – Network Rail mileage
Returns: yards
Return type: int
Examples:
    from pyrcs.utils import nr_mileage_to_yards

    nr_mileage = '0.0396'
    nr_mileage_to_yards(nr_mileage)  # 396

    nr_mileage = 0.0396
    nr_mileage_to_yards(nr_mileage)  # 396

pyrcs.utils.yards_to_nr_mileage(yards)
Convert yards to Network Rail mileages.
Parameters: yards (int, float, numpy.nan, None) – yards
Returns: Network Rail mileage in the form ‘<miles>.<yards>’
Return type: str
Examples:
    from pyrcs.utils import yards_to_nr_mileage

    yards = 396
    yards_to_nr_mileage(yards)  # '0.0396'

    yards = 396.0
    yards_to_nr_mileage(yards)  # '0.0396'

    yards = None
    yards_to_nr_mileage(yards)  # ''
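The relationship between a ‘<miles>.<yards>’ mileage and a plain yard count can be sketched as below. The helper names are hypothetical and the code is only an illustration of the arithmetic (1 mile = 1760 yards, yards written as four digits after the point), not the library's implementation.

```python
YARDS_PER_MILE = 1760


def mileage_to_yards(nr_mileage):
    """'<miles>.<yards>' (str or number) -> total yards (hypothetical helper)."""
    # Normalise to four decimal places so the fractional digits are whole yards
    miles, yards = f"{float(nr_mileage):.4f}".split(".")
    return int(miles) * YARDS_PER_MILE + int(yards)


def yards_to_mileage(yards):
    """Total yards -> '<miles>.<yards>'; '' for None or NaN (hypothetical helper)."""
    if yards is None or yards != yards:  # NaN is the only value not equal to itself
        return ""
    miles, remainder = divmod(int(yards), YARDS_PER_MILE)
    return f"{miles}.{remainder:04d}"


print(mileage_to_yards("0.0396"))  # 396
print(yards_to_mileage(2156))      # 1.0396  (1 mile + 396 yards)
```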

pyrcs.utils.shift_num_nr_mileage(nr_mileage, shift_yards)
Shift Network Rail mileage by given yards.
Parameters:
- nr_mileage (float, int, str) – Network Rail mileage
- shift_yards (int, float) – yards by which the given nr_mileage is shifted
Returns: shifted numerical Network Rail mileage
Return type: float
Examples:
    from pyrcs.utils import shift_num_nr_mileage

    nr_mileage = '0.0396'  # or 0.0396
    shift_yards = 220
    shift_num_nr_mileage(nr_mileage, shift_yards)  # 0.0616

    nr_mileage = '0.0396'
    shift_yards = 220.99
    shift_num_nr_mileage(nr_mileage, shift_yards)  # 0.0617

    nr_mileage = 10
    shift_yards = 220
    shift_num_nr_mileage(nr_mileage, shift_yards)  # 10.022
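One way to think about the shift is to convert the mileage to yards, add the offset, and convert back. The sketch below does exactly that; the function name is hypothetical, rounding the shifted total to the nearest yard is an assumption, and the result is assumed to stay non-negative.

```python
YARDS_PER_MILE = 1760


def shift_mileage(nr_mileage, shift_yards):
    """Shift a '<miles>.<yards>' mileage by a number of yards (hypothetical helper)."""
    miles, yards = f"{float(nr_mileage):.4f}".split(".")
    total_yards = int(miles) * YARDS_PER_MILE + int(yards) + shift_yards
    total_yards = round(total_yards)  # assumption: round to the nearest whole yard
    new_miles, new_yards = divmod(total_yards, YARDS_PER_MILE)
    return float(f"{new_miles}.{new_yards:04d}")


print(shift_mileage("0.0396", 220))  # 0.0616
print(shift_mileage("0.1700", 100))  # 1.004  (1800 yards = 1 mile + 40 yards)
```

The second call shows why the shift cannot be done by adding decimals directly: the yards part rolls over at 1760, not at 10000.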

pyrcs.utils.year_to_financial_year(date)
Convert calendar year of a given date to Network Rail financial year.
Parameters: date (datetime.datetime) – date
Returns: Network Rail financial year of the given date
Return type: int
Example:
    import datetime
    from pyrcs.utils import year_to_financial_year

    date = datetime.datetime.now()
    year_to_financial_year(date)  # e.g. 2020
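The mapping from a date to a financial year can be sketched as below. Two assumptions are made here and are not stated in the text above: that the financial year starts on 1 April, and that it is labelled by the calendar year in which it starts. The function name is hypothetical.

```python
import datetime


def financial_year(date):
    """Year in which the financial year containing `date` starts.

    Assumption: the financial year runs from 1 April to 31 March.
    """
    return date.year if date.month >= 4 else date.year - 1


print(financial_year(datetime.datetime(2020, 3, 31)))  # 2019
print(financial_year(datetime.datetime(2020, 4, 1)))   # 2020
```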
Parsers
parse_tr(header, trs) | Parse a list of parsed HTML <tr> elements.
parse_table(source[, parser]) | Parse HTML <tr> elements for creating a data frame.
parse_location_name(location_name) | Parse location name (and its associated note).
parse_date(str_date[, as_date_type]) | Parse a date.

pyrcs.utils.parse_tr(header, trs)
Parse a list of parsed HTML <tr> elements.
See also [PT-1].
Parameters:
- header (list) – list of column names of a requested table
- trs (bs4.ResultSet, i.e. a list of bs4.Tag) – contents under <tr> tags of a web page
Returns: list of lists, each comprising a row of the requested table
Return type: list
Example:
    import bs4
    import requests
    from pyrcs.utils import fake_requests_headers, parse_tr

    source = requests.get(
        'http://www.railwaycodes.org.uk/elrs/elra.shtm',
        headers=fake_requests_headers())
    parsed_text = bs4.BeautifulSoup(source.text, 'lxml')
    header = [x.text for x in parsed_text.find_all('th')]  # Column names
    trs = parsed_text.find_all('tr')

    parse_tr(header, trs)  # returns a list of lists

pyrcs.utils.parse_table(source, parser='lxml')
Parse HTML <tr> elements for creating a data frame.
Parameters:
- source (requests.Response) – response object returned by requesting the URL of a web page that contains a table
- parser (str) – 'lxml' (default), 'html5lib' or 'html.parser'
Returns:
- a list of lists each comprising a row of the requested table (see also parse_tr()) and
- a list of column names of the requested table
Return type: tuple
Examples:
    import requests
    from pyrcs.utils import fake_requests_headers, parse_table

    source = requests.get(
        'http://www.railwaycodes.org.uk/elrs/elra.shtm',
        headers=fake_requests_headers())

    parser = 'lxml'
    parse_table(source, parser)
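To show the shape of the (rows, header) result without making a network request, here is a minimal offline sketch. It deliberately swaps in the standard library's html.parser instead of bs4, and the class name, function name and sample table are all hypothetical; it illustrates the idea of collecting <th> cells as column names and <td> cells as rows, not how parse_table itself works.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect <th> text as the header and <td> text as data rows."""

    def __init__(self):
        super().__init__()
        self.header, self.rows = [], []
        self._row, self._cell = [], None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = (tag, [])

    def handle_data(self, data):
        if self._cell is not None:  # ignore text outside table cells
            self._cell[1].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            kind, parts = self._cell
            text = "".join(parts).strip()
            (self.header if kind == "th" else self._row).append(text)
            self._cell = None
        elif tag == "tr":
            if self._row:  # rows without <td> cells (e.g. the header row) are skipped
                self.rows.append(self._row)
            self._row = []


def parse_table_text(html_text):
    """Return (rows, header) for the table in `html_text` (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html_text)
    return parser.rows, parser.header


# Made-up sample data, for illustration only
sample = """
<table>
  <tr><th>ELR</th><th>Line name</th></tr>
  <tr><td>AAA</td><td>Example line A</td></tr>
  <tr><td>AAB</td><td>Example line B</td></tr>
</table>
"""
rows, header = parse_table_text(sample)
print(header)  # ['ELR', 'Line name']
print(rows)    # [['AAA', 'Example line A'], ['AAB', 'Example line B']]
```

A pair like this can be passed straight to pandas.DataFrame(rows, columns=header) to build the data frame mentioned in the description.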

pyrcs.utils.parse_location_name(location_name)
Parse location name (and its associated note).
Parameters: location_name (str, None) – location name (in raw data)
Returns: location name and, if any, note
Return type: tuple
Examples:
    from pyrcs.utils import parse_location_name

    location_dat = 'Abbey Wood'
    parse_location_name(location_dat)  # ('Abbey Wood', '')

    location_dat = None
    parse_location_name(location_dat)  # ('', '')

    location_dat = 'Abercynon (formerly Abercynon South)'
    parse_location_name(location_dat)  # ('Abercynon', 'formerly Abercynon South')

    location_dat = 'Allerton (reopened as Liverpool South Parkway)'
    parse_location_name(location_dat)  # ('Allerton', 'reopened as Liverpool South Parkway')

    location_dat = 'Ashford International [domestic portion]'
    parse_location_name(location_dat)  # ('Ashford International', 'domestic portion')
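The examples above all follow one pattern: a name optionally followed by a note in round or square brackets. A minimal sketch of that pattern with a regular expression is given below. The function name is hypothetical, and the real function may handle more cases in the raw data than this simplified version.

```python
import re

# A name, then a trailing note wrapped in (...) or [...]
_NAME_NOTE = re.compile(r"^(.*?)\s*[\(\[](.+)[\)\]]\s*$")


def split_location_note(location_name):
    """Split 'Name (note)' or 'Name [note]' into (name, note) (hypothetical helper)."""
    if not location_name:  # None or ''
        return "", ""
    match = _NAME_NOTE.match(location_name)
    if match is None:
        return location_name.strip(), ""
    return match.group(1).strip(), match.group(2).strip()


print(split_location_note("Abercynon (formerly Abercynon South)"))
# ('Abercynon', 'formerly Abercynon South')
```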

pyrcs.utils.parse_date(str_date, as_date_type=False)
Parse a date.
Parameters:
- str_date (str) – string-type date
- as_date_type (bool) – whether to return the date as datetime.date, defaults to False
Returns: parsed date as a string or datetime.date
Return type: str, datetime.date
Examples:
    from pyrcs.utils import parse_date

    str_date = '2020-01-01'
    as_date_type = True
    parse_date(str_date, as_date_type)  # datetime.date(2020, 1, 1)
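The effect of as_date_type can be illustrated with a small standard-library sketch. It only accepts the ‘<year>-<month>-<day>’ format used in the example above; which other formats parse_date accepts is not stated here, so the function below (a hypothetical name) should be read as an illustration of the two return types only.

```python
import datetime


def parse_iso_date(str_date, as_date_type=False):
    """Parse 'YYYY-MM-DD' and return a datetime.date or an ISO-format string."""
    parsed = datetime.datetime.strptime(str_date, "%Y-%m-%d").date()
    return parsed if as_date_type else parsed.isoformat()


print(parse_iso_date("2020-01-01"))                            # 2020-01-01
print(repr(parse_iso_date("2020-01-01", as_date_type=True)))   # datetime.date(2020, 1, 1)
```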
Get useful information
fake_requests_headers([randomized]) | Make fake HTTP headers for requests.get.
get_last_updated_date(url[, parsed, …]) | Get the date of the last update.
get_catalogue(page_url[, update, …]) | Get the catalogue for a class.
get_category_menu(menu_url[, update, …]) | Get a menu of the available classes.
get_station_data_catalogue(source_url, …) | Get catalogue of railway station data.
get_track_diagrams_items(source_url, source_key) | Get catalogue of track diagrams.

pyrcs.utils.fake_requests_headers(randomized=False)
Make fake HTTP headers for requests.get.
Parameters: randomized (bool) – whether to go for a random agent, defaults to False
Returns: fake HTTP headers
Return type: dict
Examples:
    >>> from pyhelpers.ops import fake_requests_headers

    >>> fake_headers_ = fake_requests_headers()
    >>> print(fake_headers_)
    {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ch...

    >>> fake_headers_ = fake_requests_headers(randomized=True)
    >>> print(fake_headers_)
    {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML ...
Note: The above fake_headers_ may be different every time we run the examples.

pyrcs.utils.get_last_updated_date(url, parsed=True, as_date_type=False)
Get the date of the last update.
Parameters:
- url (str) – URL of a requested web page
- parsed (bool) – whether to reformat the date, defaults to True
- as_date_type (bool) – whether to return the date as datetime.date, defaults to False
Returns: date when the specified web page was last updated
Return type: str, datetime.date, None
Examples:
    from pyrcs.utils import get_last_updated_date

    parsed = True

    url = 'http://www.railwaycodes.org.uk/crs/CRSa.shtm'

    date_type = False
    get_last_updated_date(url, parsed, date_type)  # '<year>-<month>-<day>'

    date_type = True
    get_last_updated_date(url, parsed, date_type)  # datetime.date(<year>, <month>, <day>)

    url = 'http://www.railwaycodes.org.uk/linedatamenu.shtm'
    get_last_updated_date(url, parsed, date_type)  # None

pyrcs.utils.get_catalogue(page_url, update=False, confirmation_required=True, json_it=True, verbose=False)
Get the catalogue for a class.
Parameters:
- page_url (str) – URL of the main page of a code category
- update (bool) – whether to check for updates and proceed to update the package data, defaults to False
- confirmation_required (bool) – whether to prompt a message for confirmation to proceed, defaults to True
- json_it (bool) – whether to save the catalogue as a .json file, defaults to True
- verbose (bool) – whether to print relevant information in console as the function runs, defaults to False
Returns: catalogue in the form {‘<title>’: ‘<URL>’}
Return type: dict
Examples:
    from pyrcs.utils import get_catalogue

    update = False
    verbose = True

    # verbose is passed by keyword: the fourth positional parameter is json_it
    page_url = 'http://www.railwaycodes.org.uk/elrs/elr0.shtm'
    confirmation_required = True
    catalogue = get_catalogue(page_url, update, confirmation_required, verbose=verbose)

    page_url = 'http://www.railwaycodes.org.uk/linedatamenu.shtm'
    confirmation_required = False
    catalogue = get_catalogue(page_url, update, confirmation_required, verbose=verbose)

pyrcs.utils.get_category_menu(menu_url, update=False, confirmation_required=True, json_it=True, verbose=False)
Get a menu of the available classes.
Parameters:
- menu_url (str) – URL of the menu page
- update (bool) – whether to check for updates and proceed to update the package data, defaults to False
- confirmation_required (bool) – whether to prompt a message for confirmation to proceed, defaults to True
- json_it (bool) – whether to save the catalogue as a .json file, defaults to True
- verbose (bool) – whether to print relevant information in console as the function runs, defaults to False
Returns: a menu of the available classes in the form {‘<category name>’: {‘<title>’: ‘<URL>’}}
Return type: dict
Example:
    from pyrcs.utils import get_category_menu

    update = False
    confirmation_required = True
    verbose = True

    menu_url = 'http://www.railwaycodes.org.uk/linedatamenu.shtm'
    cls_menu = get_category_menu(
        menu_url, update=update, confirmation_required=confirmation_required,
        verbose=verbose)

    print(cls_menu)  # {'<category name>': {'<title>': '<URL>'}}

pyrcs.utils.get_station_data_catalogue(source_url, source_key, update=False)
Get catalogue of railway station data.
Parameters:
- source_url (str) – URL of the source web page
- source_key (str) – key of the returned catalogue (which is a dictionary)
- update (bool) – whether to check for updates and proceed to update the package data, defaults to False
Returns: catalogue of railway station data
Return type: dict

pyrcs.utils.get_track_diagrams_items(source_url, source_key, update=False)
Get catalogue of track diagrams.
Parameters:
- source_url (str) – URL of the source web page
- source_key (str) – key of the returned catalogue (which is a dictionary)
- update (bool) – whether to check for updates and proceed to update the package data, defaults to False
Returns: catalogue of track diagrams
Return type: dict
Rectification of location names
fetch_location_names_repl_dict([k, regex, …]) | Create a dictionary for rectifying location names.
update_location_name_repl_dict(new_items, regex) | Update the location-name replacement dictionary in the package data.

pyrcs.utils.fetch_location_names_repl_dict(k=None, regex=False, as_dataframe=False)
Create a dictionary for rectifying location names.
Parameters:
- k (str, int, float, bool, None) – key of the created dictionary, defaults to None
- regex (bool) – whether to create a dictionary for replacement based on regular expressions, defaults to False
- as_dataframe (bool) – whether to return the created dictionary as a pandas.DataFrame, defaults to False
Returns: dictionary for rectifying location names
Return type: dict, pandas.DataFrame
Examples:
    from pyrcs.utils import fetch_location_names_repl_dict

    k = None

    regex = False
    as_dataframe = True
    fetch_location_names_repl_dict(k, regex, as_dataframe)

    regex = True
    as_dataframe = False
    fetch_location_names_repl_dict(k, regex, as_dataframe)

pyrcs.utils.update_location_name_repl_dict(new_items, regex, verbose=False)
Update the location-name replacement dictionary in the package data.
Parameters:
- new_items (dict) – new items to replace
- regex (bool) – whether this update is for the regular-expression dictionary
- verbose (bool) – whether to print relevant information in console as the function runs, defaults to False
Example:
    from pyrcs.utils import update_location_name_repl_dict

    verbose = True

    new_items = {}
    regex = False
    update_location_name_repl_dict(new_items, regex, verbose)
Fixers
fix_num_stanox(stanox_code) | Fix ‘STANOX’ if it is loaded as numbers.

pyrcs.utils.fix_num_stanox(stanox_code)
Fix ‘STANOX’ if it is loaded as numbers.
Parameters: stanox_code (str, int) – STANOX code
Returns: standard STANOX code
Return type: str
Examples:
    from pyrcs.utils import fix_num_stanox

    stanox_code = 65630
    fix_num_stanox(stanox_code)  # '65630'

    stanox_code = 2071
    fix_num_stanox(stanox_code)  # '02071'
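The fix shown in the examples amounts to restoring the leading zeros that are lost when a five-digit code is read as a number. A minimal sketch of that idea, using a hypothetical function name rather than the library's own code:

```python
def pad_stanox(stanox_code):
    """Return a STANOX code as a five-character, zero-padded string."""
    return str(stanox_code).strip().zfill(5)


print(pad_stanox(65630))  # 65630
print(pad_stanox(2071))   # 02071
```

This is the usual situation after loading a spreadsheet or CSV file in which the code column was inferred as integers.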
Misc
is_str_float(str_val) | Check if a string-type variable can express a float value.

pyrcs.utils.is_str_float(str_val)
Check if a string-type variable can express a float value.
Parameters: str_val (str) – a string-type variable
Returns: whether str_val can express a float value
Return type: bool
Examples:
    from pyrcs.utils import is_str_float

    str_val = ''
    is_str_float(str_val)  # False

    str_val = 'a'
    is_str_float(str_val)  # False

    str_val = '1'
    is_str_float(str_val)  # True

    str_val = '1.1'
    is_str_float(str_val)  # True
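A check of this kind is commonly written by attempting the conversion and catching the failure. The sketch below uses a hypothetical function name and is not necessarily how is_str_float is implemented; note that with this approach strings such as 'nan' and 'inf' also count as floats.

```python
def can_be_float(str_val):
    """Return True if float(str_val) succeeds, otherwise False."""
    try:
        float(str_val)
    except (TypeError, ValueError):
        return False
    return True


print(can_be_float(""))     # False
print(can_be_float("1.1"))  # True
```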