utils

A module of helper functions.

Source homepage

homepage_url() Specify the homepage URL of the data source.
pyrcs.utils.homepage_url()[source]

Specify the homepage URL of the data source.

Returns:URL of the data source homepage
Return type:str

Directory

cd_dat(*sub_dir[, dat_dir, mkdir]) Change directory to dat_dir/ and sub-directories within a package.
pyrcs.utils.cd_dat(*sub_dir, dat_dir='dat', mkdir=False, **kwargs)[source]

Change directory to dat_dir/ and sub-directories within a package.

Parameters:
  • sub_dir (str) – name of a directory, or names of directories (and/or a filename)
  • dat_dir (str) – name of a directory to store data, defaults to "dat"
  • mkdir (bool) – whether to create a directory, defaults to False
  • kwargs – optional parameters of os.makedirs, e.g. mode=0o777
Returns:

full path to a directory (or a file) under dat_dir

Return type:

str

Example:

from pyrcs.utils import cd_dat

dat_dir = "dat"
mkdir = False

cd_dat("line-data", dat_dir=dat_dir, mkdir=mkdir)
# "\dat\line-data"
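
The path-building behaviour described above can be approximated with the standard library. This is a minimal sketch, not the package's implementation: sketch_cd_dat, its base_dir parameter and the filename 'elrs.json' are hypothetical, and it assumes that a final component with a file extension is a filename whose parent directory should be created.

```python
import os
import tempfile


def sketch_cd_dat(*sub_dir, base_dir='.', dat_dir='dat', mkdir=False, **kwargs):
    """Hypothetical stand-in for cd_dat; base_dir is an assumed parameter."""
    path = os.path.join(base_dir, dat_dir, *sub_dir)
    if mkdir:
        # If the last component looks like a filename, create only its parent
        _, ext = os.path.splitext(path)
        dir_to_make = os.path.dirname(path) if ext else path
        # Extra keyword arguments (e.g. mode=0o777) are passed to os.makedirs
        os.makedirs(dir_to_make, exist_ok=True, **kwargs)
    return path


with tempfile.TemporaryDirectory() as tmp:
    path = sketch_cd_dat('line-data', 'elrs.json', base_dir=tmp, mkdir=True)
    print(os.path.isdir(os.path.dirname(path)))  # True
    print(os.path.exists(path))  # False: only the directory is created
```
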

Converters

mile_chain_to_nr_mileage(miles_chains) Convert mileage data in the form ‘<miles>.<chains>’ to Network Rail mileage.
nr_mileage_to_mile_chain(str_mileage) Convert Network Rail mileage to the form ‘<miles>.<chains>’.
nr_mileage_str_to_num(str_mileage) Convert Network Rail mileage from a string to a number.
nr_mileage_num_to_str(num_mileage) Convert Network Rail mileage from a number to a string.
nr_mileage_to_yards(nr_mileage) Convert Network Rail mileages to yards.
yards_to_nr_mileage(yards) Convert yards to Network Rail mileages.
shift_num_nr_mileage(nr_mileage, shift_yards) Shift Network Rail mileage by given yards.
year_to_financial_year(date) Convert calendar year of a given date to Network Rail financial year.
pyrcs.utils.mile_chain_to_nr_mileage(miles_chains)[source]

Convert mileage data in the form ‘<miles>.<chains>’ to Network Rail mileage.

Parameters:miles_chains (str, numpy.nan, None) – mileage data presented in the form ‘<miles>.<chains>’
Returns:Network Rail mileage in the form ‘<miles>.<yards>’
Return type:str

Examples:

from pyrcs.utils import mile_chain_to_nr_mileage

miles_chains = '0.18'  # AAM 0.18 Tewkesbury Junction with ANZ (84.62)
mile_chain_to_nr_mileage(miles_chains)  # '0.0396'

miles_chains = None  # or np.nan, or ''
mile_chain_to_nr_mileage(miles_chains)  # ''
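
The arithmetic behind this conversion is that one chain equals 22 yards (80 chains make a mile of 1760 yards). The following is a minimal sketch of that arithmetic, not the package's code: sketch_mile_chain_to_nr_mileage is a hypothetical name, and it assumes the part after the dot is a whole number of chains.

```python
YARDS_PER_CHAIN = 22  # 1 chain = 22 yards; 80 chains = 1760 yards = 1 mile


def sketch_mile_chain_to_nr_mileage(miles_chains):
    """Hypothetical re-implementation, for illustration only."""
    if not isinstance(miles_chains, str) or miles_chains == '':
        return ''  # None, NaN and '' all map to an empty string
    miles, chains = miles_chains.split('.')
    yards = int(chains) * YARDS_PER_CHAIN
    # Yards are written as four digits after the dot
    return f'{int(miles)}.{yards:04d}'


print(sketch_mile_chain_to_nr_mileage('0.18'))   # 0.0396
print(sketch_mile_chain_to_nr_mileage('84.62'))  # 84.1364
```
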
pyrcs.utils.nr_mileage_to_mile_chain(str_mileage)[source]

Convert Network Rail mileage to the form ‘<miles>.<chains>’.

Parameters:str_mileage (str, numpy.nan, None) – Network Rail mileage data presented in the form ‘<miles>.<yards>’
Returns:‘<miles>.<chains>’
Return type:str

Examples:

from pyrcs.utils import nr_mileage_to_mile_chain

str_mileage = '0.0396'
nr_mileage_to_mile_chain(str_mileage)  # '0.18'

str_mileage = None  # or np.nan, or ''
nr_mileage_to_mile_chain(str_mileage)  # ''
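
The reverse conversion divides the yards by 22. A minimal sketch under stated assumptions: sketch_nr_mileage_to_mile_chain is a hypothetical name, and leftover yards that do not make a whole chain are dropped here, which may differ from how the package rounds.

```python
YARDS_PER_CHAIN = 22  # 1 chain = 22 yards


def sketch_nr_mileage_to_mile_chain(str_mileage):
    """Hypothetical re-implementation, for illustration only."""
    if not isinstance(str_mileage, str) or str_mileage == '':
        return ''  # None, NaN and '' all map to an empty string
    miles, yards = str_mileage.split('.')
    # Assumption: keep whole chains only; any remainder in yards is dropped
    chains = int(yards) // YARDS_PER_CHAIN
    return f'{int(miles)}.{chains:02d}'


print(sketch_nr_mileage_to_mile_chain('0.0396'))   # 0.18
print(sketch_nr_mileage_to_mile_chain('84.1364'))  # 84.62
```
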
pyrcs.utils.nr_mileage_str_to_num(str_mileage)[source]

Convert Network Rail mileage from a string to a number.

Parameters:str_mileage (str) – string-type Network Rail mileage in the form ‘<miles>.<yards>’
Returns:numerical-type Network Rail mileage
Return type:float

Examples:

from pyrcs.utils import nr_mileage_str_to_num

str_mileage = '0.0396'
nr_mileage_str_to_num(str_mileage)  # 0.0396

str_mileage = ''
nr_mileage_str_to_num(str_mileage)  # nan
pyrcs.utils.nr_mileage_num_to_str(num_mileage)[source]

Convert Network Rail mileage from a number to a string.

Parameters:num_mileage (float) – numerical-type Network Rail mileage
Returns:string-type Network Rail mileage in the form ‘<miles>.<yards>’
Return type:str

Examples:

import numpy as np
from pyrcs.utils import nr_mileage_num_to_str

num_mileage = 0.0396
nr_mileage_num_to_str(num_mileage)  # '0.0396'

num_mileage = np.nan
nr_mileage_num_to_str(num_mileage)  # ''
pyrcs.utils.nr_mileage_to_yards(nr_mileage)[source]

Convert Network Rail mileages to yards.

Parameters:nr_mileage (float, str) – Network Rail mileage
Returns:yards
Return type:int

Examples:

from pyrcs.utils import nr_mileage_to_yards

nr_mileage = '0.0396'
nr_mileage_to_yards(nr_mileage)  # 396

nr_mileage = 0.0396
nr_mileage_to_yards(nr_mileage)  # 396
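
In the ‘<miles>.<yards>’ form the digits after the dot are a count of yards, not a decimal fraction of a mile, so the total is miles × 1760 plus those yards. A minimal sketch of that reading, with the hypothetical name sketch_nr_mileage_to_yards and the assumption that the yards part has at most four digits:

```python
YARDS_PER_MILE = 1760


def sketch_nr_mileage_to_yards(nr_mileage):
    """Hypothetical re-implementation, for illustration only."""
    # Normalise str/float/int input to '<miles>.<yards>' with four yard digits
    miles, yards = f'{float(nr_mileage):.4f}'.split('.')
    return int(miles) * YARDS_PER_MILE + int(yards)


print(sketch_nr_mileage_to_yards('0.0396'))  # 396
print(sketch_nr_mileage_to_yards(0.0396))    # 396
print(sketch_nr_mileage_to_yards('1.0880'))  # 2640
```
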
pyrcs.utils.yards_to_nr_mileage(yards)[source]

Convert yards to Network Rail mileages.

Parameters:yards (int, float, numpy.nan, None) – yards
Returns:Network Rail mileage in the form ‘<miles>.<yards>’
Return type:str

Examples:

from pyrcs.utils import yards_to_nr_mileage

yards = 396
yards_to_nr_mileage(yards)  # '0.0396'

yards = 396.0
yards_to_nr_mileage(yards)  # '0.0396'

yards = None
yards_to_nr_mileage(yards)  # ''
pyrcs.utils.shift_num_nr_mileage(nr_mileage, shift_yards)[source]

Shift Network Rail mileage by given yards.

Parameters:
  • nr_mileage (float, int, str) – Network Rail mileage
  • shift_yards (int, float) – yards by which the given nr_mileage is shifted
Returns:

shifted numerical Network Rail mileage

Return type:

float

Examples:

from pyrcs.utils import shift_num_nr_mileage

nr_mileage = '0.0396'  # or 0.0396
shift_yards = 220
shift_num_nr_mileage(nr_mileage, shift_yards)  # 0.0616

nr_mileage = '0.0396'
shift_yards = 220.99
shift_num_nr_mileage(nr_mileage, shift_yards)  # 0.0617

nr_mileage = 10
shift_yards = 220
shift_num_nr_mileage(nr_mileage, shift_yards)  # 10.022
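
One way to reproduce the results above is to convert the mileage to yards, add the shift, and convert back. The sketch below assumes that approach and rounds the total to the nearest whole yard; the helper names are hypothetical and the package may handle rounding differently.

```python
YARDS_PER_MILE = 1760


def to_yards(nr_mileage):
    """Hypothetical helper: '<miles>.<yards>' (str or number) to total yards."""
    miles, yards = f'{float(nr_mileage):.4f}'.split('.')
    return int(miles) * YARDS_PER_MILE + int(yards)


def to_num_mileage(yards):
    """Hypothetical helper: total yards to numerical '<miles>.<yards>'."""
    miles, rem = divmod(int(round(yards)), YARDS_PER_MILE)
    return float(f'{miles}.{rem:04d}')


def sketch_shift_num_nr_mileage(nr_mileage, shift_yards):
    # Shift in yards, then express the result as a mileage again
    return to_num_mileage(to_yards(nr_mileage) + shift_yards)


print(sketch_shift_num_nr_mileage('0.0396', 220))     # 0.0616
print(sketch_shift_num_nr_mileage('0.0396', 220.99))  # 0.0617
print(sketch_shift_num_nr_mileage(10, 220))           # 10.022
```

Working in yards keeps the carry into the next mile correct: shifting '0.1700' by 100 yards gives 1800 yards, that is 1 mile and 40 yards.
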
pyrcs.utils.year_to_financial_year(date)[source]

Convert calendar year of a given date to Network Rail financial year.

Parameters:date (datetime.datetime) – date
Returns:Network Rail financial year of the given date
Return type:int

Example:

import datetime

from pyrcs.utils import year_to_financial_year

date = datetime.datetime.now()

year_to_financial_year(date)  # 2020
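
As an illustration of the idea, the sketch below assumes that the financial year starts on 1 April and is labelled by the calendar year in which it starts. That convention and the name sketch_year_to_financial_year are assumptions, not taken from the package.

```python
import datetime


def sketch_year_to_financial_year(date):
    """Hypothetical re-implementation, for illustration only."""
    # Assumption: dates from January to March belong to the financial year
    # that started in the previous calendar year
    return date.year if date.month >= 4 else date.year - 1


print(sketch_year_to_financial_year(datetime.datetime(2020, 3, 31)))  # 2019
print(sketch_year_to_financial_year(datetime.datetime(2020, 4, 1)))   # 2020
```
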

Parsers

parse_tr(header, trs) Parse a list of parsed HTML <tr> elements.
parse_table(source[, parser]) Parse HTML <tr> elements for creating a data frame.
parse_location_name(location_name) Parse location name (and its associated note).
parse_date(str_date[, as_date_type]) Parse a date.
pyrcs.utils.parse_tr(header, trs)[source]

Parse a list of parsed HTML <tr> elements.

See also [PT-1].

Parameters:
  • header (list) – list of column names of a requested table
  • trs (bs4.ResultSet - list of bs4.Tag) – contents under <tr> tags of a web page
Returns:

list of lists with each comprising a row of the requested table

Return type:

list

Example:

import bs4
import requests
from pyrcs.utils import fake_requests_headers, parse_tr

source = requests.get(
    'http://www.railwaycodes.org.uk/elrs/elra.shtm',
    headers=fake_requests_headers())
parsed_text = bs4.BeautifulSoup(source.text, 'lxml')
header = [x.text for x in parsed_text.find_all('th')]  # Column names
trs = parsed_text.find_all('tr')

parse_tr(header, trs)  # returns a list of lists
pyrcs.utils.parse_table(source, parser='lxml')[source]

Parse HTML <tr> elements for creating a data frame.

Parameters:
  • source (requests.Response) – response object returned by requesting the URL of a table
  • parser (str) – 'lxml' (default), 'html5lib' or 'html.parser'
Returns:

  • a list of lists each comprising a row of the requested table (see also parse_tr()) and
  • a list of column names of the requested table

Return type:

tuple

Examples:

import bs4
import requests
from pyrcs.utils import fake_requests_headers, parse_table

source = requests.get(
    'http://www.railwaycodes.org.uk/elrs/elra.shtm',
    headers=fake_requests_headers())
parser = 'lxml'

parse_table(source, parser)
pyrcs.utils.parse_location_name(location_name)[source]

Parse location name (and its associated note).

Parameters:location_name (str, None) – location name (in raw data)
Returns:location name and, if any, note
Return type:tuple

Examples:

from pyrcs.utils import parse_location_name

location_dat = 'Abbey Wood'
parse_location_name(location_dat)
# ('Abbey Wood', '')

location_dat = None
parse_location_name(location_dat)
# ('', '')

location_dat = 'Abercynon (formerly Abercynon South)'
parse_location_name(location_dat)
# ('Abercynon', 'formerly Abercynon South')

location_dat = 'Allerton (reopened as Liverpool South Parkway)'
parse_location_name(location_dat)
# ('Allerton', 'reopened as Liverpool South Parkway')

location_dat = 'Ashford International [domestic portion]'
parse_location_name(location_dat)
# ('Ashford International', 'domestic portion')
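
The examples above suggest that a trailing note in round or square brackets is split from the name. A regular expression can express that rule; the sketch below is a hypothetical approximation (sketch_parse_location_name and its pattern are not the package's) that covers only these simple cases.

```python
import re

# A name, optional spaces, then a note wrapped in (...) or [...] at the end
_NOTE_PATTERN = re.compile(r'^(?P<name>.*?)\s*[\(\[](?P<note>.*)[\)\]]\s*$')


def sketch_parse_location_name(location_name):
    """Hypothetical re-implementation, for illustration only."""
    if not location_name:
        return '', ''  # None or '' carries neither a name nor a note
    match = _NOTE_PATTERN.match(location_name)
    if match is None:
        return location_name.strip(), ''
    return match.group('name'), match.group('note')


print(sketch_parse_location_name('Abbey Wood'))
# ('Abbey Wood', '')
print(sketch_parse_location_name('Abercynon (formerly Abercynon South)'))
# ('Abercynon', 'formerly Abercynon South')
print(sketch_parse_location_name('Ashford International [domestic portion]'))
# ('Ashford International', 'domestic portion')
```
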
pyrcs.utils.parse_date(str_date, as_date_type=False)[source]

Parse a date.

Parameters:
  • str_date (str) – string-type date
  • as_date_type (bool) – whether to return the date as datetime.date, defaults to False
Returns:

parsed date as a string or datetime.date

Return type:

str, datetime.date

Examples:

from pyrcs.utils import parse_date

str_date = '2020-01-01'

as_date_type = True
parse_date(str_date, as_date_type)  # datetime.date(2020, 1, 1)

Get useful information

fake_requests_headers([randomized]) Make fake HTTP headers for requests.get.
get_last_updated_date(url[, parsed, …]) Get the date when a web page was last updated.
get_catalogue(page_url[, update, …]) Get the catalogue for a class.
get_category_menu(menu_url[, update, …]) Get a menu of the available classes.
get_station_data_catalogue(source_url, …) Get catalogue of railway station data.
get_track_diagrams_items(source_url, source_key) Get catalogue of track diagrams.
pyrcs.utils.fake_requests_headers(randomized=False)[source]

Make fake HTTP headers for requests.get.

Parameters:randomized (bool) – whether to go for a random agent, defaults to False
Returns:fake HTTP headers
Return type:dict

Examples:

>>> from pyrcs.utils import fake_requests_headers

>>> fake_headers_ = fake_requests_headers()
>>> print(fake_headers_)
{'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ch...

>>> fake_headers_ = fake_requests_headers(randomized=True)
>>> print(fake_headers_)
{'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML ...

Note

The above fake_headers_ may be different every time we run the examples.

pyrcs.utils.get_last_updated_date(url, parsed=True, as_date_type=False)[source]

Get the date when a web page was last updated.

Parameters:
  • url (str) – URL link of a requested web page
  • parsed (bool) – whether to reformat the date, defaults to True
  • as_date_type (bool) – whether to return the date as datetime.date, defaults to False
Returns:

date of when the specified web page was last updated

Return type:

str, datetime.date, None

Examples:

from pyrcs.utils import get_last_updated_date

parsed = True

url = 'http://www.railwaycodes.org.uk/crs/CRSa.shtm'

date_type = False
get_last_updated_date(url, parsed, date_type)
# '<year>-<month>-<day>'

date_type = True
get_last_updated_date(url, parsed, date_type)
# datetime.date(<year>, <month>, <day>)

url = 'http://www.railwaycodes.org.uk/linedatamenu.shtm'
get_last_updated_date(url, parsed, date_type)
# None
pyrcs.utils.get_catalogue(page_url, update=False, confirmation_required=True, json_it=True, verbose=False)[source]

Get the catalogue for a class.

Parameters:
  • page_url (str) – URL of the main page of a code category
  • update (bool) – whether to check on update and proceed to update the package data, defaults to False
  • confirmation_required (bool) – whether to prompt a message for confirmation to proceed, defaults to True
  • json_it (bool) – whether to save the catalogue as a .json file, defaults to True
  • verbose (bool) – whether to print relevant information in console as the function runs, defaults to False
Returns:

catalogue in the form {‘<title>’: ‘<URL>’}

Return type:

dict

Examples:

from pyrcs.utils import get_catalogue

update = False
verbose = True

page_url = 'http://www.railwaycodes.org.uk/elrs/elr0.shtm'
confirmation_required = True
catalogue = get_catalogue(page_url, update, confirmation_required, verbose=verbose)

page_url = 'http://www.railwaycodes.org.uk/linedatamenu.shtm'
confirmation_required = False
catalogue = get_catalogue(page_url, update, confirmation_required, verbose=verbose)
pyrcs.utils.get_category_menu(menu_url, update=False, confirmation_required=True, json_it=True, verbose=False)[source]

Get a menu of the available classes.

Parameters:
  • menu_url (str) – URL of the menu page
  • update (bool) – whether to check on update and proceed to update the package data, defaults to False
  • confirmation_required (bool) – whether to prompt a message for confirmation to proceed, defaults to True
  • json_it (bool) – whether to save the catalogue as a .json file, defaults to True
  • verbose (bool) – whether to print relevant information in console as the function runs, defaults to False
Returns:

menu of the available classes in the form {‘<category name>’: {‘<title>’: ‘<URL>’}}

Return type:

dict

Example:

from pyrcs.utils import get_category_menu

update = False
confirmation_required = True
verbose = True

menu_url = 'http://www.railwaycodes.org.uk/linedatamenu.shtm'
cls_menu = get_category_menu(menu_url, update, confirmation_required, verbose=verbose)

print(cls_menu)
# {'<category name>': {'<title>': '<URL>'}}
pyrcs.utils.get_station_data_catalogue(source_url, source_key, update=False)[source]

Get catalogue of railway station data.

Parameters:
  • source_url (str) – URL to the source web page
  • source_key (str) – key of the returned catalogue (which is a dictionary)
  • update (bool) – whether to check on update and proceed to update the package data, defaults to False
Returns:

catalogue of railway station data

Return type:

dict

pyrcs.utils.get_track_diagrams_items(source_url, source_key, update=False)[source]

Get catalogue of track diagrams.

Parameters:
  • source_url (str) – URL to the source web page
  • source_key (str) – key of the returned catalogue (which is a dictionary)
  • update (bool) – whether to check on update and proceed to update the package data, defaults to False
Returns:

catalogue of track diagrams

Return type:

dict


Rectification of location names

fetch_location_names_repl_dict([k, regex, …]) Create a dictionary for rectifying location names.
update_location_name_repl_dict(new_items, regex) Update the location-name replacement dictionary in the package data.
pyrcs.utils.fetch_location_names_repl_dict(k=None, regex=False, as_dataframe=False)[source]

Create a dictionary for rectifying location names.

Parameters:
  • k (str, int, float, bool, None) – key of the created dictionary, defaults to None
  • regex (bool) – whether to create a dictionary for replacement based on regular expressions, defaults to False
  • as_dataframe (bool) – whether to return the created dictionary as a pandas.DataFrame, defaults to False
Returns:

dictionary for rectifying location names

Return type:

dict, pandas.DataFrame

Examples:

from pyrcs.utils import fetch_location_names_repl_dict

k = None
regex = False
as_dataframe = True
fetch_location_names_repl_dict(k, regex, as_dataframe)

regex = True
as_dataframe = False
fetch_location_names_repl_dict(k, regex, as_dataframe)
pyrcs.utils.update_location_name_repl_dict(new_items, regex, verbose=False)[source]

Update the location-name replacement dictionary in the package data.

Parameters:
  • new_items (dict) – new items to replace
  • regex (bool) – whether this update is for regular-expression dictionary
  • verbose (bool) – whether to print relevant information in console as the function runs, defaults to False

Example:

from pyrcs.utils import update_location_name_repl_dict

verbose = True

new_items = {}
regex = False
update_location_name_repl_dict(new_items, regex, verbose)


Fixers

fix_num_stanox(stanox_code) Fix ‘STANOX’ if it is loaded as numbers.
pyrcs.utils.fix_num_stanox(stanox_code)[source]

Fix ‘STANOX’ if it is loaded as numbers.

Parameters:stanox_code (str, int) – STANOX code
Returns:standard STANOX code
Return type:str

Examples:

from pyrcs.utils import fix_num_stanox

stanox_code = 65630
fix_num_stanox(stanox_code)  # '65630'

stanox_code = 2071
fix_num_stanox(stanox_code)  # '02071'
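
The examples above are consistent with left-padding the code with zeros to five characters, which restores a leading zero lost when the code was read as a number. A minimal sketch of that idea, with the hypothetical name sketch_fix_num_stanox and the assumption that a STANOX code is always five characters long:

```python
def sketch_fix_num_stanox(stanox_code):
    """Hypothetical re-implementation, for illustration only."""
    # Assumption: a standard STANOX code has five characters
    return str(stanox_code).zfill(5)


print(sketch_fix_num_stanox(65630))  # 65630
print(sketch_fix_num_stanox(2071))   # 02071
```
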

Misc

is_str_float(str_val) Check if a string-type variable can express a float value.
pyrcs.utils.is_str_float(str_val)[source]

Check if a string-type variable can express a float value.

Parameters:str_val (str) – a string-type variable
Returns:whether str_val can express a float value
Return type:bool

Examples:

from pyrcs.utils import is_str_float

str_val = ''
is_str_float(str_val)  # False

str_val = 'a'
is_str_float(str_val)  # False

str_val = '1'
is_str_float(str_val)  # True

str_val = '1.1'
is_str_float(str_val)  # True
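
A common way to implement such a check is to attempt the conversion and catch the failure. The sketch below takes that approach; sketch_is_str_float is a hypothetical name, and the package may differ in detail.

```python
def sketch_is_str_float(str_val):
    """Hypothetical re-implementation, for illustration only."""
    try:
        float(str_val)  # raises ValueError for strings such as '' or 'a'
    except ValueError:
        return False
    return True


print(sketch_is_str_float(''))     # False
print(sketch_is_str_float('a'))    # False
print(sketch_is_str_float('1'))    # True
print(sketch_is_str_float('1.1'))  # True
```

With this approach, strings such as 'nan' and 'inf' also count as floats, because float() accepts them.
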