general module

File: general.py
Author: Rolf Verberg <rolfverberg AT gmail dot com>
Description: A collection of general modules

almost_equal(a, b, sig_figs)

Check if equal to within a certain number of significant digits.
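One common way to compare two numbers to a given number of significant digits is a relative tolerance of 10^(-sig_figs). The sketch below illustrates that idea only; the name `almost_equal_sketch` is hypothetical, and the module's own comparison rule may differ (for example, it could round both values first).

```python
import math


def almost_equal_sketch(a, b, sig_figs):
    """Return True if a and b agree to roughly sig_figs significant digits.

    Assumption: agreement is measured as a relative difference of at most
    10**(-sig_figs), scaled by the larger of |a| and |b|.
    """
    return math.isclose(a, b, rel_tol=10.0 ** (-sig_figs), abs_tol=0.0)


print(almost_equal_sketch(1.2345, 1.2346, 3))  # relative difference ~1e-4
print(almost_equal_sketch(1.2345, 1.2346, 6))
```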

assert_no_duplicate_attr_in_list_of_objs(_list, attr, raise_error=False)

Assert that there are no duplicate attributes in a list of objects.

assert_no_duplicate_key_in_list_of_dicts(_list, key, raise_error=False)

Assert that there are no duplicate keys in a list of dictionaries.
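A minimal sketch of such a check, assuming that "duplicate key" means two dictionaries holding the same value under `key`. The helper names are hypothetical, and the choice of `ValueError` is an assumption rather than the module's documented behavior.

```python
from collections import Counter


def find_duplicate_values(dicts, key):
    """Return the values of `key` that occur in more than one dictionary."""
    counts = Counter(d[key] for d in dicts if key in d)
    return [value for value, n in counts.items() if n > 1]


def check_no_duplicate_key(dicts, key, raise_error=False):
    """Return True if no value of `key` is repeated (values must be hashable)."""
    duplicates = find_duplicate_values(dicts, key)
    if duplicates and raise_error:
        # Assumption: the exception type used by the real module is unknown.
        raise ValueError(f"Duplicate values for {key!r}: {duplicates}")
    return not duplicates
```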

assert_no_duplicates_in_list_of_dicts(_list, raise_error=False)

Assert that there are no duplicates in a list of dictionaries.

baseline_arPLS(y, mask=None, w=None, tol=1e-08, lam=1000000.0, max_iter=20, full_output=False)

Returns the smoothed baseline estimate of a spectrum.

Based on S.-J. Baek, A. Park, Y.-J. Ahn, and J. Choo, “Baseline correction using asymmetrically reweighted penalized least squares smoothing”, Analyst, 2015, 140, 250-257.

Parameters:
  • y (array-like) – The spectrum.

  • mask (array-like, optional) – A mask to apply to the spectrum before baseline construction.

  • w (numpy.array, optional) – The weights (allows restart for additional iterations).

  • tol (float, optional) – The convergence tolerance, defaults to 1.e-8.

  • lam (float, optional) – The λ (smoothness) parameter (the balance between the residual of the data and the baseline and the smoothness of the baseline). The suggested range is between 10^2 and 10^8, defaults to 10^6.

  • max_iter (int, optional) – The maximum number of iterations, defaults to 20.

  • full_output (bool, optional) – Whether or not to also output the baseline corrected spectrum, the number of iterations and error in the returned result, defaults to False.

Returns:

The smoothed baseline, with optionally the baseline corrected spectrum, the weights, the number of iterations and the error in the returned result.

Return type:

numpy.array [, numpy.array, int, float]
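The reweighting step that gives arPLS its name can be illustrated in pure Python. The sketch below applies the logistic weight update commonly used for arPLS to the residuals between a spectrum `y` and a current baseline estimate `z`. It is not this module's code: `arpls_weights` is a hypothetical name, the penalized least squares solve that produces `z` is omitted, and the sketch assumes there are at least two distinct negative residuals.

```python
import math
import statistics


def arpls_weights(y, z):
    """Return updated weights from spectrum y and baseline estimate z.

    Points far above the baseline (likely peaks) get weights near 0;
    points at or below the baseline get weights near 1.
    """
    d = [yi - zi for yi, zi in zip(y, z)]
    negative = [di for di in d if di < 0]
    m = statistics.fmean(negative)   # mean of the negative residuals
    s = statistics.pstdev(negative)  # their (population) standard deviation
    weights = []
    for di in d:
        t = 2.0 * (di - (2.0 * s - m)) / s
        t = min(t, 700.0)  # keep math.exp from overflowing
        weights.append(1.0 / (1.0 + math.exp(t)))
    return weights


y = [0.0, -0.1, 0.1, -0.2, 5.0, 0.1]  # one obvious peak at index 4
z = [0.0] * len(y)                    # flat trial baseline
print(arpls_weights(y, z))
```

In a full iteration the new weights would be fed back into the smoother until the relative change in the weights drops below `tol` or `max_iter` is reached.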

depth_list(_list)

Return the depth of a list.

depth_tuple(_tuple)

Return the depth of a tuple.
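A recursive sketch of a nesting-depth calculation that covers both the list and the tuple case. The name `nesting_depth` and the convention used here (a flat or empty container has depth 1, a non-container has depth 0) are assumptions; the module's own convention is not stated above.

```python
def nesting_depth(obj, kind=list):
    """Return how deeply containers of type `kind` are nested in obj."""
    if not isinstance(obj, kind):
        return 0
    # An empty container still counts as one level.
    return 1 + max((nesting_depth(item, kind) for item in obj), default=0)


print(nesting_depth([1, [2, [3]]]))
print(nesting_depth(((1,), 2), tuple))
```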

fig_to_iobuf(fig, fileformat=None)

Return an in-memory object as a byte stream representation of a Matplotlib figure.

Parameters:
  • fig (matplotlib.figure.Figure) – Matplotlib figure object.

  • fileformat (str, optional) – Valid Matplotlib saved figure file format, defaults to ‘png’.

Returns:

Byte stream representation of the Matplotlib figure and the associated file format.

Return type:

tuple[_io.BytesIO, str]

file_exists_and_readable(f)

Check if a file exists and is readable.
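Such a check is typically a combination of two standard-library calls. A minimal sketch, with the hypothetical name `is_readable_file`:

```python
import os


def is_readable_file(path):
    """Return True if path is an existing regular file the process can read."""
    return os.path.isfile(path) and os.access(path, os.R_OK)
```

Note that `os.access` reports permissions at the time of the call; opening the file can still fail later if it is removed or its permissions change.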

get_consecutive_int_range(a)

Return a list of pairs of integers marking consecutive ranges of integers.
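The grouping into consecutive runs can be sketched as follows. The name `consecutive_ranges`, the `[first, last]` pair format, and the decision to sort and deduplicate the input first are assumptions made for illustration.

```python
def consecutive_ranges(values):
    """Return [first, last] pairs, one per run of consecutive integers."""
    ranges = []
    for v in sorted(set(values)):
        if ranges and v == ranges[-1][1] + 1:
            ranges[-1][1] = v      # extend the current run
        else:
            ranges.append([v, v])  # start a new run
    return ranges


print(consecutive_ranges([1, 3, 5, 6, 7, 8, 12]))
# -> [[1, 1], [3, 3], [5, 8], [12, 12]]
```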

get_trailing_int(string)

Get the trailing integer in a string.
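A regular expression anchored at the end of the string is one straightforward way to do this. In the sketch below the name `trailing_int` and the choice to return None when the string does not end in a digit are assumptions.

```python
import re


def trailing_int(string):
    """Return the integer formed by the digits at the end of string, or None."""
    match = re.search(r"(\d+)$", string)
    return int(match.group(1)) if match else None


print(trailing_int("scan_0042"))  # -> 42
```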

getfloat_attr(obj, attr, length=11)

Format an attribute of an object for printing.

gformat(val, length=11)
Format a number with ‘%g’-like format, while:
  • the length of the output string will be of the requested length

  • positive numbers will have a leading blank

  • the precision will be as high as possible

  • trailing zeros will not be trimmed

illegal_combination(value1, name1, value2, name2, location=None, raise_error=False, log=True)

Print illegal combination message and/or raise error.

illegal_value(value, name, location=None, raise_error=False, log=True)

Print illegal value message and/or raise error.

index_nearest(a, value)

Return index of nearest array value.

index_nearest_down(a, value)

Return index of nearest array value, rounded down.

index_nearest_up(a, value)

Return index of nearest array value, rounded up.
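The three lookups can be illustrated on a sorted sequence with the standard library. This sketch reads "rounded down" as the last index whose entry does not exceed `value` and "rounded up" as the first index whose entry is not below it; that reading, the ascending-order requirement, the clamping at the ends, and the function names are all assumptions.

```python
from bisect import bisect_left, bisect_right


def nearest(a, value):
    """Index of the entry closest to value."""
    return min(range(len(a)), key=lambda i: abs(a[i] - value))


def nearest_down(a, value):
    """Last index with a[i] <= value (a sorted ascending), clamped to 0."""
    return max(bisect_right(a, value) - 1, 0)


def nearest_up(a, value):
    """First index with a[i] >= value (a sorted ascending), clamped to len(a) - 1."""
    return min(bisect_left(a, value), len(a) - 1)


a = [0.0, 1.0, 2.0, 3.0]
print(nearest(a, 1.7), nearest_down(a, 1.7), nearest_up(a, 1.7))  # -> 2 1 2
```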

input_int(s=None, ge=None, gt=None, le=None, lt=None, default=None, inset=None, raise_error=False, log=True)

Interactively prompt the user to enter an integer.

input_int_list(s=None, num_max=None, ge=None, le=None, split_on_dash=True, remove_duplicates=True, sort=True, raise_error=False, log=True)

Prompt the user to input a list of integers, splitting the entered string on any combination of commas, whitespace, or dashes (when split_on_dash is True), e.g., ‘1 3,5-8 , 12 ’ -> [1, 3, 5, 6, 7, 8, 12].

Parameters:
  • s (str, optional) – Interactive user prompt, defaults to None.

  • num_max (int, optional) – Maximum number of inputs in list.

  • ge (int, optional) – Minimum value of inputs in list.

  • le (int, optional) – Maximum value of inputs in list.

  • split_on_dash (bool, optional) – Allow dashes in input string, defaults to True.

  • remove_duplicates (bool, optional) – Removes duplicates (may also change the order), defaults to True.

  • sort (bool, optional) – Sort in ascending order, defaults to True.

  • raise_error (bool, optional) – Raise an exception upon any error, defaults to False.

  • log (bool, optional) – Print an error message upon any error, defaults to True.

Returns:

Input list, or None upon an illegal input.

Return type:

list

input_menu(items, default=None, header=None)

Interactively prompt the user to select from a menu.

input_num(s=None, ge=None, gt=None, le=None, lt=None, default=None, raise_error=False, log=True)

Interactively prompt the user to enter a number.

input_num_list(s=None, num_max=None, ge=None, le=None, remove_duplicates=True, sort=True, raise_error=False, log=True)

Prompt the user to input a list of numbers, splitting the entered string on any combination of commas or whitespace, e.g., ‘1.0, 3, 5.8, 12 ’ -> [1.0, 3.0, 5.8, 12.0].

Parameters:
  • s (str, optional) – Interactive user prompt.

  • num_max (int, optional) – Maximum number of inputs in list.

  • ge (float, optional) – Minimum value of inputs in list.

  • le (float, optional) – Maximum value of inputs in list.

  • remove_duplicates (bool, optional) – Removes duplicates (may also change the order), defaults to True.

  • sort (bool, optional) – Sort in ascending order, defaults to True.

  • raise_error (bool, optional) – Raise an exception upon any error, defaults to False.

  • log (bool, optional) – Print an error message upon any error, defaults to True.

Returns:

Input list, or None upon an illegal input.

Return type:

list

input_yesno(s=None, default=None)

Interactively prompt the user to answer a y/n question.

is_dict_nums(d, raise_error=False, log=True)

Value is a dictionary with single number values.

is_dict_series(t_or_l, raise_error=False, log=True)

Value is a tuple or list of dictionaries.

is_dict_strings(d, raise_error=False, log=True)

Value is a dictionary with single string values.

is_index(v, ge=0, lt=None, raise_error=False, log=True)

Value is an array index in range ge <= v < lt. NOTE lt IS NOT included!

is_index_range(v, ge=0, le=None, lt=None, raise_error=False, log=True)

Value is an array index range in range ge <= v[0] <= v[1] <= le or ge <= v[0] <= v[1] < lt. NOTE le IS included!

is_int(v, ge=None, gt=None, le=None, lt=None, raise_error=False, log=True)

Value is an integer in range ge <= v <= le or gt < v < lt or some combination.

Returns:

True if yes or False if no.

Return type:

bool
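The ge/gt/le/lt qualifiers used by the is_* checks can be read as optional bounds that must all hold. A minimal sketch of that logic follows; the names are hypothetical, and excluding `bool` from the integers is an assumption about the intended behavior.

```python
def in_range(v, ge=None, gt=None, le=None, lt=None):
    """Return True if v satisfies every bound that is not None."""
    if ge is not None and not v >= ge:
        return False
    if gt is not None and not v > gt:
        return False
    if le is not None and not v <= le:
        return False
    if lt is not None and not v < lt:
        return False
    return True


def is_int_sketch(v, **bounds):
    """Return True if v is an int (not a bool) within the given bounds."""
    # bool is a subclass of int in Python, so it is rejected explicitly here.
    return isinstance(v, int) and not isinstance(v, bool) and in_range(v, **bounds)


print(is_int_sketch(3, ge=0, lt=5))  # -> True
print(is_int_sketch(5, ge=0, lt=5))  # -> False, lt is exclusive
```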

is_int_pair(v, ge=None, gt=None, le=None, lt=None, raise_error=False, log=True)

Value is an integer pair, each in range ge <= v[i] <= le or gt < v[i] < lt or ge[i] <= v[i] <= le[i] or gt[i] < v[i] < lt[i] or some combination.

Returns:

True if yes or False if no.

Return type:

bool

is_int_series(t_or_l, ge=None, gt=None, le=None, lt=None, raise_error=False, log=True)

Value is a tuple or list of integers, each in range ge <= l[i] <= le or gt < l[i] < lt or some combination.

is_num(v, ge=None, gt=None, le=None, lt=None, raise_error=False, log=True)

Value is a number in range ge <= v <= le or gt < v < lt or some combination.

Returns:

True if yes or False if no.

Return type:

bool

is_num_pair(v, ge=None, gt=None, le=None, lt=None, raise_error=False, log=True)

Value is a number pair, each in range ge <= v[i] <= le or gt < v[i] < lt or ge[i] <= v[i] <= le[i] or gt[i] < v[i] < lt[i] or some combination.

Returns:

True if yes or False if no.

Return type:

bool

is_num_series(t_or_l, ge=None, gt=None, le=None, lt=None, raise_error=False, log=True)

Value is a tuple or list of numbers, each in range ge <= l[i] <= le or gt < l[i] < lt or some combination.

is_str_series(t_or_l, raise_error=False, log=True)

Value is a tuple or list of strings.

list_to_string(a)

Return a list of pairs of integers marking consecutive ranges of integers in string notation.
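A compact way to build such a string uses `itertools.groupby`: in a sorted sequence, consecutive integers all share the same difference between value and position. The output format below (comma-separated, dashes for runs) is an assumption chosen to mirror the input notation accepted by string_to_list, and `ranges_to_string` is a hypothetical name.

```python
from itertools import groupby


def ranges_to_string(values):
    """Return e.g. '1, 3, 5-8, 12' for [1, 3, 5, 6, 7, 8, 12]."""
    parts = []
    ordered = sorted(set(values))
    # Entries of one consecutive run share the same (value - position).
    for _, group in groupby(enumerate(ordered), key=lambda p: p[1] - p[0]):
        run = [value for _, value in group]
        parts.append(str(run[0]) if len(run) == 1 else f"{run[0]}-{run[-1]}")
    return ", ".join(parts)


print(ranges_to_string([1, 3, 5, 6, 7, 8, 12]))  # -> 1, 3, 5-8, 12
```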

not_zero(value)

Return value with a minimal absolute size of tiny, preserving the sign.
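This kind of guard is typically used to avoid division by zero. In the sketch below `tiny` is assumed to be the smallest positive normalized float, `sys.float_info.min`; the value actually used by the module is not stated above, and `not_zero_sketch` is a hypothetical name.

```python
import math
import sys

TINY = sys.float_info.min  # assumption: smallest positive normalized float


def not_zero_sketch(value, tiny=TINY):
    """Return value, pushed away from zero to at least tiny in magnitude."""
    # copysign keeps the sign of value (including the sign of -0.0).
    return math.copysign(max(abs(value), tiny), value)


print(1.0 / not_zero_sketch(0.0))  # large but finite, no ZeroDivisionError
```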

nxcopy(nxobject, exclude_nxpaths=None, nxpath_prefix=None, nxpathabs_prefix=None, nxpath_copy_abspath=None)

Return a copy of a nexus object, optionally excluding certain child items.

Parameters:
  • nxobject (nexusformat.nexus.NXobject) – The input nexus object to “copy”.

  • exclude_nxpaths – A list of relative paths to child nexus objects that should be excluded from the returned “copy”.

  • nxpath_prefix (str) – For use in recursive calls from inside this function only.

  • nxpathabs_prefix (str) – For use in recursive calls from inside this function only.

  • nxpath_copy_abspath (str) – For use in recursive calls from inside this function only.

Returns:

Copy of the input nxobject with some children optionally excluded.

Return type:

nexusformat.nexus.NXobject

quick_imshow(a, title=None, row_label='row index', column_label='column index', path=None, name=None, show_fig=True, save_fig=False, return_fig=False, block=None, extent=None, show_grid=False, grid_color='w', grid_linewidth=1, **kwargs)

Display and/or save a 2D image, and/or return an in-memory byte stream representation of it.

quick_plot(*args, xerr=None, yerr=None, vlines=None, title=None, xlim=None, ylim=None, xlabel=None, ylabel=None, legend=None, path=None, name=None, show_grid=False, save_fig=False, save_only=False, block=False, **kwargs)

Display a 2D line plot.

range_string_ge_gt_le_lt(ge=None, gt=None, le=None, lt=None)

Return a range string representation matching the ge, gt, le, lt qualifiers. Does not validate the inputs, do that as needed before calling.

rolling_average(y, x=None, dtype=None, start=0, end=None, width=None, stride=None, num=None, average=True, mode='valid', use_convolve=None)

Returns the rolling sum or average of an array over the last dimension.
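For a one-dimensional sequence, a rolling sum or average in 'valid' mode (only windows that fit entirely inside the data) can be sketched in plain Python. The name `rolling_sketch` and the exact meaning given to `width` and `stride` here are assumptions; the real function also accepts multi-dimensional arrays and further options not covered by this sketch.

```python
def rolling_sketch(y, width, stride=1, average=True):
    """Return the sum or mean of each full window of `width` entries."""
    out = []
    for start in range(0, len(y) - width + 1, stride):
        total = sum(y[start:start + width])
        out.append(total / width if average else total)
    return out


print(rolling_sketch([1, 2, 3, 4, 5], width=3))            # -> [2.0, 3.0, 4.0]
print(rolling_sketch([1, 2, 3, 4, 5], width=3, stride=2))  # -> [2.0, 4.0]
```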

round_to_n(x, n=1)

Round to a specific number of sig figs.

round_up_to_n(x, n=1)

Round up to a specific number of sig figs.

save_iobuf_fig(buf, filename, force_overwrite=False)

Save a byte stream representation of a Matplotlib figure to file.

Parameters:
  • buf (_io.BytesIO) – Byte stream representation of the Matplotlib figure.

  • filename (str) – Filename (with a valid extension).

  • force_overwrite (bool, optional) – Flag to allow filename to be overwritten if it already exists, defaults to False.

Raises:

RuntimeError – If a file already exists and force_overwrite is False.
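The overwrite guard described above can be sketched with the standard library alone. The name `save_buffer` is hypothetical, and this sketch skips any validation of the file extension that the real function may perform.

```python
import io
import os


def save_buffer(buf, filename, force_overwrite=False):
    """Write the contents of an io.BytesIO buffer to filename."""
    if os.path.exists(filename) and not force_overwrite:
        raise RuntimeError(f"{filename} already exists")
    with open(filename, "wb") as f:
        # getvalue() returns the whole buffer regardless of the read position.
        f.write(buf.getvalue())
```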

select_image_indices(a, axis, b=None, preselected_indices=None, axis_index_offset=0, min_range=None, min_num_indices=2, max_num_indices=2, title=None, title_a=None, title_b=None, row_label='row index', column_label='column index', interactive=True, return_buf=False)
Display a 2D image and have the user select a set of image indices in either row or column direction.

Parameters:
  • a (numpy.ndarray) – Two-dimensional image data array for which a region of interest will be selected.

  • axis (int) – The selection direction (0: row, 1: column)

  • b (numpy.ndarray, optional) – A secondary two-dimensional image data array for which a shared region of interest will be selected.

  • preselected_indices (tuple(int), list(int), optional) – Preselected image indices.

  • axis_index_offset (int, optional) – Offset in axis index range and preselected indices, defaults to 0.

  • min_range (int, optional) – The minimal range spanned by the selected indices.

  • min_num_indices (int, optional) – The minimum number of selected indices.

  • max_num_indices (int, optional) – The maximum number of selected indices.

  • title (str, optional) – Title for the displayed figure.

  • title_a (str, optional) – Title for the image of a.

  • title_b (str, optional) – Title for the image of b.

  • row_label (str, optional) – Label for the y-axis of the displayed figure, defaults to row index.

  • column_label (str, optional) – Label for the x-axis of the displayed figure, defaults to column index.

  • interactive (bool, optional) – Show the plot and allow user interactions with the matplotlib figure, defaults to True.

  • return_buf (bool, optional) – Return an in-memory byte stream representation of the Matplotlib figure instead of the figure itself, defaults to False.

Returns:

The selected region of interest as array indices and a matplotlib figure.

Return type:

Union[matplotlib.figure.Figure, io.BytesIO], tuple(int, int, int, int)

select_mask_1d(y, x=None, preselected_index_ranges=None, preselected_mask=None, title=None, xlabel=None, ylabel=None, min_num_index_ranges=None, max_num_index_ranges=None, interactive=True, filename=None, return_buf=False)

Display a lineplot and have the user select a mask.

Parameters:
  • y (numpy.ndarray) – One-dimensional data array for which a mask will be constructed.

  • x (numpy.ndarray, optional) – x-coordinates of the reference data.

  • preselected_index_ranges (Union(list[tuple(int, int)], list[list[int]], list[tuple(float, float)], list[list[float]]), optional) – List of preselected index ranges to mask (bounds are inclusive).

  • preselected_mask (numpy.ndarray, optional) – Preselected boolean mask array.

  • title (str, optional) – Title for the displayed figure.

  • xlabel (str, optional) – Label for the x-axis of the displayed figure.

  • ylabel (str, optional) – Label for the y-axis of the displayed figure.

  • min_num_index_ranges (int, optional) – The minimum number of selected index ranges.

  • max_num_index_ranges (int, optional) – The maximum number of selected index ranges.

  • interactive (bool, optional) – Show the plot and allow user interactions with the matplotlib figure, defaults to True.

  • filename (str, optional) – Save a .png of the plot to filename, defaults to None, in which case the plot is not saved.

  • return_buf (bool, optional) – Return an in-memory object as a byte stream representation of the Matplotlib figure, defaults to False.

Returns:

A byte stream representation of the Matplotlib figure if return_buf is True (None otherwise), a boolean mask array, and the list of selected index ranges.

Return type:

Union[io.BytesIO, None], numpy.ndarray, list[list[int, int]]

select_roi_1d(y, x=None, preselected_roi=None, title=None, xlabel=None, ylabel=None, interactive=True, filename=None, return_buf=False)

Display a 2D plot and have the user select a single region of interest.

Parameters:
  • y (numpy.ndarray) – One-dimensional data array for which a region of interest will be selected.

  • x (numpy.ndarray, optional) – x-coordinates of the data.

  • preselected_roi (tuple(int, int), optional) – Preselected region of interest.

  • title (str, optional) – Title for the displayed figure.

  • xlabel (str, optional) – Label for the x-axis of the displayed figure.

  • ylabel (str, optional) – Label for the y-axis of the displayed figure.

  • interactive (bool, optional) – Show the plot and allow user interactions with the matplotlib figure, defaults to True.

  • filename (str, optional) – Save a .png of the plot to filename, defaults to None, in which case the plot is not saved.

  • return_buf (bool, optional) – Return an in-memory object as a byte stream representation of the Matplotlib figure, defaults to False.

Returns:

A byte stream representation of the Matplotlib figure if return_buf is True (None otherwise), and the selected region of interest.

Return type:

Union[io.BytesIO, None], tuple(int, int)

select_roi_2d(a, preselected_roi=None, title=None, title_a=None, row_label='row index', column_label='column index', interactive=True, filename=None, return_buf=False)
Display a 2D image and have the user select a single rectangular region of interest.

Parameters:
  • a (numpy.ndarray) – Two-dimensional image data array for which a region of interest will be selected.

  • preselected_roi (tuple(int, int, int, int), optional) – Preselected region of interest.

  • title (str, optional) – Title for the displayed figure.

  • title_a (str, optional) – Title for the image of a.

  • row_label (str, optional) – Label for the y-axis of the displayed figure, defaults to row index.

  • column_label (str, optional) – Label for the x-axis of the displayed figure, defaults to column index.

  • interactive (bool, optional) – Show the plot and allow user interactions with the matplotlib figure, defaults to True.

  • filename (str, optional) – Save a .png of the plot to filename, defaults to None, in which case the plot is not saved.

  • return_buf (bool, optional) – Return an in-memory object as a byte stream representation of the Matplotlib figure, defaults to False.

Returns:

A byte stream representation of the Matplotlib figure if return_buf is True (None otherwise), and the selected region of interest.

Return type:

Union[io.BytesIO, None], tuple(int, int, int, int)

string_to_list(s, split_on_dash=True, remove_duplicates=True, sort=True, raise_error=False)

Return a list of numbers by splitting/expanding a string on any combination of commas, whitespace, or dashes (when split_on_dash=True), e.g., ‘1, 3, 5-8, 12 ’ -> [1, 3, 5, 6, 7, 8, 12].

Parameters:
  • s (str) – Input string.

  • split_on_dash (bool, optional) – Allow dashes in input string, defaults to True.

  • remove_duplicates (bool, optional) – Removes duplicates (may also change the order), defaults to True.

  • sort (bool, optional) – Sort in ascending order, defaults to True.

  • raise_error (bool, optional) – Raise an exception upon any error, defaults to False.

Returns:

Input list, or None upon an illegal input.

Return type:

list
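The splitting and dash expansion described above can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `parse_int_list` is a hypothetical name, only non-negative integers are handled, and any malformed token makes the function return None.

```python
import re


def parse_int_list(s, split_on_dash=True, remove_duplicates=True, sort=True):
    """Expand a string such as '1, 3, 5-8, 12' into a list of integers."""
    if split_on_dash:
        # Allow optional whitespace around a dash, e.g. '5 - 8'.
        s = re.sub(r"\s*-\s*", "-", s)
    values = []
    try:
        for token in re.split(r"[\s,]+", s.strip()):
            if not token:
                continue
            if split_on_dash and "-" in token:
                first, last = token.split("-")
                values.extend(range(int(first), int(last) + 1))
            else:
                values.append(int(token))
    except ValueError:
        return None  # illegal input
    if remove_duplicates:
        values = list(dict.fromkeys(values))  # keeps first occurrences
    if sort:
        values.sort()
    return values


print(parse_int_list("1, 3, 5-8, 12 "))  # -> [1, 3, 5, 6, 7, 8, 12]
```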

test_ge_gt_le_lt(ge, gt, le, lt, func, location=None, raise_error=False, log=True)

Check individual and mutual validity of ge, gt, le, lt qualifiers.

Parameters:

  • func (callable: is_int, is_num) – Test for integers or numbers.

Returns:

True upon success or False when mutually exclusive.

Return type:

bool

trunc_to_n(x, n=1)

Truncate to a specific number of sig figs.
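Rounding, rounding up, and truncating to n significant figures differ only in the rounding mode, which the `decimal` module makes explicit. The sketch below is one possible approach rather than the module's implementation; the function names are hypothetical, ties are rounded half-to-even, "up" is taken to mean toward positive infinity, and truncation is toward zero.

```python
from decimal import ROUND_CEILING, ROUND_DOWN, ROUND_HALF_EVEN, Decimal


def _to_sig_figs(x, n, rounding):
    d = Decimal(str(x))
    if d == 0:
        return 0.0
    # adjusted() is the exponent of the most significant digit, so this
    # quantum has a 1 in the position of the n-th significant digit.
    quantum = Decimal(1).scaleb(d.adjusted() - n + 1)
    return float(d.quantize(quantum, rounding=rounding))


def round_sig(x, n=1):
    return _to_sig_figs(x, n, ROUND_HALF_EVEN)


def round_up_sig(x, n=1):
    return _to_sig_figs(x, n, ROUND_CEILING)


def trunc_sig(x, n=1):
    return _to_sig_figs(x, n, ROUND_DOWN)


print(round_sig(1234.5, 2), round_up_sig(1234.5, 2), trunc_sig(1234.5, 2))
# -> 1200.0 1300.0 1200.0
```

Working on the decimal string of `x` avoids the binary floating-point artifacts that can appear when scaling by powers of ten before calling `math.ceil` or `math.floor`.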

unwrap_tuple(_tuple)

Unwrap a tuple.