API Summary

Skeletons

bootstrap_skel_stats(mv3d_pattern[, ...]) Calculate errors for skeleton statistics, as output by the Avizo subsampling script.
find_nodes(input_fname[, node_output, ...]) Calculate and write a data file describing the connectivity of nodes within a network (saved in an .mv3d file from Avizo)

Surfaces

process_surface_stats_lsm_ysz(pattern[, ...]) Calculate errors for surface statistics, as output by the Avizo subsampling script.

Tortuosity

bootstrap_tort_stats([csv_pattern, ...]) Calculate errors for tortuosity profiles, as output by the FIBTortuosity module.

TPB

bootstrap_tpb_stats([inputs_dict, box_size, ...]) Calculate statistics for various TPB properties (and their errors) using a random subvolume sampling method.
read_mv3d(filename) Get number of lines and points, as well as the 3d data contained within an MV3D network file
write_mv3d(fname, data[, d, overwrite]) Output path data to an mv3d file, which can be read by Avizo (and other software)
crop_tpb_data(data, box_start, box_end) Crop TPB data to only contain points within the box defined by the corners
get_bb_from_data(data_list) Infer from the existing data the extents of the bounding box
bb_volume(bb) Return the volume enclosed by a bounding box.
get_random_subvolume(bb, size) Given a particular bounding box and subvolume size, return two lists with x, y, z coordinates of lower and upper corners of a random subvolume within the bounding box.
path_length(path_df) Calculate the length along a path.
network_length(data) Calculate the total length of a network that has been imported from the mv3d format
split_paths_in_network(data[, threshold]) Given an array of data, try to detect the paths that are on the edge of the volume and split the single path (containing a sudden long jump) into multiple paths.
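animate_cropped_data(data_a, data_i, data_u) Plot a simple animation of the total TPB path network, as well as the network contained within a randomly cropped volume
get_bb_lines(bb) Given a bounding box, find the necessary vectors that will allow for plotting of the edges of the bounding box using MayaVi
get_box_and_corners(box_start, box_end) Given two opposite corners of a 3D rectangle, find the necessary vectors that will allow for plotting of the edges of the rectangle and points at each of the corners using MayaVi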

Utilities

calculate_errors(df, samples) Calculate the “error bars” of each column in a Pandas dataframe

Full Package API

Skeletons

FIBbootstrap.skeleton.bootstrap_skel_stats(mv3d_pattern, csv_pattern=None, n_bootstrap=100000, volume=None, save_output=False, data_output_fname=None, err_output_fname=None)[source]

Calculate errors for skeleton statistics, as output by the Avizo subsampling script. Operates on many .mv3d skeleton files

Parameters:
  • mv3d_pattern (str) – glob pattern to grab mv3d files to process from output of subvolume Avizo scripts. Based on these filenames, the corresponding .csv files containing the spatial graph statistics will be accessed as well. For example, if the mv3d files are named YMdA-01_labels.view.LSM.skel.am.subvolSkel.*.mv3d, the following pattern will be used to glob for the spatial graph stats: YMdA-01_labels.view.LSM.skel.am.subvolSkel.*.csv
  • csv_pattern (str or None) – glob pattern to grab csv files to process from Avizo spatial graph output. If None, the pattern will be derived from the mv3d_pattern value as described above
  • n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
  • volume (None or number) – volume of analyzed data cube. If this is given, data will be returned with nodes/volume and edges/volume given, in addition to the absolute values
  • save_output (bool) – switch to control whether or not the output is written directly to a CSV file in the current directory
  • data_output_fname (None or str) – filename to use when saving the data output; if None, an appropriate string will be built from the input pattern
  • err_output_fname (None or str) – filename to use when saving the error output; if None, an appropriate string will be built from the input pattern
Returns:

  • data_df (pandas.DataFrame) – Dataframe with data from subvolume statistic calculations
  • error_df (DataFrame) – Dataframe with low and high errors calculated using n_bootstrap samples
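
The relationship between mv3d_pattern and the derived CSV pattern described above can be illustrated with a plain suffix swap. This is only a sketch: derive_csv_pattern is a hypothetical helper, not part of FIBbootstrap, and it assumes the derivation is nothing more than replacing the trailing .mv3d extension with .csv.

```python
def derive_csv_pattern(mv3d_pattern):
    """Hypothetical helper: swap a trailing '.mv3d' for '.csv'."""
    suffix = ".mv3d"
    if not mv3d_pattern.endswith(suffix):
        raise ValueError("pattern does not end in .mv3d")
    return mv3d_pattern[: -len(suffix)] + ".csv"


# Pattern taken from the parameter description above
mv3d_pattern = "YMdA-01_labels.view.LSM.skel.am.subvolSkel.*.mv3d"
csv_pattern = derive_csv_pattern(mv3d_pattern)
print(csv_pattern)
```

If the CSV files follow a different naming scheme, pass csv_pattern explicitly instead of relying on the derived value.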

FIBbootstrap.skeleton.find_nodes(input_fname, node_output='nodes.txt', save_output=True, return_type='str')[source]

Calculate and write a data file describing the connectivity of nodes within a network (saved in an .mv3d file from Avizo)

Parameters:
  • input_fname (str) – .mv3d filename to read
  • node_output (str) – name of text file to write to, if saving the output
  • save_output (bool) – switch to control whether a file is written to disk (output format that matches the old findNodes.sh script)
  • return_type (str) – passed to numpy.ndarray.astype() to determine how the output should be formatted upon return. Default is as a string, so no information contained in the original .mv3d file is modified, but oftentimes 'float32' would be more useful for calculating statistics
Returns:

  • data (ndarray) – Contains the output data in a numpy array with columns of:

    \(k_i\), x position, y position, z position, thickness
  • num_edges (float) – Number of edges (E) in skeleton

  • num_nodes (float) – Number of nodes (N) in skeleton

  • mean_k (float) – Average node connectivity (<k>)
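
As a reminder of how the returned quantities relate to each other, the sketch below computes the node connectivity \(k_i\), N, E, and <k> for a toy graph. The edge list is made up for illustration, and the code assumes that \(k_i\) is simply the number of edges touching node i; it does not reproduce how find_nodes reads an .mv3d file.

```python
from collections import Counter

# Hypothetical toy skeleton: each edge joins two node labels
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]

# k_i is taken here as the number of edges that touch node i
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

num_edges = len(edges)                      # E
num_nodes = len(degree)                     # N
mean_k = sum(degree.values()) / num_nodes   # <k>, which equals 2E / N

print(num_edges, num_nodes, mean_k)
```

Because every edge contributes to the connectivity of two nodes, <k> = 2E/N holds for any graph counted this way.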

Surfaces

FIBbootstrap.surface.process_surface_stats_lsm_ysz(pattern, n_bootstrap=100000, save_output=False, output_fname='subvolume_errors.csv')[source]

Calculate errors for surface statistics, as output by the Avizo subsampling script. This version operates on LSM-YSZ material names

Parameters:
  • pattern (str) – glob pattern to grab csv files to process from output of subvolume Avizo scripts
  • n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
  • save_output (bool) – switch to control whether or not the output is written directly to a CSV file in the current directory
  • output_fname (str) – filename to use when saving the output
Returns:

error_df – Dataframe with low and high errors calculated using n_bootstrap samples

Return type:

DataFrame

Tortuosity

FIBbootstrap.tortuosity.bootstrap_tort_stats(csv_pattern=None, n_bootstrap=100000, thresh=0.75, save_output=False, data_output_fname=None, err_output_fname=None)[source]

Calculate errors for tortuosity profiles, as output by the FIBTortuosity module. Operates on many .csv files, each with a single tortuosity profile for a phase and direction (e.g. LSM-x, Pore-y, YSZ-z, etc.)

Parameters:
  • csv_pattern (str) – glob pattern to grab csv files to process from output of tortuosity calculations. Usually, this will be something like: os.path.join(<path holding files>, "*.csv")
  • n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
  • thresh (float) – value between 0 and 1, defining from what portion of the profiles to calculate the errors. For example, for the default value of 0.75, the error in the tortuosity for the last 25% of euclidean distance values will be calculated. A thresh value of 0.0 would calculate the error on the whole profile. Usually, only a small portion towards the end of the dataset is desired, so one can analyze how much the data was changing towards the end of the profile.
  • save_output (bool) – switch to control whether or not the bootstrap data and error output is written directly to a CSV file in the current directory
  • data_output_fname (None or str) – filename to use when saving the data output; if None, an appropriate string will be built from the input pattern
  • err_output_fname (None or str) – filename to use when saving the error output; if None, an appropriate string will be built from the input pattern
Returns:

  • data_df (pandas.DataFrame) – Dataframe with data from subvolume statistic calculations
  • error_df (DataFrame) – Dataframe with low and high errors calculated using n_bootstrap samples
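
The effect of thresh can be sketched on a single made-up profile. The helper below is hypothetical and assumes the cutoff is thresh times the largest euclidean distance in the profile, with the cutoff point itself included; the actual selection rule inside bootstrap_tort_stats may differ.

```python
def tail_of_profile(profile, thresh=0.75):
    """Keep (distance, tortuosity) pairs with distance >= thresh * max distance."""
    max_distance = max(d for d, _ in profile)
    cutoff = thresh * max_distance
    return [(d, t) for d, t in profile if d >= cutoff]


# Made-up (euclidean distance, tortuosity) values for illustration only
profile = [(100.0, 1.90), (200.0, 1.60), (300.0, 1.45), (400.0, 1.40)]

tail = tail_of_profile(profile)               # last 25% of the distance range
whole = tail_of_profile(profile, thresh=0.0)  # the entire profile
print(tail)
```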

TPB

FIBbootstrap.tpb.bootstrap_tpb_stats(inputs_dict=None, box_size=4000, n_volumes=500, n_bootstrap=100000, save_output=False, data_output_fname=None, err_output_fname=None, output_avg=False)[source]

Calculate statistics for various TPB properties (and their errors) using a random subvolume sampling method. Total TPB length (for the subvolume), TPB density, and average length of a TPB path will be calculated.

Parameters:
  • inputs_dict (dict) – dictionary of values describing the input data. Keys should be labels for a particular set of TPB paths, while the values should be filenames for mv3d skeleton files of the TPB paths.
  • box_size (float) – length of edge of cube used to define the subsampled volumes. The total volume sampled in each trial will be box_size**3, and the boxes will be selected randomly throughout the volume of data
  • n_volumes (int) – number of subvolumes to sample from the volume (usually ~500 or so)
  • n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
  • save_output (bool) – switch to control whether or not the bootstrap data and error output is written directly to a CSV file in the current directory
  • data_output_fname (str) – filename to use when saving the data output
  • err_output_fname (str) – filename to use when saving the error output
  • output_avg (bool) – switch to control whether “Avg TPB path length” will be calculated
Returns:

  • data_df (DataFrame) – Dataframe with data from subvolume statistic calculations
  • error_df (DataFrame) – Dataframe with low and high errors (and std. dev.) calculated using n_bootstrap samples

Example

>>> from FIBbootstrap.tpb import bootstrap_tpb_stats
>>> inputs_dict = {
...     'active':'smoothActive.savg.mv3d',
...     'inactive':'smoothInactive.savg.mv3d',
...     'unknown':'smoothUnknown.savg.mv3d'}
>>> data_out, \
... error_out = bootstrap_tpb_stats(inputs_dict,
...                                 n_volumes=500,
...                                 box_size=4000,
...                                 save_output=False,
...                                 data_output_fname='data_N500_s4000.csv',
...                                 err_output_fname='errors_N500_s4000.csv',
...                                 output_avg=False)
FIBbootstrap.tpb.read_mv3d(filename)[source]

Get number of lines and points, as well as the 3d data contained within an MV3D network file

Parameters:filename (str) – Name of .mv3d file to open
Returns:
  • data (ndarray) – (N x 4) numpy array containing the index, x, y, and z coordinates of each point within the network (the last value, d, is discarded)
  • num_lines (int) – number of lines contained in the network (read from line 2 of the file)
  • num_points (int) – number of points contained in the network (read from line 3 of the file)
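
The return values above can be illustrated with a small text parser. This is a sketch under stated assumptions, not the read_mv3d implementation: the header layout in SAMPLE (comment lines starting with #, counts as the last token of lines 2 and 3, five whitespace-separated values per data row) is assumed, and parse_mv3d_text is a hypothetical name. It returns a list of tuples rather than a numpy array to stay dependency-free.

```python
# Assumed file layout, for illustration only
SAMPLE = """\
# MicroVisu3D file
# Number of lines   2
# Number of points  5
# Number of inter.  0
#
# No  x  y  z  d
#
0 0.0 0.0 0.0 1.0
0 1.0 0.0 0.0 1.0
0 2.0 0.0 0.0 1.0

1 5.0 5.0 5.0 1.0
1 5.0 6.0 5.0 1.0
"""


def parse_mv3d_text(text):
    """Hypothetical parser: return (rows, num_lines, num_points)."""
    lines = text.splitlines()
    num_lines = int(lines[1].split()[-1])   # count on line 2 of the file
    num_points = int(lines[2].split()[-1])  # count on line 3 of the file
    data = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank separators and header comments
        idx, x, y, z, _d = stripped.split()  # the last value, d, is discarded
        data.append((int(idx), float(x), float(y), float(z)))
    return data, num_lines, num_points


data, num_lines, num_points = parse_mv3d_text(SAMPLE)
print(num_lines, num_points, len(data))
```
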
FIBbootstrap.tpb.write_mv3d(fname, data, d=0, overwrite=True)[source]

Output path data to an mv3d file, which can be read by Avizo (and other software)

Parameters:
  • fname (str) – Filename to which to write; will be overwritten if it exists (by default)
  • data (ndarray) – array containing network (spatial graph) data in the same format as output by read_mv3d() or crop_tpb_data()
  • d (number or ndarray) – the value to be written in the ‘thickness’ column of the mv3d file. Can be used to tag files with a scalar value. If a single number is given instead of an array, that number will be used for every point. If a numpy array (the same length as the data) is given, the values will be specific to each point. Standard pandas/numpy broadcasting rules apply
  • overwrite (bool) – switch to control whether an existing file will be overwritten
FIBbootstrap.tpb.crop_tpb_data(data, box_start, box_end)[source]

Crop TPB data to only contain points within the box defined by the corners box_start and box_end

Parameters:
  • data (ndarray) – (N x 4) numpy array with TPB data (as loaded by read_mv3d()), including the first column containing the index
  • box_start (list or ndarray) – x, y, z coordinates of lower bound corner to crop inside of
  • box_end (list or ndarray) – x, y, z coordinates of upper bound corner to crop inside of
Returns:

cropped_data – copy of the original data, containing only the points inside of the crop box

Return type:

ndarray
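
The cropping step can be sketched without numpy as a simple filter over rows of (id, x, y, z). The helper crop_points is hypothetical, and treating both box faces as inclusive is an assumption; crop_tpb_data itself works on ndarrays and may handle the boundaries differently.

```python
def crop_points(data, box_start, box_end):
    """Return the rows of (id, x, y, z) that lie inside the box (bounds inclusive)."""
    cropped = []
    for idx, x, y, z in data:
        inside = all(
            lo <= c <= hi for c, lo, hi in zip((x, y, z), box_start, box_end)
        )
        if inside:
            cropped.append((idx, x, y, z))
    return cropped


# Made-up points: the last one lies outside the box along x
data = [(0, 1.0, 1.0, 1.0), (0, 5.0, 5.0, 5.0), (1, 9.0, 2.0, 2.0)]
cropped = crop_points(data, [0, 0, 0], [6, 6, 6])
print(cropped)
```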

FIBbootstrap.tpb.get_bb_from_data(data_list)[source]

Infer from the existing data the extents of the bounding box

Parameters:data_list (list) – list or array of numpy arrays with data (in the format returned by read_mv3d() or crop_tpb_data()). Expects 4 columns representing [‘id’, ‘x’, ‘y’, ‘z’].
Returns:
  • min_bb (tuple) – list of x, y, z values containing the smallest coordinates in each dimension found in the data_list
  • max_bb (tuple) – list of x, y, z values containing the largest coordinates in each dimension found in the data_list
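
A dependency-free sketch of the same idea: pool the points from every dataset and take the per-axis minimum and maximum. The function name bounding_box and the sample rows are hypothetical; rows are assumed to be (id, x, y, z) as described above.

```python
def bounding_box(data_list):
    """Return (min_bb, max_bb) over all datasets; rows are (id, x, y, z)."""
    points = [row[1:4] for data in data_list for row in data]
    min_bb = tuple(min(p[i] for p in points) for i in range(3))
    max_bb = tuple(max(p[i] for p in points) for i in range(3))
    return min_bb, max_bb


# Made-up datasets for illustration
active = [(0, 1.0, 2.0, 3.0), (0, 4.0, 0.5, 3.5)]
inactive = [(0, -1.0, 2.5, 9.0)]
min_bb, max_bb = bounding_box([active, inactive])
print(min_bb, max_bb)
```
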
FIBbootstrap.tpb.bb_volume(bb)[source]

Return the volume enclosed by a bounding box.

Parameters:bb (tuple of length 2) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension
Returns:volume – volume enclosed by the bounding box
Return type:float
FIBbootstrap.tpb.get_random_subvolume(bb, size)[source]

Given a particular bounding box and subvolume size, return two lists with x, y, z coordinates of lower and upper corners of a random subvolume within the bounding box. Returned subvolume will be completely enclosed within bb (i.e. the maximum position for the lower bound of the cube x value will be max_x - size).

Parameters:
  • bb (tuple) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension from which to take a random volume; returned volume will be in the range [min_x, max_x], [min_y, max_y], and [min_z, max_z]
  • size (number) – size of cube to return (single edge length, so total volume enclosed will be size**3)
Returns:

  • box_start (list) – lower bound corner of the subvolume box (x, y, and z)
  • box_end (list) – upper bound corner of the subvolume box (x, y, and z)
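
The enclosure constraint can be sketched as follows: along each axis, the lower corner is drawn from [min, max - size], so the cube never extends past the bounding box. This is an illustration with a hypothetical helper (random_subvolume) and an assumed uniform distribution, not the package's implementation.

```python
import random


def random_subvolume(bb, size, rng=random):
    """Pick a cube of edge `size` that lies entirely inside bb = (min_bb, max_bb)."""
    min_bb, max_bb = bb
    box_start = []
    for lo, hi in zip(min_bb, max_bb):
        if hi - lo < size:
            raise ValueError("subvolume is larger than the bounding box")
        # The lower corner may sit anywhere between lo and hi - size
        box_start.append(rng.uniform(lo, hi - size))
    box_end = [s + size for s in box_start]
    return box_start, box_end


# Made-up bounding box; a seeded generator keeps the sketch reproducible
bb = ((0.0, 0.0, 0.0), (10000.0, 8000.0, 6000.0))
box_start, box_end = random_subvolume(bb, 4000, rng=random.Random(0))
print(box_start, box_end)
```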

FIBbootstrap.tpb.path_length(path_df)[source]

Calculate the length along a path.

Parameters:path_df (DataFrame) – path_df should have 4 columns, ‘id’, ‘x’, ‘y’, ‘z’. Using scipy.spatial.distance.pdist(), this method will calculate the sum of the distances between successive rows of (x, y, z) coordinates in the dataframe
Returns:length – total length of the path defined by successive points in path_df
Return type:float
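
The quantity being computed is the sum of straight-line distances between consecutive points. The sketch below shows this with math.dist from the standard library rather than scipy.spatial.distance.pdist(), and with a plain list of (x, y, z) tuples rather than a DataFrame; polyline_length is a hypothetical name.

```python
import math


def polyline_length(points):
    """Sum the distances between each pair of consecutive (x, y, z) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# Two segments of length 5 and 12
path = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
length = polyline_length(path)
print(length)
```
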
FIBbootstrap.tpb.network_length(data)[source]

Calculate the total length of a network that has been imported from the mv3d format

Parameters:data (ndarray) – network data, in the format returned by read_mv3d() or crop_tpb_data()
Returns:
  • lengths (ndarray) – lengths of each individual segment within the network
  • tot_length (float) – total length of network
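
A minimal sketch of the two return values, assuming that the first column of each row identifies the path a point belongs to and that rows of the same path are stored contiguously and in order. Both are assumptions about the data layout, and network_lengths is a hypothetical stand-in that uses plain tuples instead of an ndarray.

```python
import math
from itertools import groupby


def network_lengths(data):
    """Return (per-path lengths, total length) for rows of (id, x, y, z)."""
    lengths = []
    # Consecutive rows sharing an id are treated as one path
    for _, rows in groupby(data, key=lambda row: row[0]):
        pts = [row[1:4] for row in rows]
        lengths.append(sum(math.dist(p, q) for p, q in zip(pts, pts[1:])))
    return lengths, sum(lengths)


# Made-up network with two paths
data = [
    (0, 0.0, 0.0, 0.0), (0, 3.0, 4.0, 0.0),
    (1, 1.0, 1.0, 1.0), (1, 1.0, 1.0, 3.0), (1, 1.0, 4.0, 3.0),
]
lengths, tot_length = network_lengths(data)
print(lengths, tot_length)
```
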
FIBbootstrap.tpb.split_paths_in_network(data, threshold=0.2)[source]

Given an array of data, try to detect the paths that are on the edge of the volume and split the single path (containing a sudden long jump) into multiple paths.

Parameters:
  • data (ndarray) – array of data, like that returned by read_mv3d() or crop_tpb_data()
  • threshold (float) – weighting threshold in the range [0, 1] to help determine what is an outlier. All the data in each path is fit by a robust linear model, so any length values that are significantly different should have a weight << 1.0. Set this value higher to find more outliers, and thus split more paths. Set it lower to be more conservative.
Returns:

split_data – copy of the original data, but with paths split at points that caused particularly large jumps from point to point

Return type:

ndarray

FIBbootstrap.tpb.animate_cropped_data(data_a, data_i, data_u, size=4000, subsample=1, bb=None)[source]

Plot a simple animation of the total TPB path network, as well as the network contained within a randomly cropped volume

Parameters:
  • data_a (ndarray) – network data, in the format returned by read_mv3d() or crop_tpb_data(); will be plotted in green (active)
  • data_i (ndarray) – network data, in the format returned by read_mv3d() or crop_tpb_data(); will be plotted in red (inactive)
  • data_u (ndarray) – network data, in the format returned by read_mv3d() or crop_tpb_data(); will be plotted in yellow (unknown)
  • size (number) – size of cube to return (single edge length, so total volume enclosed will be size**3)
  • subsample (int) – factor by which to subsample the total tpb network. If > 1, it will speed up the plotting (may be helpful on lower-powered CPUs)
  • bb (None or tuple of length 2) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension from which to take a random volume; returned volume will be in the range [min_x, max_x], [min_y, max_y], and [min_z, max_z]. If None, the bounding box will be inferred from the supplied data
FIBbootstrap.tpb.get_bb_lines(bb)[source]

Given a bounding box, find the necessary vectors that will allow for plotting of the edges of the bounding box using MayaVi

Parameters:bb (tuple of length 2) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension from which to take a random volume; returned volume will be in the range [min_x, max_x], [min_y, max_y], and [min_z, max_z]
Returns:
FIBbootstrap.tpb.get_box_and_corners(box_start, box_end)[source]

Given two opposite corners of a 3D rectangle, find the necessary vectors that will allow for plotting of the edges of the rectangle and points at each of the corners using MayaVi

Parameters:
  • box_start (iterable) – lower bounding corner of the rectangle (i.e. [x, y, z])
  • box_end (iterable) – upper bounding corner of the rectangle (i.e. [x, y, z])
Returns:

  • x (ndarray) – array of x positions for mayavi.mlab.plot3d() for plotting edges of 3D rectangle
  • y (ndarray) – array of y positions for plot3d() for plotting edges of 3D rectangle
  • z (ndarray) – array of z positions for plot3d() for plotting edges of 3D rectangle
  • x_p (ndarray) – array of x positions for plot3d() for plotting corners of 3D rectangle
  • y_p (ndarray) – array of y positions for plot3d() for plotting corners of 3D rectangle
  • z_p (ndarray) – array of z positions for plot3d() for plotting corners of 3D rectangle

Utilities

FIBbootstrap.utils.calculate_errors(df, samples)[source]

Calculate the “error bars” of each column in a Pandas dataframe

Parameters:
  • df (DataFrame) – dataframe on which to calculate
  • samples (int) – number of bootstrap samples to use
Returns:

result – dataframe with - and + error values (and mean) for each column in df

Return type:

DataFrame
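
To make the idea of bootstrap error bars concrete, the sketch below resamples one column of values with replacement and reads the lower and upper errors off the distribution of resampled means. It is a generic percentile bootstrap with an assumed 95% interval; the name bootstrap_errors, the alpha and seed parameters, and the sample values are all hypothetical, and the interval definition used by calculate_errors may differ.

```python
import random
import statistics


def bootstrap_errors(values, samples, alpha=0.05, seed=0):
    """Return (minus error, plus error, mean) from a percentile bootstrap."""
    rng = random.Random(seed)
    n = len(values)
    # Mean of each resample drawn with replacement, sorted for percentile lookup
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(samples)
    )
    low = means[int((alpha / 2) * (samples - 1))]
    high = means[int((1 - alpha / 2) * (samples - 1))]
    mean = statistics.fmean(values)
    return mean - low, high - mean, mean


# Made-up measurements standing in for one dataframe column
values = [4.1, 3.9, 4.4, 4.0, 4.2, 3.8, 4.3, 4.1]
err_minus, err_plus, mean = bootstrap_errors(values, samples=2000)
print(err_minus, err_plus, mean)
```

A larger samples value gives a smoother estimate of the interval at the cost of run time, which is the trade-off behind the n_bootstrap parameter used throughout this package.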