API Summary

Skeletons

`bootstrap_skel_stats`(mv3d_pattern[, ...])	Calculate errors for skeleton statistics, as output by the Avizo subsampling script.
`find_nodes`(input_fname[, node_output, ...])	Calculate and write a data file describing the connectivity of nodes within a network (saved in an .mv3d file from Avizo).

Surfaces

`process_surface_stats_lsm_ysz`(pattern[, ...])	Calculate errors for surface statistics, as output by the Avizo subsampling script.

Tortuosity

`bootstrap_tort_stats`([csv_pattern, ...])	Calculate errors for tortuosity profiles, as output by the FIBTortuosity module.

TPB

`bootstrap_tpb_stats`([inputs_dict, box_size, ...])	Calculate statistics for various TPB properties (and their errors) using a random subvolume sampling method.
`read_mv3d`(filename)	Get number of lines and points, as well as the 3d data contained within an MV3D network file.
`write_mv3d`(fname, data[, d, overwrite])	Output path data to an mv3d file, which can be read by Avizo (and other software).
`crop_tpb_data`(data, box_start, box_end)	Crop TPB data to only contain points within the box defined by the corners box_start and box_end.
`get_bb_from_data`(data_list)	Infer from the existing data the extents of the bounding box.
`bb_volume`(bb)	Return the volume enclosed by a bounding box.
`get_random_subvolume`(bb, size)	Given a particular bounding box and subvolume size, return two lists with x, y, z coordinates of lower and upper corners of a random subvolume within the bounding box.
`path_length`(path_df)	Calculate the length along a path.
`network_length`(data)	Calculate the total length of a network that has been imported from the mv3d format.
`split_paths_in_network`(data[, threshold])	Given an array of data, try to detect the paths that are on the edge of the volume and split the single path (containing a sudden long jump) into multiple paths.
`animate_cropped_data`(data_a, data_i, data_u)	Plot a simple animation of the total TPB path network, as well as the network contained within a randomly cropped volume.
`get_bb_lines`(bb)	Given a bounding box, find the necessary vectors that will allow for plotting of the edges of the bounding box using MayaVi.
`get_box_and_corners`(box_start, box_end)	Given two opposite corners of a 3D rectangle, find the necessary vectors that will allow for plotting of the edges of the rectangle and points at each of the corners using MayaVi.

Utilities

`calculate_errors`(df, samples)	Calculate the “error bars” of each column in a Pandas dataframe.

Full Package API

Skeletons

FIBbootstrap.skeleton.bootstrap_skel_stats(mv3d_pattern, csv_pattern=None, n_bootstrap=100000, volume=None, save_output=False, data_output_fname=None, err_output_fname=None)

Calculate errors for skeleton statistics, as output by the Avizo subsampling script. Operates on many .mv3d skeleton files.

Parameters:

mv3d_pattern (str) – glob pattern to grab mv3d files to process from output of subvolume Avizo scripts. Based on these filenames, the corresponding .csv files containing the spatial graph statistics will be accessed as well. If the mv3d files are named YMdA-01_labels.view.LSM.skel.am.subvolSkel.*.mv3d, the following pattern will be used to glob for the spatial graph stats: YMdA-01_labels.view.LSM.skel.am.subvolSkel.*.csv
csv_pattern (str or None) – glob pattern to grab csv files to process from Avizo spatial graph output. If None, the pattern will be derived from the mv3d_pattern value as described above
n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
volume (None or number) – volume of analyzed data cube. If this is given, data will be returned with nodes/volume and edges/volume given, in addition to the absolute values
save_output (bool) – switch to control whether or not the output is written directly to a CSV file in the current directory
data_output_fname (None or str) – filename to use when saving the data output; if None, an appropriate string will be built from the input pattern
err_output_fname (None or str) – filename to use when saving the error output; if None, an appropriate string will be built from the input pattern

Returns:

data_df (pandas.DataFrame) – Dataframe with data from subvolume statistic calculations
error_df (DataFrame) – Dataframe with low and high errors calculated using n_bootstrap samples

FIBbootstrap.skeleton.find_nodes(input_fname, node_output='nodes.txt', save_output=True, return_type='str')

Calculate and write a data file describing the connectivity of nodes within a network (saved in an .mv3d file from Avizo)

Parameters:

input_fname (str) – .mv3d filename to read
node_output (str) – name of text file to write to, if saving the output
save_output (bool) – switch to control whether a file is written to disk (output format that matches the old findNodes.sh script)
return_type (str) – passed to numpy.ndarray.astype() to determine how the output should be formatted upon return. Default is as a string, so no information contained in the original .mv3d file is modified, but oftentimes 'float32' would be more useful for calculating statistics

Returns:

data (ndarray) – Contains the output data in a numpy array with columns of: \(k_i\), x position, y position, z position, thickness
num_edges (float) – Number of edges (E) in skeleton
num_nodes (float) – Number of nodes (N) in skeleton
mean_k (float) – Average node connectivity (<k>)
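
The relationship between the returned quantities can be illustrated with a small, dependency-free sketch. It assumes that \(k_i\) is the number of edges meeting at node i and that <k> is the plain average of those values; `connectivity_stats` and its edge-list input are hypothetical and do not reflect the .mv3d layout or the internals of find_nodes().

```python
from collections import Counter


def connectivity_stats(edges):
    """Return (degrees, num_edges, num_nodes, mean_k) for an undirected edge list.

    `edges` is a hypothetical list of (node_a, node_b) pairs, used only
    for illustration; it is not the format that find_nodes() reads.
    """
    degrees = Counter()
    for a, b in edges:
        # Every edge contributes one connection to each of its end nodes.
        degrees[a] += 1
        degrees[b] += 1
    num_edges = len(edges)
    num_nodes = len(degrees)
    mean_k = sum(degrees.values()) / num_nodes if num_nodes else 0.0
    return dict(degrees), num_edges, num_nodes, mean_k


# A three-armed star: node 0 joins three edges, nodes 1 to 3 join one each.
degrees, num_edges, num_nodes, mean_k = connectivity_stats([(0, 1), (0, 2), (0, 3)])
print(degrees, num_edges, num_nodes, mean_k)
```

Under these assumptions the average connectivity equals 2E/N, since every edge is counted once at each of its two ends.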

Surfaces

FIBbootstrap.surface.process_surface_stats_lsm_ysz(pattern, n_bootstrap=100000, save_output=False, output_fname='subvolume_errors.csv')

Calculate errors for surface statistics, as output by the Avizo subsampling script. This version operates on LSM-YSZ material names.

Parameters:

pattern (str) – glob pattern to grab csv files to process from output of subvolume Avizo scripts
n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
save_output (bool) – switch to control whether or not the output is written directly to a CSV file in the current directory
output_fname (str) – filename to use when saving the output

Returns:

error_df (DataFrame) – Dataframe with low and high errors calculated using n_bootstrap samples

Tortuosity

FIBbootstrap.tortuosity.bootstrap_tort_stats(csv_pattern=None, n_bootstrap=100000, thresh=0.75, save_output=False, data_output_fname=None, err_output_fname=None)

Calculate errors for tortuosity profiles, as output by the FIBTortuosity module. Operates on many .csv files, each with a single tortuosity profile for a phase and direction (e.g. LSM-x, Pore-y, YSZ-z, etc.)

Parameters:

csv_pattern (str) – glob pattern to grab csv files to process from output of tortuosity calculations. Usually, this will be something like: os.path.join(<path holding files>, "*.csv")
n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
thresh (float) – value between 0 and 1, defining from what portion of the profiles to calculate the errors. For example, for the default value of 0.75, the error in the tortuosity for the last 25% of euclidean distance values will be calculated. A thresh value of 0.0 would calculate the error on the whole profile. Usually, only a small portion towards the end of the profile is desired, so one can analyze how much the data was still changing there.
save_output (bool) – switch to control whether or not the bootstrap data and error output is written directly to a CSV file in the current directory
data_output_fname (None or str) – filename to use when saving the data output; if None, an appropriate string will be built from the input pattern
err_output_fname (None or str) – filename to use when saving the error output; if None, an appropriate string will be built from the input pattern

Returns:

data_df (pandas.DataFrame) – Dataframe with data from subvolume statistic calculations
error_df (DataFrame) – Dataframe with low and high errors calculated using n_bootstrap samples
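
The effect of thresh can be pictured with a short sketch. It assumes the cut is taken as a fraction of the largest euclidean distance in the profile; the package may instead cut by row count or in some other way, and `profile_tail` is a hypothetical helper, not part of FIBbootstrap.

```python
def profile_tail(profile, thresh=0.75):
    """Keep the (distance, tortuosity) pairs at or beyond thresh * max distance.

    Assumption: the threshold is applied to the distance range, not to
    the number of rows in the profile.
    """
    if not 0.0 <= thresh <= 1.0:
        raise ValueError("thresh must be between 0 and 1")
    max_distance = max(d for d, _ in profile)
    cutoff = thresh * max_distance
    return [(d, t) for d, t in profile if d >= cutoff]


profile = [(0, 1.0), (25, 1.4), (50, 1.3), (75, 1.25), (100, 1.2)]
tail = profile_tail(profile)         # default 0.75 keeps distances 75 and 100
whole = profile_tail(profile, 0.0)   # 0.0 keeps the whole profile
print(tail)
```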

TPB

FIBbootstrap.tpb.bootstrap_tpb_stats(inputs_dict=None, box_size=4000, n_volumes=500, n_bootstrap=100000, save_output=False, data_output_fname=None, err_output_fname=None, output_avg=False)

Calculate statistics for various TPB properties (and their errors) using a random subvolume sampling method. Total TPB length (for the subvolume), TPB density, and average length of a TPB path will be calculated.

Parameters:

inputs_dict (dict) – dictionary of values describing the input data. Keys should be labels for a particular set of TPB paths, while the values should be filenames for mv3d skeleton files of the TPB paths.
box_size (float) – length of edge of cube used to define the subsampled volumes. The total volume sampled in each trial will be box_size**3, and the boxes will be selected randomly throughout the volume of data
n_volumes (int) – number of subvolumes to sample from the volume (typically around 500)
n_bootstrap (int) – number of bootstrap samples to use when calculating confidence intervals
save_output (bool) – switch to control whether or not the bootstrap data and error output is written directly to a CSV file in the current directory
data_output_fname (str) – filename to use when saving the data output
err_output_fname (str) – filename to use when saving the error output
output_avg (bool) – switch to control whether “Avg TPB path length” will be calculated

Returns:

data_df (DataFrame) – Dataframe with data from subvolume statistic calculations
error_df (DataFrame) – Dataframe with low and high errors (and std. dev.) calculated using n_bootstrap samples

Example

>>> from FIBbootstrap.tpb import bootstrap_tpb_stats
>>> inputs_dict = {
...     'active':'smoothActive.savg.mv3d',
...     'inactive':'smoothInactive.savg.mv3d',
...     'unknown':'smoothUnknown.savg.mv3d'}
>>> data_out, \
... error_out = bootstrap_tpb_stats(inputs_dict,
...                                 n_volumes=500,
...                                 box_size=4000,
...                                 save_output=False,
...                                 data_output_fname='data_N500_s4000.csv',
...                                 err_output_fname='errors_N500_s4000.csv',
...                                 output_avg=False)

FIBbootstrap.tpb.read_mv3d(filename)

Get number of lines and points, as well as the 3d data contained within an MV3D network file

Parameters:

filename (str) – Name of `.mv3d` file to open

Returns:

data (`ndarray`) – (N x 4) numpy array containing the index, x, y, and z coordinates of each point within the network (the last value, `d`, is discarded)
num_lines (`int`) – number of lines contained in the network (read from line 2 of the file)
num_points (`int`) – number of points contained in the network (read from line 3 of the file)

FIBbootstrap.tpb.write_mv3d(fname, data, d=0, overwrite=True)

Output path data to an mv3d file, which can be read by Avizo (and other software)

Parameters:

fname (str) – Filename to which to write; will be overwritten if it exists (by default)
data (ndarray) – array containing network (spatial graph) data in the same format as output by read_mv3d() or crop_tpb_data()
d (number or ndarray) – the value to be written in the ‘thickness’ column of the mv3d file. Can be used to tag files with a scalar value. If a single number is given, that number will be used for every point. If a numpy array of the same length as the data is given, each point gets its own value. Standard pandas/numpy broadcasting rules apply
overwrite (bool) – switch to control whether an existing file will be clobbered

FIBbootstrap.tpb.crop_tpb_data(data, box_start, box_end)

Crop TPB data to only contain points within the box defined by the corners box_start and box_end

Parameters:

data (`ndarray`) – (N x 4) numpy array with TPB data (as loaded by `read_mv3d()`), including the first column containing the index
box_start (list or `ndarray`) – x, y, z coordinates of lower bound corner to crop inside of
box_end (list or `ndarray`) – x, y, z coordinates of upper bound corner to crop inside of

Returns:

cropped_data (`ndarray`) – copy of the original data, containing only the points inside of the crop box
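
As a rough illustration of the cropping step, the sketch below filters (id, x, y, z) rows against a box using plain Python lists. `crop_points` is a hypothetical stand-in for illustration only, and treating the box bounds as inclusive is an assumption; the package function operates on numpy arrays and may handle the boundary differently.

```python
def crop_points(rows, box_start, box_end):
    """Return the (id, x, y, z) rows whose coordinates lie inside the box.

    Assumption: points exactly on a box face count as inside.
    """
    return [
        row for row in rows
        if all(lo <= c <= hi for c, lo, hi in zip(row[1:4], box_start, box_end))
    ]


rows = [(0, 1.0, 1.0, 1.0), (0, 5.0, 5.0, 5.0), (1, 2.0, 9.0, 2.0)]
cropped = crop_points(rows, box_start=[0, 0, 0], box_end=[4, 4, 4])
print(cropped)
```

The input list is left untouched, which mirrors the "copy of the original data" behaviour described above.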

FIBbootstrap.tpb.get_bb_from_data(data_list)

Infer from the existing data the extents of the bounding box

Parameters:

data_list (list) – list or array of numpy arrays with data (in the format returned by `read_mv3d()` or `crop_tpb_data()`). Expects 4 columns representing [‘id’, ‘x’, ‘y’, ‘z’].

Returns:

min_bb (`tuple`) – the smallest x, y, z coordinates found in the data_list
max_bb (`tuple`) – the largest x, y, z coordinates found in the data_list

FIBbootstrap.tpb.bb_volume(bb)

Return the volume enclosed by a bounding box.

Parameters:

bb (`tuple` of length 2) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension

Returns:

volume (float) – volume enclosed by the bounding box
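
The bounding-box convention (a pair of minimum and maximum corners) can be sketched without numpy as follows. Both helpers are hypothetical illustrations written for this page, not the package's code; they only assume the (id, x, y, z) row layout described above.

```python
from math import prod


def bounding_box(data_list):
    """Return (min_bb, max_bb) over every (id, x, y, z) row in every dataset."""
    coords = [row[1:4] for data in data_list for row in data]
    min_bb = tuple(min(axis) for axis in zip(*coords))
    max_bb = tuple(max(axis) for axis in zip(*coords))
    return min_bb, max_bb


def box_volume(bb):
    """Multiply the extent of the box along x, y, and z."""
    min_bb, max_bb = bb
    return prod(hi - lo for lo, hi in zip(min_bb, max_bb))


active = [(0, 0.0, 0.0, 0.0), (0, 2.0, 1.0, 4.0)]
inactive = [(0, -1.0, 3.0, 2.0)]
bb = bounding_box([active, inactive])
print(bb, box_volume(bb))
```

Here the extents are 3, 3, and 4 along x, y, and z, so the enclosed volume is 36.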

FIBbootstrap.tpb.get_random_subvolume(bb, size)

Given a particular bounding box and subvolume size, return two lists with x, y, z coordinates of lower and upper corners of a random subvolume within the bounding box. Returned subvolume will be completely enclosed within bb (i.e. the maximum x value for the lower corner of the cube will be bb[1][0] - size).

Parameters:

bb (tuple) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension from which to take a random volume; returned volume will be in the range [min_x, max_x], [min_y, max_y], and [min_z, max_z]
size (number) – size of cube to return (single edge length, so total volume enclosed will be size**3)

Returns:

box_start (list) – lower bound corner of the subvolume box (x, y, and z)
box_end (list) – upper bound corner of the subvolume box (x, y, and z)
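
One way to guarantee that the cube stays inside the bounding box is to draw the lower corner from the range [min, max - size] on each axis. The sketch below does this with the standard library; `random_subvolume` and its `rng` argument are hypothetical, and uniform sampling is an assumption about how the corner is chosen.

```python
import random


def random_subvolume(bb, size, rng=random):
    """Pick a cube of edge `size` that lies entirely inside `bb`."""
    min_bb, max_bb = bb
    if any(hi - lo < size for lo, hi in zip(min_bb, max_bb)):
        raise ValueError("size is larger than the bounding box")
    # The lower corner may sit anywhere between min and (max - size) per axis.
    box_start = [rng.uniform(lo, hi - size) for lo, hi in zip(min_bb, max_bb)]
    box_end = [c + size for c in box_start]
    return box_start, box_end


bb = ((0, 0, 0), (10000, 8000, 6000))
box_start, box_end = random_subvolume(bb, 4000, rng=random.Random(0))
print(box_start, box_end)
```

Passing a seeded `random.Random` instance makes the selection repeatable, which can help when debugging a sampling run.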

FIBbootstrap.tpb.path_length(path_df)

Calculate the length along a path.

Parameters:

path_df (DataFrame) – path_df should have 4 columns, ‘id’, ‘x’, ‘y’, ‘z’. Using `scipy.spatial.distance.pdist()`, this method will calculate the sum of the distances between successive rows of (x, y, z) coordinates in the dataframe

Returns:

length (float) – total length of the path defined by successive points in `path_df`
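
The quantity being computed, the sum of distances between successive points, can be written in a few lines of standard-library Python. This sketch uses `math.dist` rather than scipy, and `polyline_length` is a hypothetical name used only here.

```python
from math import dist


def polyline_length(points):
    """Sum the Euclidean distances between consecutive (x, y, z) points."""
    return sum(dist(p, q) for p, q in zip(points, points[1:]))


path = [(0, 0, 0), (3, 4, 0), (3, 4, 12)]
print(polyline_length(path))  # segments of length 5 and 12 give 17.0
```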

FIBbootstrap.tpb.network_length(data)

Calculate the total length of a network that has been imported from the mv3d format

Parameters:

data (ndarray) – network data, in the format returned by `read_mv3d()` or `crop_tpb_data()`

Returns:

lengths (`ndarray`) – lengths of each individual segment within the network
tot_length (`float`) – total length of network

FIBbootstrap.tpb.split_paths_in_network(data, threshold=0.2)

Given an array of data, try to detect the paths that are on the edge of the volume and split the single path (containing a sudden long jump) into multiple paths.

Parameters:

data (`ndarray`) – array of data, like that returned by `read_mv3d()` or `crop_tpb_data()`
threshold (float) – weighting threshold in the range [0, 1] to help determine what is an outlier. All the data in each path is fit by a robust linear model, so any length values that are significantly different should have a weight << 1.0. Set this value higher to find more outliers, and thus split more paths. Set it lower to be more conservative.

Returns:

split_data (`ndarray`) – copy of the original data, but with paths split at points that caused particularly large jumps from point to point
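
The idea of splitting a path at a sudden long jump can be shown with a deliberately simpler rule than the robust linear model described above: the sketch below splits wherever a step is more than `factor` times the median step length. This median rule is a swapped-in heuristic for illustration, and `split_at_jumps` and `factor` are hypothetical; it does not reproduce the weighting used by split_paths_in_network().

```python
from math import dist
from statistics import median


def split_at_jumps(points, factor=5.0):
    """Split one path wherever a step exceeds `factor` times the median step."""
    if len(points) < 3:
        return [list(points)]
    steps = [dist(p, q) for p, q in zip(points, points[1:])]
    limit = factor * median(steps)
    pieces, current = [], [points[0]]
    for step, point in zip(steps, points[1:]):
        if step > limit:
            # The jump ends the current piece; the next point starts a new one.
            pieces.append(current)
            current = []
        current.append(point)
    pieces.append(current)
    return pieces


path = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (50, 0, 0), (51, 0, 0), (52, 0, 0)]
pieces = split_at_jumps(path)
print(pieces)
```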

FIBbootstrap.tpb.animate_cropped_data(data_a, data_i, data_u, size=4000, subsample=1, bb=None)

Plot a simple animation of the total TPB path network, as well as the network contained within a randomly cropped volume

Parameters:

data_a (ndarray) – network data, in the format returned by read_mv3d() or crop_tpb_data(); will be plotted in green (active)
data_i (ndarray) – network data, in the format returned by read_mv3d() or crop_tpb_data(); will be plotted in red (inactive)
data_u (ndarray) – network data, in the format returned by read_mv3d() or crop_tpb_data(); will be plotted in yellow (unknown)
size (number) – size of cube to return (single edge length, so total volume enclosed will be size**3)
subsample (int) – factor by which to subsample the total tpb network. If > 1, it will speed up the plotting (may be helpful on lower-powered CPUs)
bb (None or tuple of length 2) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension from which to take a random volume; returned volume will be in the range [min_x, max_x], [min_y, max_y], and [min_z, max_z]. If None, the bounding box will be inferred from the supplied data

FIBbootstrap.tpb.get_bb_lines(bb)

Given a bounding box, find the necessary vectors that will allow for plotting of the edges of the bounding box using MayaVi

Parameters:

bb (`tuple` of length 2) – tuple of length two, each term should be an iterable of length three with the minimum (position 0) and maximum (position 1) bounding box coordinate in each dimension

Returns:

x (`ndarray`) – array of x positions for `mayavi.mlab.plot3d()` for plotting edges of 3D rectangle
y (`ndarray`) – array of y positions for `plot3d()` for plotting edges of 3D rectangle
z (`ndarray`) – array of z positions for `plot3d()` for plotting edges of 3D rectangle

FIBbootstrap.tpb.get_box_and_corners(box_start, box_end)

Given two opposite corners of a 3D rectangle, find the necessary vectors that will allow for plotting of the edges of the rectangle and points at each of the corners using MayaVi

Parameters:

box_start (iterable) – lower bounding corner of the rectangle (i.e. [x, y, z])
box_end (iterable) – upper bounding corner of the rectangle (i.e. [x, y, z])

Returns:

x (ndarray) – array of x positions for mayavi.mlab.plot3d() for plotting edges of 3D rectangle
y (ndarray) – array of y positions for plot3d() for plotting edges of 3D rectangle
z (ndarray) – array of z positions for plot3d() for plotting edges of 3D rectangle
x_p (ndarray) – array of x positions for plot3d() for plotting corners of 3D rectangle
y_p (ndarray) – array of y positions for plot3d() for plotting corners of 3D rectangle
z_p (ndarray) – array of z positions for plot3d() for plotting corners of 3D rectangle

Utilities

FIBbootstrap.utils.calculate_errors(df, samples)

Calculate the “error bars” of each column in a Pandas dataframe

Parameters:

df (DataFrame) – dataframe on which to calculate the errors
samples (int) – number of bootstrap samples to use

Returns:

result (DataFrame) – dataframe with `-` and `+` error values (and mean) for each column in df
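
For readers unfamiliar with bootstrapped error bars, the sketch below shows one common variant, a percentile bootstrap of the mean, for a single column of values. It is an assumption that calculate_errors() works along these lines; `bootstrap_error_bars`, the 95% `level`, and the `seed` argument are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def bootstrap_error_bars(values, samples=1000, level=0.95, seed=None):
    """Return (minus, plus, mean) from a percentile bootstrap of the mean."""
    rng = random.Random(seed)
    n = len(values)
    centre = mean(values)
    # Resample with replacement and record the mean of every resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(samples))
    lo_idx = int((1 - level) / 2 * (samples - 1))
    hi_idx = int((1 + level) / 2 * (samples - 1))
    return centre - means[lo_idx], means[hi_idx] - centre, centre


lengths = [10.2, 9.8, 11.5, 10.9, 9.4, 10.1, 12.0, 10.6]
minus, plus, avg = bootstrap_error_bars(lengths, samples=2000, seed=1)
print(minus, plus, avg)
```

The two error values need not be equal, which is why the result is reported as separate `-` and `+` columns rather than a single symmetric value.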