Multi-Session Processing Guide#
This guide covers processing multiple 2-photon microscopy recordings as a session, with inter-sequence alignment and valid mask computation.
Overview#
Multi-session processing is essential for longitudinal studies where you record from the same field of view across multiple time points or conditions. The session module provides:
Motion correction of individual recordings
Cross-registration between recordings
Valid mask computation for consistent analysis regions
HPC support for large-scale processing
When to Use Session Processing#
Use session processing when you have:
Multiple recordings from the same field of view
Longitudinal imaging sessions (days/weeks apart)
Different experimental conditions on same neurons
Need for pixel-perfect alignment across recordings
Basic Workflow#
1. Prepare Your Data#
Organize recordings in a single directory:
experiment/
├── baseline_001.tif
├── baseline_002.tif
├── stimulus_001.tif
├── stimulus_002.tif
└── recovery_001.tif
2. Create Configuration#
Create session.toml:
# Data location
root = "/path/to/experiment/"
pattern = "*.tif"
# Optional: specify center reference
# center = "baseline_002.tif"
# Output paths
output_root = "compensated_outputs"
final_results = "final_results"
# Processing options
resume = true
scheduler = "local"
n_workers = -1 # Stage 1 workers (-1 = all CPU cores)
# Optical flow backend
flow_backend = "flowreg"
# Stage 2 alignment parameters
cc_upsample = 4 # Subpixel accuracy
sigma_smooth = 6.0 # Gaussian smoothing
alpha_between = 25.0 # Regularization
iterations_between = 100
stage2_constancy_assumption = "gc" # Options: "gc", "gray", "cs"
3. Run Processing#
Option A: Python Script
from pyflowreg.session import SessionConfig
from pyflowreg.session.stage1_compensate import run_stage1
from pyflowreg.session.stage2_between_avgs import run_stage2
from pyflowreg.session.stage3_valid_mask import run_stage3
# Load configuration
config = SessionConfig.from_toml("session.toml")
# Run all stages
output_folders = run_stage1(config)
middle_idx, center_file, displacements = run_stage2(config)
final_mask = run_stage3(config, middle_idx, displacements)
Option B: Command Line
# Run complete pipeline
pyflowreg-session run --config session.toml
# Or run stages individually
pyflowreg-session run --config session.toml --stage 1
pyflowreg-session run --config session.toml --stage 2
pyflowreg-session run --config session.toml --stage 3
Deep Dive: Session Pipeline#
The pyflowreg.session pipeline always runs the same three deterministic stages and records progress in status.json files so every step can resume safely. Knowing what each stage consumes and produces makes it easier to debug, scale to clusters, or mix MATLAB and Python tooling.
Stage overview#
Stage |
Inputs |
Key operations |
Primary outputs |
Resume markers |
|---|---|---|---|---|
1. Per-recording compensation |
Recordings discovered with |
|
|
|
2. Inter-sequence alignment |
Stage 1 temporal averages, center from |
|
|
|
3. Valid mask & reprojection |
Stage 1 masks, Stage 2 displacement fields |
|
|
|
Stage 1 – Per-recording compensation#
discover_input_files()builds the sorted worklist before any processing begins (seesrc/pyflowreg/session/stage1_compensate.py).Each recording gets its own directory under
output_root/<stem>/and runs throughcompensate_recording()withsave_valid_idx=True, producing the motion-corrected video andidx.hdfmask stack.After compensation, the helper verifies that
compensated.hdf5(case-insensitive) contains the expected frame count to guard against half-written files.Temporal averages are streamed with
compute_and_save_temporal_average(), so even very long sequences never have to fit entirely in RAM.
Outputs: compensated.hdf5, idx.hdf, temporal_average.npy, status.json with "stage1": "done", plus optional statistics.npz.
Stage 2 – Inter-sequence alignment#
Temporal averages are reloaded from disk and the reference recording (center) is selected automatically or from
SessionConfig.center.compute_between_displacement()smooths both averages, applies phase cross-correlation for a rigid guess, then refines with the configured flow backend (src/pyflowreg/session/stage2_between_avgs.py).stage2_constancy_assumptioncontrols the Stage 2 data term. The default"gc"preserves MATLAB Flow-Registration behavior;"cs"enables the census term for the nativeflowregbackend.Results are written to
w_to_reference.npz(separateu/varrays) so MATLAB users can load them directly.
Outputs: w_to_reference.npz, per-recording status.json updates, and middle_idx (0-based) pointing to the reference average.
Stage 3 – Valid mask & aligned stacks#
load_idx_and_compute_mask()reduces eachidx.hdfto a boolean mask via a temporal AND; raw and aligned masks are saved for inspection.Masks are warped into the session reference frame with
imregister_binary(), mirroring MATLAB’sget_session_valid_index_v3.Each
compensated.hdf5can optionally be reprojected into the reference grid viaalign_sequence_video()— controlled bySessionConfig.align_chunk_sizeandSessionConfig.align_output_format.The final mask is the logical AND of all aligned masks; the routine writes
.npz,.mat, and.pngvariants and caches aligned videos asaligned_<recording>.<ext>.
Outputs: final_valid_idx.{png,npz,mat}, per-recording *_valid_idx.png/*_valid_idx_aligned.png, optional aligned_<recording>.<ext>, and final_results/status.json.
Result artifacts at a glance#
compensated_outputs/
└── recording_X/
├── compensated.hdf5
├── idx.hdf
├── temporal_average.npy
├── w_to_reference.npz
└── status.json
final_results/
├── final_valid_idx.{png,npz,mat}
├── aligned_recording_X.tif # format set by align_output_format
└── status.json
Resume strategy#
Stage 1 re-runs automatically if
compensated.hdf5is missing or its frame count no longer matches the source.Stage 2 skips work whenever
w_to_reference.npzalready exists.Stage 3 checks both
final_valid_idx.pngandfinal_results/status.jsonbefore recomputing.Because each artifact is deterministic, deleting a corrupt file and re-running just that stage is safe.
CLI & scheduler coordination#
pyflowreg-session runorchestrates all three stages locally;run_stage1_array()lets HPC array jobs handle Stage 1 while Stages 2–3 run as follow-up jobs.The CLI prints reminders about running Stages 2–3 once all array tasks finish, and you can always call the Python APIs directly.
For Dask,
pyflowreg-session daskspins up a jobqueue cluster, parallelizes Stage 1, then executes Stages 2–3 serially on the client.
Advanced Configuration#
Quality vs Speed Trade-offs#
[flow_options]
quality_setting = "fast" # Options: fast, balanced, quality
buffer_size = 1000 # Frames per batch
save_w = false # Don't save displacement fields
save_valid_idx = true # Required for Stage 3
# constancy_assumption = "cs" # Optional Stage 1 data term override
Alternatively, point to a saved MATLAB/Python options file:
flow_options = "./saved_options/session_stage1.json"
GPU Acceleration#
Use PyTorch backend with CUDA:
flow_backend = "flowreg_torch"
[backend_params]
device = "cuda:0"
Custom Center Reference#
By default, the lexicographic middle file is used as reference. Override with:
center = "specific_recording.tif"
HPC / Cluster Processing#
SLURM Array Jobs#
For large datasets, process Stage 1 in parallel:
submit_stage1.sh:
#!/bin/bash
#SBATCH --job-name=session_stage1
#SBATCH --array=1-20%5 # 20 recordings, max 5 parallel
#SBATCH --time=2:00:00
#SBATCH --mem=16G
#SBATCH --cpus-per-task=4
module load python/3.9
python -c "
from pyflowreg.session import SessionConfig
from pyflowreg.session.stage1_compensate import run_stage1_array
config = SessionConfig.from_toml('session.toml')
config.scheduler = 'array'
run_stage1_array(config)
"
submit_stages23.sh:
#!/bin/bash
#SBATCH --job-name=session_stages23
#SBATCH --dependency=afterok:${STAGE1_JOB_ID}
#SBATCH --time=1:00:00
#SBATCH --mem=32G
module load python/3.9
python -c "
from pyflowreg.session import SessionConfig
from pyflowreg.session.stage2_between_avgs import run_stage2
from pyflowreg.session.stage3_valid_mask import run_stage3
config = SessionConfig.from_toml('session.toml')
middle_idx, center_file, displacements = run_stage2(config)
final_mask = run_stage3(config, middle_idx, displacements)
"
Submit sequence:
STAGE1_JOB=$(sbatch submit_stage1.sh | awk '{print $4}')
sbatch --dependency=afterok:${STAGE1_JOB} submit_stages23.sh
SGE/PBS Support#
The session module auto-detects array task IDs from:
SLURM_ARRAY_TASK_IDSGE_TASK_IDPBS_ARRAY_INDEX
Understanding the Output#
Directory Structure#
After processing:
experiment/
├── compensated_outputs/
│ ├── baseline_001/
│ │ ├── compensated.hdf5 # Motion-corrected video
│ │ ├── temporal_average.npy # Mean image
│ │ ├── idx.hdf # Per-frame valid masks
│ │ ├── w_to_reference.npz # Displacement to reference
│ │ └── status.json # Processing status
│ ├── baseline_002/
│ │ └── ...
│ └── ...
└── final_results/
├── final_valid_idx.png # Visual mask
├── final_valid_idx.npz # Complete results
├── final_valid_idx.mat # MATLAB compatible
├── *_valid_idx.png # Per-sequence masks
└── *_valid_idx_aligned.png # Aligned masks
Loading Results#
import numpy as np
from pathlib import Path
# Load final mask
results = np.load("final_results/final_valid_idx.npz")
final_mask = results['final_valid']
# Access all data
print(f"Valid pixels: {np.sum(final_mask)}/{final_mask.size}")
print(f"Reference recording: {results['middle_idx']}")
# Load motion-corrected videos
from pyflowreg.util.io.factory import get_video_file_reader
reader = get_video_file_reader("compensated_outputs/baseline_001/compensated.hdf5")
video = reader[:]
reader.close()
# Apply mask to analysis
masked_video = video[:, final_mask] # Shape: (T, n_valid_pixels)
Troubleshooting#
Memory Issues#
Problem: Stage 1 runs out of memory
Solution: Reduce buffer size:
config.flow_options = {"buffer_size": 500}
run_stage1(config)
Incomplete Files#
Problem: Crashed job left incomplete HDF5
Solution: Session module auto-detects and reruns:
Verifies frame count matches input
Uses atomic writes for crash safety
Resume enabled by default
Poor Alignment#
Problem: Recordings don’t align well
Solutions:
Increase iterations:
iterations_between = 200
Adjust regularization:
alpha_between = 15.0 # Lower = less smooth
Manually select better reference:
center = "clearest_recording.tif"
Array Job Failures#
Problem: Some array tasks fail
Solution: Resubmit only failed tasks:
# Check which completed
ls compensated_outputs/*/status.json | wc -l
# Rerun specific task
SLURM_ARRAY_TASK_ID=5 python -c "..."
Best Practices#
1. Start Small#
Test on subset first:
config.pattern = "*_001.tif" # Test with first of each condition
2. Verify Stage 1#
Check temporal averages before proceeding:
import matplotlib.pyplot as plt
for folder in output_folders:
avg = np.load(folder / "temporal_average.npy")
plt.figure()
plt.imshow(avg, cmap='gray')
plt.title(folder.name)
3. Monitor Displacement Magnitudes#
Large displacements indicate problems:
for w in displacement_fields:
magnitude = np.sqrt(w[..., 0]**2 + w[..., 1]**2)
print(f"Max displacement: {np.max(magnitude):.1f} pixels")
4. Save Intermediate Results#
Enable for debugging:
[flow_options]
save_w = true # Save displacement fields
save_meta_info = true # Save statistics
Integration with Analysis#
CaImAn Integration#
import caiman as cm
# Load using final mask
imgs = cm.load("compensated_outputs/*/compensated.hdf5")
imgs = imgs[:, final_mask]
# Run CNMF with consistent ROIs across sessions
cnm = cm.source_extraction.cnmf.CNMF(...)
cnm.fit(imgs)
Suite2P Integration#
# Export masked videos for Suite2P
for h5_path in results['compensated_h5_paths']:
video = load_video(h5_path)
masked = video[:, final_mask]
save_for_suite2p(masked)
Performance Optimization#
Multi-threading Control#
Prevent thread oversubscription in parallel processing:
import os
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'
Batch Size Tuning#
Larger batches = better performance but more memory:
# For 16GB RAM
config.flow_options = {"buffer_size": 1000}
# For 64GB RAM
config.flow_options = {"buffer_size": 5000}
Storage Considerations#
Use local scratch for temporary files
Output to parallel filesystem (Lustre/GPFS)
Enable compression for final outputs:
config.flow_options = {"compression": "gzip"}
MATLAB Interoperability#
Results are fully compatible with MATLAB Flow-Registration:
% Load Python results in MATLAB
load('final_results/final_valid_idx.mat');
% Use with MATLAB analysis
masked_pixels = video(final_valid);
Mix processing stages:
# Stage 1 in MATLAB
matlab -batch "align_full_v3_checkpoint('session.toml')"
# Stages 2-3 in Python
pyflowreg-session run --config session.toml --stage 2
pyflowreg-session run --config session.toml --stage 3