E(3)-Pose
In this repository, we present E(3)-Pose, the first symmetry-aware framework for 6-DoF object pose estimation from volumetric images that uses an E(3)-equivariant convolutional neural network (E(3)-CNN). Although we evaluate the utility of E(3)-Pose on fetal brain MRI, the proposed methods hold potential for broader applications.
We rapidly estimate pose from volumes in a two-step process that separately estimates translation and rotation:
- Translation Estimation:
- A standard segmentation U-Net localizes the object in the volume.
- The center-of-mass (CoM) of the predicted mask is the estimated origin of the canonical object coordinate frame.
- Rotation Estimation:
- We crop input volumes such that the predicted segmentation mask is scaled to 60% of the cropped dimensions.
- The E(3)-CNN takes in the cropped volume as input, and outputs an E(3)-equivariant rotation parametrization consisting of 2 vectors and 1 pseudovector.
- The output rotation is computed by choosing the pseudovector direction that ensures right-handedness, and orthonormalizing via singular value decomposition (SVD); see the sketch below.
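For concreteness, here is a minimal sketch (in NumPy, not the repository's actual implementation) of how the center-of-mass translation estimate and the rotation assembly from two vectors and one pseudovector can look. The variable names (`mask`, `v1`, `v2`, `p`) and the column-stacking convention are illustrative assumptions.

```python
import numpy as np

def center_of_mass(mask):
    """Voxel-space center of mass of a binary segmentation mask (translation estimate)."""
    coords = np.argwhere(mask > 0)       # (N, 3) voxel indices inside the mask
    return coords.mean(axis=0)           # estimated origin of the canonical frame

def rotation_from_outputs(v1, v2, p):
    """Assemble a rotation from two equivariant vectors and one pseudovector.

    The pseudovector sign is chosen so the frame is right-handed, and the
    stacked matrix is orthonormalized via singular value decomposition (SVD).
    """
    if np.dot(p, np.cross(v1, v2)) < 0:      # flip pseudovector to enforce right-handedness
        p = -p
    M = np.stack([v1, v2, p], axis=1)        # columns hold the predicted axes (assumed convention)
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt                               # nearest orthogonal matrix to M
    if np.linalg.det(R) < 0:                 # guard against a residual reflection
        U[:, -1] *= -1
        R = U @ Vt
    return R
```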
Our E(3)-CNN architecture builds on prior theoretical work on 3D steerable CNNs [1] and uses code borrowed from e3nn-UNet [2], which implements 3D convolutions with the e3nn [3] Python library for building E(3)-equivariant networks.
Overall, E(3)-Pose outperforms state-of-the-art methods for pose estimation in fetal brain MRI volumes representative of clinical applications, including strategies that rely on anatomical landmark detection (Fetal-Align [4]), template registration (FireANTs [5] and EquiTrack [6]), and direct pose regression with standard CNNs (3DPose-Net [7], 6DRep [8], RbR [9]). See the figure below for example results. In particular, we show in our paper that regularizing network parameters to conform to physical symmetries mitigates overfitting to research-quality training datasets and permits better generalization to out-of-distribution clinical data with pose ambiguities.
The full article describing this method is available at:
Equivariant Symmetry-Aware Head Pose Estimation for Fetal MRI
Muthukrishnan, Gagoski, Lee, Grant, Adalsteinsson, Golland, Billot
arXiv (2025)
[ arxiv | bibtex | project page]
Installation
- Clone this repository.
- Edit the environment prefix in environment.yml and then install all dependencies:
cd E3-Pose/
conda env create -f environment.yml
conda activate e3pose
pip install -r requirements.txt
- Install pytorch3d
- If you want to use our trained model weights for fetal brain MRI, download the model weights here.
- If you want to train your own network on a publicly available fetal MRI dataset [10], download our manually annotated segmentations and poses here.
You're now ready to use E(3)-Pose!
Usage
This repository contains all the code necessary to train and test your own networks. We provide separate scripts for training the segmentation U-Net and the E(3)-CNN, and a single script to deploy both for full rigid pose estimation.
Training a Segmentation U-Net for Translation Estimation
Set up separate training/validation dataset directories for images and ground-truth segmentation labels, where file names between image and label directories are the same. Ensure that all image file extensions are .nii or .nii.gz.
If you are training a multi-class segmentation network, ensure that the object for which you want to estimate pose has category label 1 in the ground-truth labels.
Name the output directory to save all model weights and metrics during network training.
To train the segmentation U-Net, run:
python scripts/train_unet.py train_image_dir/ train_label_dir/ val_image_dir/ val_label_dir/ output_dir/
For detailed descriptions of other arguments, run:
python scripts/train_unet.py -h
Training an E(3)-CNN for Rotation Estimation
Set up separate training/validation dataset directories for images and ground-truth segmentation labels, where file names between image and label directories are the same. Ensure that all image file extensions are .nii or .nii.gz. If your segmentation labels have multiple classes, ensure that the object for which you want to estimate pose has category label 1.
Set up separate CSV files for rotation annotations in training and validation datasets, in the following format:
frame_id rot_x rot_y rot_z
...      ...   ...   ...
where frame_id is the file name of the volume without the file extension, and rot_x, rot_y, rot_z are the Euler angles in degrees of the rotation from the volume to the canonical coordinate frame. The Euler angles assume the "xyz" ordering convention.
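If your ground-truth rotations are stored as matrices, the snippet below is one way to produce annotations with the expected "xyz" Euler convention using SciPy. It is only an illustration: the exact delimiter and header handling expected by the training script should be checked against scripts/train_e3cnn.py, and write_annotations is a hypothetical helper, not part of this repository.

```python
import csv
import numpy as np
from scipy.spatial.transform import Rotation

def write_annotations(csv_path, rotations):
    """rotations: dict mapping frame_id -> 3x3 rotation matrix (volume -> canonical frame)."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame_id", "rot_x", "rot_y", "rot_z"])
        for frame_id, R in rotations.items():
            # Euler angles in degrees, "xyz" ordering convention
            rot_x, rot_y, rot_z = Rotation.from_matrix(R).as_euler("xyz", degrees=True)
            writer.writerow([frame_id, rot_x, rot_y, rot_z])

# Example (identity rotation for a single volume named sub-001.nii.gz):
# write_annotations("train_annotations.csv", {"sub-001": np.eye(3)})
```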
Name the output directory to save all model weights and metrics during network training.
To train the E(3)-CNN, run:
python scripts/train_e3cnn.py train_image_dir/ train_label_dir/ path_to_train_annotations.csv \
    val_image_dir/ val_label_dir/ path_to_val_annotations.csv \
    output_dir/
For detailed descriptions of other arguments, run:
python scripts/train_e3cnn.py -h
Running Rigid Pose Estimation with Trained Model Weights
Set up an input directory of images (all file extensions must be .nii or .nii.gz) on which to run rigid pose estimation.
Name the output directory to save all estimated poses.
To estimate pose on all inputs, run:
python scripts/inference.py input_image_dir/ output_dir/ path_to_segmentation_unet.ckpt path_to_e3cnn.pth
For detailed descriptions of other arguments, run:
python scripts/inference.py -h
Output poses are saved as 4x4 transform matrices in .npy format in the output directory, where file names are the same as the inputs.
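As an illustration (the file name below is hypothetical), a saved pose can be loaded with NumPy and split into its rotation and translation components:

```python
import numpy as np

pose = np.load("output_dir/example_volume.npy")    # 4x4 rigid transform matrix
R = pose[:3, :3]                                   # estimated rotation
t = pose[:3, 3]                                    # estimated translation (canonical-frame origin)
print("rotation:\n", R, "\ntranslation:", t)
```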
Citation/Contact
If you find this work useful for your research, please cite:
Equivariant Symmetry-Aware Head Pose Estimation for Fetal MRI
Muthukrishnan, Gagoski, Lee, Grant, Adalsteinsson, Golland, Billot
arXiv (2025)
[ arxiv | bibtex | project page]
If you have any questions regarding the usage of this code, or any suggestions to improve it, please raise an issue (preferred) or contact us at:
[email protected]
References
[1] 3D steerable CNNs: Learning rotationally equivariant features in volumetric data
Weiler, Geiger, Welling, Boomsma, Cohen
Advances in Neural Information Processing Systems, 2018
[2] Leveraging SO(3)-steerable convolutions for pose-robust semantic segmentation in 3D medical data
Diaz, Geiger, McKinley
Journal of Machine Learning in Biomedical Imaging, 2024
[3] e3nn: Euclidean neural networks
Geiger and Smidt
arXiv, 2022
[4] Rapid head-pose detection for automated slice prescription of fetal-brain MRI
Hoffmann, Abaci Turk, Gagoski, Morgan, Wighton, Tisdall, Reuter, Adalsteinsson, Grant, Wald, van der Kouwe
International Journal of Imaging Systems and Technology, 2021
[5] FireANTs: Adaptive Riemannian optimization for multi-scale diffeomorphic registration
Jena, Chaudhari, Gee
arXiv, 2024
[6] SE(3)-equivariant and noise-invariant 3D rigid motion tracking in brain MRI
Billot, Dey, Moyer, Hoffmann, Abaci Turk, Gagoski
IEEE Transactions on Medical Imaging, 2024
[7] Real-time deep pose estimation with geodesic loss for image-to-template rigid registration
Salehi, Khan, Erdogmus, Gholipour
IEEE Transactions on Medical Imaging, 2019
[8] Automatic brain pose estimation in fetal MRI
Faghihpirayesh, Karimi, Erdogmus, Gholipour
Proceedings of SPIE: Medical Imaging: Image Processing, 2023
[9] Registration by Regression (RbR): A framework for interpretable and flexible atlas registration
Gopinath, Hu, Hoffmann, Puonti, Iglesias
International Workshop on Biomedical Image Registration, 2024
[10] The developing human connectome project fetal functional MRI release: Methods and data structures
Karolis et al.
Imaging Neuroscience, 2025


