Markerless pose estimation of animals (particularly rodents) in unrestricted environments has played a key role in my
doctoral research. Markerless is important here, since it means no piercings, IR markers, or odors on the body
that might impede or alter the natural behavior of the animal. Having access to a reliable and remarkably accurate way of
extracting pose from videos has opened many new avenues of experimental design in neuroscience and the physics of
behavior that were previously closed. There are now quite a handful of such network architectures and toolboxes,
and it is becoming increasingly hard to stay on top of all these cool methods. So here I keep a note of such methods
for my own future reference, and maybe you
will find a new tool to add to your arsenal as well.
I will do my best to update this post as I come across any new additions.
Oh wait, I almost forgot one of the most important aspects of these tools: they are all open-source software, free to use (yay!) and licensed with minimal restrictions. Go open research!
So here goes a list (in no specific order):
- DeepLabCut (or DLC): DeepLabCut features a ResNet backbone (an overview here) with a few convolutional layers added on top that convolve the features into per-bodypart heatmaps. It uses transfer learning, starting from pre-trained weights to ease the training process, and does surprisingly well with a training dataset of just a few hundred frames. Since it first came out there have been quite a few significant updates, including nice GUIs for the labelling and training process, 3D triangulation using 2-camera recordings, and recent support for multi-animal tracking.
- sLEAP: Social LEAP (sLEAP) tries to tackle social (aka multi-animal) tracking by taking two approaches: a top-down approach, which looks for individual animals first and then tracks body parts on each animal (in single-animal cropped images), and a bottom-up approach, which finds all instances of body parts in the image and groups them into the different individuals. They propose their approach as backbone-agnostic and show examples with ResNet and UNet architectures as the backbone.
- DeepPoseKit: DPK uses a stacked encoder-decoder model for tracking and provides a nice, slightly customized implementation of the DLC and LEAP (the precursor of sLEAP, a lightweight but capable network) architectures, employing connections between body parts in the heatmaps at the output layers. It also allows a tradeoff between network speed and accuracy by cleverly letting you tune the heatmap dimensions.
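All three networks above ultimately read keypoints off per-bodypart heatmaps, so a quick sketch of that decoding step may help. This is my own toy illustration (the array shapes and the Gaussian bump are assumptions, not any toolbox's actual code):

```python
import numpy as np

def heatmap_to_keypoints(heatmaps):
    """Decode a (n_bodyparts, H, W) stack of heatmaps into (x, y, confidence)."""
    keypoints = []
    for hm in heatmaps:
        y, x = np.unravel_index(np.argmax(hm), hm.shape)  # location of the peak
        keypoints.append((int(x), int(y), float(hm[y, x])))
    return keypoints

# Toy heatmap: one body part, a Gaussian bump centered at (x=40, y=25).
yy, xx = np.mgrid[0:64, 0:64]
hm = np.exp(-((xx - 40) ** 2 + (yy - 25) ** 2) / (2 * 3.0 ** 2))
kps = heatmap_to_keypoints(hm[None])  # -> [(40, 25, 1.0)]
```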
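sLEAP's top-down route, in particular, is easy to caricature: detect each animal, crop, run single-animal pose estimation on the crop, then shift the keypoints back into full-frame coordinates. Here `detect_animals` and `estimate_pose` are hypothetical stand-ins for the two model stages, not sLEAP's API:

```python
import numpy as np

def top_down_poses(frame, detect_animals, estimate_pose):
    """Top-down multi-animal tracking: per-animal crops, then per-crop pose."""
    poses = []
    for (x0, y0, x1, y1) in detect_animals(frame):
        crop = frame[y0:y1, x0:x1]
        kps = estimate_pose(crop)                         # crop-relative keypoints
        poses.append([(x + x0, y + y0) for x, y in kps])  # back to frame coords
    return poses

# Stub stages: one detected animal, one body part found at (5, 7) in the crop.
frame = np.zeros((100, 100))
boxes = lambda f: [(10, 20, 50, 60)]
pose = lambda crop: [(5, 7)]
result = top_down_poses(frame, boxes, pose)  # -> [[(15, 27)]]
```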
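And DeepPoseKit's speed/accuracy knob is essentially the output stride: a heatmap at 1/s of the image resolution is cheaper to produce but quantizes keypoints into s-pixel bins. A toy illustration of that quantization (my own numbers, not DPK's):

```python
def decode_with_stride(peak_index, stride):
    """Map a peak index from a stride-s heatmap back to image pixels (bin center)."""
    return peak_index * stride + stride // 2

# A true keypoint at x = 131 px, recovered from heatmaps of different strides:
x4 = decode_with_stride(131 // 4, 4)  # stride 4 -> 130, a 1 px error
x8 = decode_with_stride(131 // 8, 8)  # stride 8 -> 132, a 3 px error but cheaper
```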
In addition to these networks, there have also been numerous other toolkits that enhance the capabilities of the tracking toolkits above. In no particular order, these are:
- Anipose: Anipose provides 3D tracking by triangulating keypoints detected in 2D images with DeepLabCut.
- Open Monkey Studio: Provides a very nice implementation of 3D tracking of rhesus macaques using a multi-camera system, and is backbone-agnostic (DLC, DPK, LEAP).
- LiftPose3D: LiftPose3D is a tool that predicts 3D skeletons from 2D tracked points, from a single camera! You do need some 3D tracking data to train the network first.
- OptiFlex: OptiFlex predicts subsequent heatmaps from model outputs (DLC, DPK, LEAP) using optical flow, then uses a convolution to essentially take a weighted sum over a batch of consecutive optical-flow-predicted heatmaps to generate the final heatmap for bodypart inference.
- DeepGraphPose: DGP is built upon DLC and uses a probabilistic graphical model that subjects the heatmaps from DLC to spatial and temporal constraints via structured variational inference.
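To make the triangulation step behind Anipose-style 3D reconstruction concrete: each camera's projection matrix plus a 2D detection contributes two linear constraints on the 3D point, and stacking cameras gives a homogeneous least-squares problem (the classic DLT). The cameras below are synthetic toys, not Anipose's API:

```python
import numpy as np

def triangulate(proj_mats, points_2d):
    """DLT: recover a 3D point from its 2D detections in two or more views."""
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        rows.append(u * P[2] - P[0])  # each view contributes two linear equations
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.array(rows))
    X = vt[-1]                        # null-space vector = homogeneous 3D point
    return X[:3] / X[3]

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two toy cameras: identity intrinsics, second camera shifted 1 unit in x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), [[-1.0], [0.0], [0.0]]])
X_true = np.array([1.0, 2.0, 5.0])
X_hat = triangulate([P1, P2], [project(P1, X_true), project(P2, X_true)])
```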
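The idea behind LiftPose3D-style lifting can be shown with a deliberately simplified stand-in: given paired 2D/3D training data (in the toy below the 3D pose really is a linear function of the 2D pose), a regression fitted on those pairs can lift a single camera's 2D detections into 3D. LiftPose3D itself uses a deep network, not least squares; everything here is assumed for illustration only:

```python
import numpy as np

rng = np.random.default_rng(42)

# 5 keypoints: 10 numbers in 2D, 15 in 3D. Paired training data would come
# from a multi-camera ground-truth session; here it is synthesized.
d2, d3, n = 10, 15, 200
L_true = rng.normal(size=(d2, d3))      # hidden "lifting" relationship
train_2d = rng.normal(size=(n, d2))
train_3d = train_2d @ L_true            # paired 2D/3D training poses

# "Train the lifter": ordinary least squares in place of a neural network.
L_hat, *_ = np.linalg.lstsq(train_2d, train_3d, rcond=None)

# At inference, a single camera's 2D pose is lifted straight to 3D.
test_2d = rng.normal(size=(1, d2))
pred_3d = test_2d @ L_hat
```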
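OptiFlex's fusion step reduces to a weighted sum over flow-warped heatmaps from neighboring frames, which is easy to sketch. The toy one-hot heatmaps below are my own; note how averaging across frames suppresses a spurious peak that only the center frame contains:

```python
import numpy as np

def fuse_heatmaps(warped_heatmaps, weights):
    """Weighted sum of optical-flow-warped heatmaps from consecutive frames."""
    w = np.asarray(weights, dtype=float)
    return np.tensordot(w / w.sum(), warped_heatmaps, axes=1)

# Three consecutive (already flow-warped) heatmaps for one body part.
hms = np.zeros((3, 9, 9))
hms[:, 5, 5] = [0.8, 1.0, 0.9]  # all frames agree on a peak at (5, 5)
hms[1, 2, 2] = 1.2              # spurious blob in the center frame only

fused = fuse_heatmaps(hms, [1, 1, 1])
peak = np.unravel_index(np.argmax(fused), fused.shape)  # -> (5, 5)
```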
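Finally, the flavor of a "temporal constraint" on heatmaps, as in DeepGraphPose, can be caricatured with a score that trades peak confidence against frame-to-frame jumps. DGP actually performs structured variational inference in a graphical model; the greedy pass below is only a cartoon of that intuition, with made-up candidates and weights:

```python
import numpy as np

def smooth_track(candidates, confidences, lam=0.1):
    """Pick one candidate peak per frame, penalizing jumps from the last pick."""
    track = [candidates[0][int(np.argmax(confidences[0]))]]
    for cands, confs in zip(candidates[1:], confidences[1:]):
        prev = np.array(track[-1])
        scores = [c - lam * np.sum((np.array(p) - prev) ** 2)
                  for p, c in zip(cands, confs)]
        track.append(cands[int(np.argmax(scores))])
    return track

# Frame 2 has a slightly stronger but far-away (likely spurious) peak;
# the temporal penalty keeps the track on the nearby candidate.
track = smooth_track(candidates=[[(5, 5)], [(5, 6), (20, 20)]],
                     confidences=[[1.0], [0.6, 0.9]])  # -> [(5, 5), (5, 6)]
```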