Disclaimer
The framework is at a very early stage of development. We plan to stabilize the public APIs after the first release.
This project is a collaborative effort of researchers at the Jagiellonian University and the AGH University.
ChildLab is an organization at the JU Institute of Psychology dedicated to researching children's psychological development. This framework serves as a tool assisting the examination of parent-child interactions: it automates laborious manual data labeling, allowing the researchers to focus on the analysis itself.
The main features of the framework are:
ChildLab Framework is designed in a component-based manner: you can freely mix and match components to build complex pipelines, called procedures.
Observing scenes from multiple angles and sharing information between their frames of reference has always been crucial to the project. ChildLab Framework provides methods for camera calibration, 3D transformation estimation, and result visualization.
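For intuition only, the following sketch shows one standard way to estimate a rigid 3D transformation between two camera frames of reference from corresponding points (the Kabsch algorithm). The function and variable names are made up for the example; this is not the framework's own calibration or estimation API.

```python
import numpy as np

def estimate_rigid_transform(
    points_a: np.ndarray,  # (N, 3) points expressed in camera A's frame
    points_b: np.ndarray,  # (N, 3) the same points expressed in camera B's frame
) -> tuple[np.ndarray, np.ndarray]:
    """Find rotation R and translation t such that points_b ≈ points_a @ R.T + t."""
    centroid_a = points_a.mean(axis=0)
    centroid_b = points_b.mean(axis=0)

    # Cross-covariance of the centred point sets
    h = (points_a - centroid_a).T @ (points_b - centroid_b)

    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))

    # Guard against a reflection solution
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = centroid_b - rotation @ centroid_a

    return rotation, translation
```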
One of the main components is YOLO-based pose detection. It can serve as a basis for gesture detection, classification, movement tracking, and many other common use cases.
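Within the framework this is wrapped by pose.Estimator (see the API example below). Purely as an illustration of the underlying idea, a raw YOLO pose prediction with the ultralytics package might look as follows; the checkpoint name is an assumption, and this is not ChildLab Framework's own interface.

```python
# Illustrative only: raw YOLO pose inference with the `ultralytics` package,
# not ChildLab Framework's pose.Estimator. The checkpoint name is an assumption.
from ultralytics import YOLO

model = YOLO('yolov8n-pose.pt')  # any YOLO pose checkpoint
results = model('frame.png')     # a path, a numpy array or a list of frames

# Keypoints of every detected person in the first input image
keypoints = results[0].keypoints
print(keypoints.xy)    # (num_people, num_keypoints, 2) pixel coordinates
print(keypoints.conf)  # per-keypoint confidence scores
```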
The gaze analysis component is based on mini-face, our lightweight wrapper over the OpenFace library. It allows for easy and comprehensive gaze estimation through a simple Python interface.
ChildLab Framework provides various novel methods of detecting joint attention, i.e. the shared focus of two or more individuals on the same object. It aims to fill current research gaps in 3D attention tracking.
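As a toy illustration of the geometry involved (and not the framework's actual algorithm), one simple proxy for a shared gaze target is checking whether two gaze rays in a common 3D frame pass close to each other. The names and the threshold below are invented for the example.

```python
import numpy as np

def gaze_rays_converge(
    origin_a: np.ndarray, direction_a: np.ndarray,
    origin_b: np.ndarray, direction_b: np.ndarray,
    threshold: float = 0.1,  # metres; an arbitrary value for the example
) -> bool:
    """Check whether the lines supporting two gaze rays pass within `threshold` of each other."""
    direction_a = direction_a / np.linalg.norm(direction_a)
    direction_b = direction_b / np.linalg.norm(direction_b)

    # The shortest distance between two lines is the projection of the
    # offset between their origins onto their common normal.
    normal = np.cross(direction_a, direction_b)
    offset = origin_b - origin_a

    if np.linalg.norm(normal) < 1e-9:
        # (Anti)parallel directions: fall back to point-to-line distance.
        distance = np.linalg.norm(np.cross(offset, direction_a))
    else:
        distance = abs(offset @ normal) / np.linalg.norm(normal)

    return bool(distance < threshold)
```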
For basic usage, ChildLab Framework provides a CLI:
python3 -m child_lab_framework calibrate --help
python3 -m child_lab_framework estimate-transformations --help
python3 -m child_lab_framework process --help
The framework's components are exposed as a public API. Below you can find an example of its usage; for a fuller picture, please refer to the demo.
from pathlib import Path
import torch
from child_lab_framework.core import video
from child_lab_framework.postprocessing import imputed_with_closest_known_reference
from child_lab_framework.task import pose, face, gaze, visualization
# Create video I/O components
reader = video.Reader(
video.Input(
'my_video',
Path('my_video.mp4'),
calibration=None
),
batch_size=10,
)
video_properties = reader.properties
writer = video.Writer(
Path('output.mp4'),
video_properties,
output_format=video.Format.MP4,
)
# Create a visualization component
visualizer = visualization.Visualizer(
properties=video_properties,
configuration=visualization.VisualizationConfiguration(),
)
# Perform computation on a GPU
device = torch.device('cuda')
# Prepare estimators
pose_estimator = pose.Estimator(
device,
input=video_properties,
max_detections=2,
threshold=0.5,
)
face_estimator = face.Estimator(
device,
input=video_properties,
confidence_threshold=0.5,
suppression_threshold=0.1,
)
gaze_estimator = gaze.Estimator(input=video_properties)
# Read frames
frames = reader.read_batch()
assert frames is not None
# Predict and impute poses
poses = imputed_with_closest_known_reference(
pose_estimator.predict_batch(frames)
)
# Predict and impute faces matched with poses
faces = (
imputed_with_closest_known_reference(
face_estimator.predict_batch(frames, poses)
)
if poses is not None
else None
)
# Estimate and impute gaze based on faces
gazes = (
imputed_with_closest_known_reference(
gaze_estimator.predict_batch(
frames,
faces,
)
)
if faces is not None
else None
)
# Visualize the results
annotated_frames = visualizer.annotate_batch(frames, poses, gazes)
# Save the results
writer.write_batch(annotated_frames)
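The snippet above processes a single batch of frames. A minimal sketch of extending it to a whole video, assuming that reader.read_batch() returns None once the stream is exhausted (as the assertion above suggests), could look like this:

```python
# Sketch only: loop the same batch pipeline over the entire video,
# assuming read_batch() yields None at the end of the stream.
while (frames := reader.read_batch()) is not None:
    poses = imputed_with_closest_known_reference(
        pose_estimator.predict_batch(frames)
    )

    faces = (
        imputed_with_closest_known_reference(
            face_estimator.predict_batch(frames, poses)
        )
        if poses is not None
        else None
    )

    gazes = (
        imputed_with_closest_known_reference(
            gaze_estimator.predict_batch(frames, faces)
        )
        if faces is not None
        else None
    )

    writer.write_batch(visualizer.annotate_batch(frames, poses, gazes))
```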
Roadmap:
- Basic framework structure
- Asynchronous processing
- Camera calibration
- Depth estimation
- 3D transformation estimation
- Pose estimation
- Face estimation
- Gaze analysis
- Joint attention detection
- 3D visualization
- Emotion recognition
- General web GUI
- PyPI deployment