Scientific Publications

Journal Articles

Find all the journal articles and conference papers produced by the ELIAS partners, presenting the latest scientific findings of the project.

GraphMLP: A graph MLP-like architecture for 3D human pose estimation

Abstract: Modern multi-layer perceptron (MLP) models have shown competitive results in learning visual representations without self-attention. However, existing MLP models are not good at capturing local details and lack prior knowledge of human body configurations, which limits their modeling power for skeletal representation learning. To address these issues, we propose a simple yet effective graph-reinforced MLP-Like architecture, named GraphMLP, that combines MLPs and graph convolutional networks (GCNs) in a global-local-graphical unified architecture for 3D human pose estimation. GraphMLP incorporates the graph structure of human bodies into an MLP model to meet the domain-specific demand of the 3D human pose, while allowing for both local and global spatial interactions. Furthermore, we propose to flexibly and efficiently extend the GraphMLP to the video domain and show that complex temporal dynamics can be effectively modeled in a simple way with negligible computational cost gains in the sequence length. To the best of our knowledge, this is the first MLP-Like architecture for 3D human pose estimation in a single frame and a video sequence. Extensive experiments show that the proposed GraphMLP achieves state-of-the-art performance on two datasets, i.e., Human3.6M and MPI-INF-3DHP. Code and models are available at https://github.com/Vegetebird/GraphMLP.

Type of Publication: Journal article

Title of Journal: Pattern Recognition, 158, 2025.

Authors: Li, Wenhao; Liu, Mengyuan; Liu, Hong; Guo, Tianyu; Wang, Ti; Tang, Hao; Sebe, Nicu

HYRE: Hybrid Regressor for 3D Human Pose and Shape Estimation

Abstract: Regression-based 3D human pose and shape estimation often falls into one of two different paradigms. Parametric approaches, which regress the parameters of a human body model, tend to produce physically plausible but image-mesh misaligned results. In contrast, non-parametric approaches directly regress human mesh vertices, resulting in pixel-aligned but unreasonable predictions. In this paper, we consider these two paradigms together for a better overall estimation. To this end, we propose a novel HYbrid REgressor (HYRE) that greatly benefits from the joint learning of both paradigms. The core of our HYRE is a hybrid intermediary across paradigms that provides complementary clues to each paradigm at the shared feature level and fuses their results at the part-based decision level, thereby bridging the gap between the two. We demonstrate the effectiveness of the proposed method through both quantitative and qualitative experimental analyses, resulting in improvements for each approach and ultimately leading to better hybrid results. Our experiments show that HYRE outperforms previous methods on challenging 3D human pose and shape benchmarks.

Type of Publication: Journal article

Title of Journal: IEEE Transactions on Image Processing, 34(1), 235-246, 2025.

Authors: Li, Wenhao; Liu, Mengyuan; Liu, Hong; Ren, Bin; Li, Xia; You, Yingxuan; Sebe, Nicu

Connectivity-Driven Pseudo-Labeling Makes Stronger Cross-Domain Segmenters

Abstract: Presently, pseudo-labeling stands as a prevailing approach in cross-domain semantic segmentation, enhancing model efficacy by training with pixels assigned reliable pseudo-labels. However, we identify two key limitations within this paradigm: (1) under relatively severe domain shifts, most selected reliable pixels appear speckled and remain noisy; (2) when dealing with wild data, some pixels belonging to the open-set class may exhibit high confidence and also appear speckled. These two points make it difficult for the pixel-level selection mechanism to identify and correct this speckled closed-set and open-set noise. As a result, error accumulation is continuously introduced into subsequent self-training, leading to inefficiencies in pseudo-labeling. To address these limitations, we propose a novel method called Semantic Connectivity-driven Pseudo-labeling (SeCo). SeCo formulates pseudo-labels at the connectivity level, which makes it easier to locate and correct closed-set and open-set noise. Specifically, SeCo comprises two key components: Pixel Semantic Aggregation (PSA) and Semantic Connectivity Correction (SCC). Initially, PSA categorizes semantics into “stuff” and “things” categories and aggregates speckled pseudo-labels into semantic connectivity through efficient interaction with the Segment Anything Model (SAM). This not only yields accurate boundaries but also simplifies noise localization. Subsequently, SCC introduces a simple connectivity classification task, which enables us to locate and correct connectivity noise with the guidance of loss distribution. Extensive experiments demonstrate that SeCo can be flexibly applied to various cross-domain semantic segmentation tasks, i.e., domain generalization and domain adaptation, even including source-free and black-box domain adaptation, significantly improving the performance of existing state-of-the-art methods. The code is available at https://github.com/DZhaoXd/SeCo.

Type of Publication: Conference paper