Music-Driven Enhanced Dance Performance Generation by Integrating Seq2Seq and Human Pose Recognition

doi:10.28991/HIJ-2026-07-02-011

Authors

Wen He
hewen202577808@163.com
First College of Arts, Chengdu Sport University, Chengdu, 641418, China
https://orcid.org/0009-0007-3247-0735

Vol. 7 No. 2 (2026): June

Research Articles

Abstract

Music-driven dance generation remains limited in naturalness and rhythm synchronization. To address this, an enhanced dance generation model integrating sequence-to-sequence (Seq2Seq) modeling and human pose recognition was developed to improve the synchronization, naturalness, and structural consistency of generated movements. The model takes multi-scale music features as input, extracts temporal music semantics with a bidirectional long short-term memory (BiLSTM) network and an attention mechanism, and refines motion structure using skeleton keypoint feedback, thereby jointly modeling music semantics and human motion. Experimental results on the AIST++ and DanceTrack datasets show that the proposed model achieves a beat alignment error as low as 0.12 s, a joint point error of 11.2 px, and a motion smoothness score of 2.41. When generating a 90-second dance sequence, the beat error is reduced by more than 32% compared with mainstream models, and the model scores 0.97 in the evaluation of complex dance symmetries such as “arm-lifting rotation.” These results indicate that jointly modeling music semantics and skeletal structure improves movement coordination and rhythm matching in dance generation, producing natural and coordinated movements that adapt to different dance styles.
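The abstract reports beat alignment error in seconds but does not define how it is computed. As a minimal sketch, assuming one common reading of the metric (the mean absolute time offset between each detected motion beat and its nearest music beat), it could be computed as follows. The function name `beat_alignment_error` and the two beat-time lists are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left
from typing import Sequence


def beat_alignment_error(music_beats: Sequence[float],
                         motion_beats: Sequence[float]) -> float:
    """Mean absolute offset, in seconds, from each motion beat to the
    nearest music beat. ASSUMPTION: this definition is illustrative and
    may differ from the one used in the paper."""
    if not music_beats or not motion_beats:
        raise ValueError("both beat lists must be non-empty")
    beats = sorted(music_beats)
    total = 0.0
    for t in motion_beats:
        i = bisect_left(beats, t)
        # Only the music beats immediately before and after t can be nearest.
        candidates = beats[max(i - 1, 0):i + 1]
        total += min(abs(t - b) for b in candidates)
    return total / len(motion_beats)


# Hypothetical beat times in seconds (not data from the paper).
music = [0.0, 0.5, 1.0, 1.5]
motion = [0.1, 0.45, 1.2]
error = beat_alignment_error(music, motion)
print(f"beat alignment error: {error:.3f} s")
```

Under this reading, a lower value means the generated movement accents fall closer to the musical beats; the 0.12 s figure in the abstract would be interpreted on the same scale only if the paper uses a comparable definition.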
