Enhanced embedding learning for CTR prediction based on LightGCN
Abstract: Click-through rate (CTR) prediction based on feature interaction modeling has been widely explored and has made considerable progress. Such methods alleviate the loss of useful information, but they depend to some extent on the co-occurrence of different features and therefore suffer from feature sparsity. To address the problem that feature representations cannot be learned efficiently when features occur only a few times during interaction, this paper proposes LGCDFM (LightGCN with DeepFM), a CTR prediction model that enhances embedding-layer learning with light graph convolution. The initial embedding layer adopts a divide-and-conquer learning strategy that distinguishes different types of nodes in the graph structure: information is first propagated among nodes of the same type to ensure feature occurrence frequency, and interactions between high-order connected nodes of different types then capture multi-hop neighbor information. LightGCN offers strong feature extraction and representation learning while discarding the feature transformations and nonlinear activation functions that do not benefit interaction, which is an advantage in collaborative filtering tasks over simple user-item interaction data and effectively mitigates feature sparsity. Finally, the representation learning layer applies DeepFM, a classic CTR prediction model, to learn high-order and low-order feature combinations end to end, learning from sparse data through latent vectors to improve CTR prediction performance. Experiments on two public datasets, Criteo and Avazu, show that the model outperforms existing methods on both CTR prediction and the feature sparsity problem.
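The propagation idea mentioned in the abstract (neighborhood aggregation with no feature transformation and no nonlinear activation) can be illustrated with a minimal pure-Python sketch. It follows the generic LightGCN rule of symmetric degree normalization followed by a uniform average over layers; it is not the LGCDFM implementation and does not model the paper's separation of node types. The function name `lightgcn_propagate`, the edge list, and the toy embeddings are all hypothetical.

```python
import math


def lightgcn_propagate(edges, embeddings, num_layers):
    """Propagate embeddings over an undirected graph, LightGCN style.

    edges:       list of (u, v) node-index pairs (hypothetical toy input)
    embeddings:  list of per-node embedding vectors (layer 0)
    num_layers:  number of propagation layers K
    """
    n = len(embeddings)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    degree = [len(nb) for nb in neighbors]

    layers = [embeddings]
    current = embeddings
    for _ in range(num_layers):
        nxt = []
        for u in range(n):
            agg = [0.0] * len(current[u])
            for v in neighbors[u]:
                # Symmetric normalization 1 / sqrt(d_u * d_v); no weight
                # matrix and no activation function are applied.
                norm = 1.0 / math.sqrt(degree[u] * degree[v])
                for d in range(len(agg)):
                    agg[d] += norm * current[v][d]
            nxt.append(agg)
        layers.append(nxt)
        current = nxt

    # Layer combination: uniform average over layers 0..K.
    return [
        [sum(layer[u][d] for layer in layers) / len(layers)
         for d in range(len(embeddings[u]))]
        for u in range(n)
    ]


if __name__ == "__main__":
    # Two nodes joined by one edge, one propagation layer.
    print(lightgcn_propagate([(0, 1)], [[1.0, 0.0], [0.0, 1.0]], 1))
```

In this toy case each node simply receives its neighbor's vector, so after averaging with layer 0 both nodes end up with the same embedding. A rarely seen feature node would borrow signal from its neighbors in the same way, which is the intuition behind using propagation against sparsity.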
Key words:
- CTR prediction
- embedding learning
- feature interaction
- LightGCN
Table 1. Dataset statistics

| Dataset | Samples | Feature fields | Features |
|---|---|---|---|
| Criteo | 45,840,617 | 39 | 998,960 |
| Avazu | 40,428,967 | 23 | 1,544,488 |
Table 2. Performance comparison of models

| Model | Criteo AUC | Criteo RI-AUC | Criteo Logloss | Criteo RI-Logloss | Avazu AUC | Avazu RI-AUC | Avazu Logloss | Avazu RI-Logloss |
|---|---|---|---|---|---|---|---|---|
| DeepFM | 0.8016 | 1.21% | 0.4498 | 1.91% | 0.7653 | 1.59% | 0.3854 | 0.83% |
| PNN | 0.7983 | 1.63% | 0.4530 | 2.60% | 0.7658 | 1.53% | 0.3856 | 0.88% |
| GIN | 0.8009 | 1.29% | 0.4517 | 2.32% | 0.7758 | 0.22% | 0.3829 | 0.18% |
| FiGNN | 0.8062 | 0.63% | 0.4453 | 0.92% | 0.7762 | 0.17% | 0.3825 | 0.08% |
| LGCDFM | 0.8113 | 0.00% | 0.4412 | 0.00% | 0.7775 | 0.00% | 0.3822 | 0.00% |
Table 3. Feature sparsity analysis in Criteo

| Feature | Frequency | DeepFM (Logloss) | LGCDFM (Logloss) |
|---|---|---|---|
| F_1 | 12 | 0.3658 | 0.2328 |
| F_2 | 4 | 0.3212 | 0.3112 |
| F_3 | 9 | 0.6233 | 0.5692 |
| F_4 | 10 | 0.0832 | 0.0326 |
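The RI columns of Table 2 are not defined in this excerpt. The sketch below assumes RI means the relative improvement of LGCDFM over each baseline, measured against the baseline value (gain in AUC, reduction in Logloss). The helper names `relative_improvement` and `ri_table` are hypothetical; the numbers are transcribed from Table 2.

```python
# (AUC, Logloss) pairs transcribed from Table 2.
TABLE2 = {
    "Criteo": {
        "DeepFM": (0.8016, 0.4498),
        "PNN": (0.7983, 0.4530),
        "GIN": (0.8009, 0.4517),
        "FiGNN": (0.8062, 0.4453),
        "LGCDFM": (0.8113, 0.4412),
    },
    "Avazu": {
        "DeepFM": (0.7653, 0.3854),
        "PNN": (0.7658, 0.3856),
        "GIN": (0.7758, 0.3829),
        "FiGNN": (0.7762, 0.3825),
        "LGCDFM": (0.7775, 0.3822),
    },
}


def relative_improvement(baseline, proposed, higher_is_better):
    """Percentage gain of `proposed` over `baseline` (assumed RI definition)."""
    gain = proposed - baseline if higher_is_better else baseline - proposed
    return 100.0 * gain / baseline


def ri_table(dataset):
    """Return {model: (RI-AUC, RI-Logloss)} of LGCDFM relative to each model."""
    ours_auc, ours_logloss = TABLE2[dataset]["LGCDFM"]
    return {
        model: (
            relative_improvement(auc, ours_auc, higher_is_better=True),
            relative_improvement(logloss, ours_logloss, higher_is_better=False),
        )
        for model, (auc, logloss) in TABLE2[dataset].items()
    }


if __name__ == "__main__":
    for dataset in TABLE2:
        for model, (ri_auc, ri_logloss) in ri_table(dataset).items():
            print(f"{dataset:6s} {model:7s} "
                  f"RI-AUC={ri_auc:.2f}% RI-Logloss={ri_logloss:.2f}%")
```

Under this assumed definition the recomputed values agree with the reported RI columns to within about 0.01 percentage points, which is consistent with rounding of the underlying metrics.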