Keynote Speaker: Prof. Jiawei Han (University of Illinois at Urbana-Champaign)
Title of Keynote: Building Structured Information Networks from Massive Unstructured Data
Abstract: We have been studying the mining of massive information networks. However, people may wonder where those networks come from. Except for those easily built from structured (e.g., relational) data, such information networks may not really exist and need to be constructed from massive real-world unstructured data, such as natural language text. In this talk, we introduce a set of methods developed recently in our group for constructing structured information networks from massive text data, including mining quality phrases, entity recognition and typing, multi-faceted taxonomy construction, and construction of heterogeneous information networks. We show that a data-driven approach could be a promising direction for transforming massive text data into structured information networks, on which many methods can be further developed for heterogeneous information network analysis.
Biography: Jiawei Han is the Michael Aiken Chair Professor at the University of Illinois at Urbana-Champaign. He has been conducting research in data mining, information network analysis, database systems, and data warehousing, with over 900 journal and conference publications. He has chaired or served on many program committees of international conferences, including most data mining and database conferences. He also served as the founding Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data and as the Director of the Information Network Academic Research Center supported by the U.S. Army Research Lab, and is the co-Director of KnowEnG, an NIH-funded Center of Excellence in Big Data Computing (2014-2019). He is a Fellow of ACM and a Fellow of IEEE, and received the 2004 ACM SIGKDD Innovations Award, the 2005 IEEE Computer Society Technical Achievement Award, the 2009 M. Wallace McDowell Award from the IEEE Computer Society, and Japan’s 2018 Funai Achievement Award. His co-authored book "Data Mining: Concepts and Techniques" has been widely adopted as a textbook worldwide.
Keynote Speaker: Prof. Nitesh Chawla (University of Notre Dame)
Title of Keynote: Representation Learning on Heterogeneous Graph: From Shallow to Deep Embedding
Abstract: Representation learning on graphs provides an alternative to feature engineering for designing the feature vectors used by learning algorithms. The goal of representation learning is to embed nodes or (sub-)graphs by learning a mapping to a lower-dimensional vector space. However, heterogeneous graphs present their own set of challenges for representation learning, given their multi-typed nodes and/or links. In addition to the heterogeneity in node and link types, the content associated with the nodes presents yet another challenge. In this talk, I'll discuss our work on representation learning in heterogeneous graphs.
Biography: Nitesh Chawla is the Frank M. Freimann Professor of Computer Science and Engineering and director of the Center for Network and Data Sciences at the University of Notre Dame. His research is focused on machine learning, data science, and network science. He has published over 200 papers, which have received more than 25,000 citations, and has an h-index of 57. His papers have received several outstanding paper nominations and awards at top conferences and journals. He is the recipient of the 2015 IEEE CIS Outstanding Early Career Award, the IBM Watson Faculty Award, the IBM Big Data and Analytics Faculty Award, the National Academy of Engineering New Faculty Fellowship, and the 1st Source Bank Technology Commercialization Award. In recognition of the societal and community impact of his research, he received the Rodney F. Ganey Award and the Michiana 40 Under 40 honor. He is also a two-time recipient of the Outstanding Teaching Award at Notre Dame. He is a Fellow of the Initiative of Global Development; Fellow of the Kellogg Institute for International Studies; Fellow of the Riley Center for Science, Technology and Values; and Fellow of the Kroc Institute of Peace Studies. He is a co-founder of Aunalytics, a data science software and solutions company.
Keynote Speaker: Dr. Hongxia Yang (Alibaba Group)
Title of Keynote: Extremely Large Scale Graph Neural Network in Practice
Abstract: An increasing number of machine learning tasks require dealing with large graph datasets, which capture rich and complex relationships among potentially billions of elements. Graph Neural Networks (GNNs) have become an effective way to address the graph learning problem by converting the graph data into a low-dimensional space, while keeping both the structural and property information to the maximum extent, and constructing a neural network for training and inference. However, it is challenging to provide efficient graph storage and computation capabilities that facilitate GNN training and enable the development of new GNN algorithms. In this talk, we present a comprehensive graph neural network system, namely AliGraph, which consists of distributed graph storage, optimized sampling operators, and a runtime to efficiently support not only existing popular GNNs but also a series of in-house developed ones for different scenarios. The system is currently deployed at Alibaba to support a variety of business scenarios, including product recommendation and personalized search on Alibaba’s e-commerce platform. In extensive experiments on a real-world dataset with 492.90 million vertices, 6.82 billion edges, and rich attributes, AliGraph performs an order of magnitude faster in terms of graph building (5 minutes vs. the hours reported for the state-of-the-art PowerGraph platform). During training, AliGraph runs 40%-50% faster with the novel caching strategy and demonstrates a speedup of around 12 times with the improved runtime. In addition, our in-house developed GNN models all show statistically significant superiority in terms of both effectiveness and efficiency (e.g., a 4.12%–17.19% lift in F1 scores).
Biography: Dr. Hongxia Yang is a Senior Staff Data Scientist and Director at Alibaba Group. Her interests span the areas of Bayesian statistics, time series analysis, spatial-temporal modeling, survival analysis, machine learning, data mining, and their applications to problems in business analytics and big data. She previously worked as a Principal Data Scientist at Yahoo! Inc. and as a Research Staff Member at the IBM T.J. Watson Research Center, and received her PhD degree in Statistics from Duke University in 2010. She has published close to 50 top conference and journal papers, holds 9 filed US patents, and serves as an associate editor for Applied Stochastic Models in Business and Industry. She was elected as an Elected Member of the International Statistical Institute (ISI) in 2017 and to the Chinese Institute of Electronics Young Scientist Club in 2019.
Keynote Speaker: Prof. Yun Xiong (Fudan University)
Title of Keynote: Applications of Heterogeneous Information Networks
Abstract: Heterogeneous information networks (HINs) have attracted considerable attention in the last decade. Most research works use DBLP and social networks as their research data sets. In this talk, I will share with the audience three applications of HINs. In the first application, using the prediction of fund-raising results in crowd-funding (P2P lending) as an example, I will introduce the collective evolution inference problem in HINs and its solution. Secondly, I will illustrate a complex behavioral data embedding method in stock exchange HINs for market manipulation identification. In the third application, I will show how heterogeneous network embedding enables accurate disease association predictions.
Biography: Yun Xiong received her PhD degree in computer and software theory from Fudan University in 2008. She is a professor of computer science at Fudan University and the deputy director of the Shanghai Key Laboratory of Data Science. Her research interests include data science and big data mining. She has published over 80 research papers in international journals and major peer-reviewed conference proceedings. She has served on many program committees of international conferences, including IEEE ICDM, AAAI, and CIKM. Her co-authored book “Dataology and Data Science”, published in 2009, is recognized as the first monograph in data science.
Keynote Speaker: Prof. Philip S. Yu
Title of Keynote: Broad Learning on Big Data: A Heterogeneous Information Network Approach
Abstract: In the era of big data, there is an abundance of data available across many different data sources in various formats. “Broad Learning” is a new type of learning task, which focuses on fusing multiple large-scale information sources of diverse varieties together and carrying out synergistic data mining tasks across these fused sources in one unified analytic. Great challenges exist in “Broad Learning” for the effective fusion of relevant knowledge across different data sources, which depends not only on the relatedness of these data sources but also on the target application problem. In this talk, we examine how to fuse multiple data sources using heterogeneous information network models to improve mining effectiveness over various applications, including social networks and recommendation.
Biography: Dr. Philip S. Yu is a Distinguished Professor and the Wexler Chair in Information Technology at the Department of Computer Science, University of Illinois at Chicago. Before joining UIC, he was at the IBM Watson Research Center, where he built a world-renowned data mining and database department. He is a Fellow of the ACM and IEEE. Dr. Yu is the recipient of the ACM SIGKDD 2016 Innovation Award for his influential research and scientific contributions on mining, fusion and anonymization of big data; the IEEE Computer Society’s 2013 Technical Achievement Award for “pioneering and fundamentally innovative contributions to the scalable indexing, querying, searching, mining and anonymization of big data”; and the Research Contributions Award from the IEEE Intl. Conference on Data Mining (ICDM) in 2003 for his pioneering contributions to the field of data mining. Dr. Yu has published more than 1,200 refereed conference and journal papers, cited more than 110,000 times, with an H-index of 155. He has applied for more than 300 patents. Dr. Yu was the Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data (2011-2017) and IEEE Transactions on Knowledge and Data Engineering (2001-2004).