X. Ning and G. Karypis, “Sparse linear models with side-information
for top-n recommender systems,” in Proceedings of WWW2012, 2012.
[ bib |
This paper focuses on developing effective algorithms that utilize side information for top-N recommender systems. A set of Sparse Linear Methods with Side information (SSLIM) is proposed that utilizes a regularized optimization process to learn a sparse item-to-item coefficient matrix based on historical user-item purchase profiles and side information associated with the items. This coefficient matrix is used within an item-based recommendation framework to generate a size-N ranked list of items for a user. Our experimental results demonstrate that SSLIM outperforms other methods in effectively utilizing side information and achieving performance improvement.
X. Ning and G. Karypis, “Sparse linear models with side-information for top-n recommender systems,” in Proceedings of RecSys2012, 2012. [ bib ]
X. Ning and G. Karypis, “Slim: Sparse linear models for top-n
recommender systems,” in IEEE International Conference on Data Mining,
acceptance rate: 12%.
[ bib |
This paper focuses on developing effective and efficient algorithms for top-N recommender systems. A novel Sparse LInear Method (SLIM) is proposed, which generates top-N recommendations by aggregating from user purchase/rating profiles. A sparse aggregation coefficient matrix W is learned by SLIM by solving an L1-norm and L2-norm regularized optimization problem. W is demonstrated to produce high-quality recommendations, and its sparsity allows SLIM to generate recommendations very fast. A comprehensive set of experiments is conducted by comparing the SLIM method and other state-of-the-art top-N recommendation methods. The experiments show that SLIM achieves significant improvements both in run time performance and recommendation quality over the best existing methods.
Keywords: Top-N Recommender Systems, Sparse Linear Methods, L1-norm Regularization
X. Ning and Y. Qi, “Semi-supervised convolution graph kernels for
relation extraction,” in SIAM International Conference on Data Mining,
acceptance rate: 25%.
[ bib |
Extracting semantic relations between entities is an important step towards automatic text understanding. In this paper, we propose a novel Semi-supervised Convolution Graph Kernel (SCGK) method for semantic Relation Extraction (RE) from natural English text. By encoding sentences as dependency graphs of words, SCGK computes kernels (similarities) between sentences using a convolution strategy, i.e., calculating similarities over all possible short single paths on two dependency graphs. Furthermore, SCGK adds three semi-supervised strategies in the kernel calculation to enable soft-matching between (1) words, (2) grammatical dependencies, and (3) entire sentences, respectively. From a large unannotated corpus, these semi-supervision steps learn to capture contextual semantic patterns of elements inside natural sentences, and therefore alleviate the lack of annotated examples in most RE corpora. Through convolutions and multi-level semi-supervision, SCGK provides a powerful model to encode syntactic and semantic evidence, both of which are important for effectively recovering the relational patterns of interest. We perform extensive experiments on five RE benchmark datasets which aim to identify interaction relationships from biomedical literature. Our results demonstrate that SCGK achieves state-of-the-art performance on the task of semantic relation extraction.
Keywords: Relation Extraction, Graph Kernels, Semisupervised Learning, Natural Language Processing
X. Ning, M. Walters, and G. Karypis, “Improved machine learning models for
predicting selective compounds,” in ACM Conference on Bioinformatics,
Computational Biology and Biomedicine, 2011.
acceptance rate: 19%.
[ bib |
The identification of small potent compounds that selectively bind to the target under consideration with high affinities is a critical step towards successful drug discovery. However, efficient and accurate computational methods to predict compound selectivity properties are still lacking. In this paper, we propose a set of machine learning methods for compound selectivity prediction. In particular, we propose a novel cascaded learning method and a multi-task learning method. The cascaded method decomposes the selectivity prediction into two steps, with one model for each step, so as to effectively filter out non-selective compounds. The multi-task method incorporates both activity and selectivity models into one multi-task model so as to better differentiate compound selectivity properties. We conducted a comprehensive set of experiments and compared the results with other conventional selectivity prediction methods, and our results demonstrated that the cascaded and multi-task methods significantly improve the selectivity prediction performance.
X. Ning and G. Karypis, “Multi-task learning for recommender
systems,” in Journal of Machine Learning Research Workshop and
Conference Proceedings (ACML2010), vol. 13, pp. 269-284, Microtome
acceptance rate: 30%.
[ bib |
This paper focuses on exploring personalized multi-task learning approaches for collaborative filtering, with the goal of improving the prediction performance of rating prediction systems. These methods first identify a set of users that are closely related to the user under consideration (i.e., the active user), and then learn multiple rating prediction models simultaneously, one for the active user and one for each of the related users. Such parallel learning of multiple models (tasks) is implemented by representing all learning instances (users and items) using a coupled user-item representation and by applying multi-task kernel tricks within the error-insensitive Support Vector Regression (e-SVR) framework. A comprehensive set of experiments shows that multi-task learning approaches lead to significant performance improvement over conventional alternatives.
Keywords: Collaborative Filtering, Multi-Task Learning
P. Kuksa, Y. Qi, B. Bai, R. Collobert, J. Weston, V. Pavlovic, and X.
Ning, “Semi-supervised abstraction-augmented string kernel for multi-level
bio-relation extraction,” in Proceedings of the 2010 European
conference on Machine learning and knowledge discovery in databases: Part
II, (Berlin, Heidelberg), pp. 128-144, Springer-Verlag, 2010.
acceptance rate: 17%.
[ bib |
Bio-relation extraction (bRE), an important goal in bio-text mining, involves subtasks identifying relationships between bio-entities in text at multiple levels, e.g., at the article, sentence or relation level. A key limitation of current bRE systems is that they are restricted by the availability of annotated corpora. In this work we introduce a semi-supervised approach that can tackle multi-level bRE via string comparisons with mismatches in the string kernel framework. Our string kernel implements an abstraction step, which groups similar words to generate more abstract entities, which can be learnt with unlabeled data. Specifically, two unsupervised models are proposed to capture contextual (local or global) semantic similarities between words from a large unannotated corpus. This Abstraction-augmented String Kernel (ASK) allows for better generalization of patterns learned from annotated data and provides a unified framework for solving bRE with multiple degrees of detail. ASK shows effective improvements over classic string kernels on four datasets and achieves state-of-the-art bRE performance without the need for complex linguistic features.
Keywords: learning with auxiliary information, relation extraction, semi-supervised string kernel, sequence classification
X. Ning and G. Karypis, “The set classification problem and solution
methods,” in SIAM International Conference on Data Mining,
pp. 847-858, SIAM, 2009.
acceptance rate: 16%.
[ bib |
This paper focuses on developing classification algorithms for problems in which there is a need to predict the class based on multiple observations (examples) of the same phenomenon (class). These problems give rise to a new classification problem, referred to as set classification, that requires the prediction of a set of instances given the prior knowledge that all the instances of the set belong to the same unknown class. This problem falls under the general class of problems whose instances have class label dependencies. Four methods for solving the set classification problem are developed and studied. The first is based on a straightforward extension of the traditional classification paradigm whereas the other three are designed to explicitly take into account the known dependencies among the instances of the unlabeled set during learning or classification. A comprehensive experimental evaluation of the various methods and their underlying parameters shows that some of them lead to significant gains in performance.
J. Chen, Y. Zheng, and X. Ning, “Scalable parallel quadrilateral mesh
generation coupled with mesh partitioning,” International Conference on
Parallel and Distributed Computing Applications and Technologies,
pp. 966-970, 2005.
acceptance rate: 25%.
[ bib |
In this paper, we present our efforts to parallelize an unstructured quadrilateral mesh generator. Its serial version is based on the divide-and-conquer idea, and mainly includes two stages, i.e., geometry decomposition and mesh generation. Both stages are parallelized separately. A highly efficient fine-grain level parallel scheme is presented to parallelize the stage of geometry decomposition. A SubDomain Graph (SDG), which represents the connections of subdomains, is constructed. The task of parallel mesh generation is then reduced to that of the SDG partitioning. Since the number of elements in subdomains can be pre-computed before meshing, a static load balancing scheme to partition the SDG performs well with the aid of Metis tools. Numerical results show that scalable timing performance can be achieved by using the parallel mesh generator with resulting meshes nicely partitioned among processors, which enables a fast parallel simulation environment by eliminating the traditional I/O-busy process of mesh repartitioning.