SLIM is a library that implements Sparse LInear Methods (SLIM) for top-n recommendation. The algorithm is described in the following paper:

Xia Ning and George Karypis, "SLIM: Sparse Linear Methods for Top-N Recommender Systems", Proceedings of the 2011 IEEE 11th International Conference on Data Mining, 497–506.

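This manual is divided into the following sections: Download, Installation, Running SLIM, Running SLIM in parallel, Credits & Contact Information, and Copyright Information.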

Download

SLIM is open-source software and is also provided as a binary distribution with pre-built executables for Linux (64-bit architecture). Additional binaries can be provided upon request. The source code can be downloaded here.

slim-1.0.tar.gz Linux (x86_64)

A PDF version of the manual is available here.

Installation

Once you download SLIM, you need to uncompress and untar it using the following command:

  > tar -xzf slim-1.0.tar.gz

This will create a directory named slim-1.0 with the following structure:

  slim-1.0/
    build/
    examples/
    include/
    src/

Compiling the source code and building the SLIM library requires CMake 2.8 (http://www.cmake.org/) and gcc 4.4, as well as an installed copy of GSL (the GNU Scientific Library). Assuming CMake, gcc and GSL are installed, run the following commands to compile and build:

  > cd slim-1.0
  > cd build
  > cmake ..
  > make
  > make install

To remove all the objects generated by make, run the following command:

  > make clean

After you run the above commands, the library libSLIM.a is generated in the build/lib directory, all the *.h files are placed in the build/include directory, and two executables, slim_learn and slim_predict, are generated in the build/examples directory. You can use slim_learn and slim_predict as stand-alone programs, or you can use the library by linking against it and including the header files.

Running SLIM

The SLIM executables are named slim_learn and slim_predict, and they are located under build/examples. Both programs are invoked from the command line in a shell window (e.g., GNOME Terminal).

Manpage

The manpage for SLIM is the following (it can be obtained by typing slim_learn -help):

Usage
slim_learn [options]

-train_file=string
Specifies the input file which contains the training data. This file should be
in .csr format.

-test_file=string
Specifies the input file which contains the testing data. This file should be
in .csr format.

-model_file=string
Specifies the output file which will contain the model matrix. The output file will be in
.csr format.

-fs_file=string
Specifies the input file which contains a matrix for feature selection purposes. This input
file should be in .csr format. This option takes effect only when the -fs option is specified.

-pred_file=string
Specifies the output file which will contain the top-n predictions for each user. The output
file will be in .csr format. If this option is not specified, no prediction scores will be output.

-lambda=float
Specifies the regularization parameter for the $\ell_1$ norm.

-beta=float
Specifies the regularization parameter for the $\ell_2$ norm.

-starti=int
Specifies the index of the first column (C-style indexing) from
which the sparse coefficient matrix will be calculated. The default
value is 0.

-endi=int
Specifies the index of the last column (exclusively) up to which
the sparse coefficient matrix will be calculated. The default value
is the number of total columns.

-transpose
Specifies that the input feature selection matrix needs to be transposed.

-fs
Specifies that feature selection is required so as to accelerate the learning.

-k=int
Specifies the number of features if feature selection is applied. The default
value is 50.

-dbglvl=int
Specifies the debug level. The default value is 0.

-optTol=float
Specifies the threshold which controls the optimization. Once the error
from two optimization iterations is smaller than this value, the optimization
process will be terminated. The default value is 1e-5.

-max_bcls_niters=int
Specifies the maximum number of iterations that is allowed for optimization.
Once the number of iterations reaches this value, the optimization process
will be terminated. The default value is 1e5.

-bsize=int
Specifies the block size for output. Once the calculations for these bsize
blocks are done, they are dumped into the output file. The default value is 1000.

-nratings=int
Specifies the number of unique rating values in the testing set. The rating values
should be integers starting from 1. The default value is 1.

-topn=int
Specifies the number of recommendations to be produced for each user. The default
value is 10.

-help
Print this message.
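
For orientation, the optimization problem that the -lambda and -beta options regularize can be written as below. This follows the formulation in the paper cited at the top of this manual, where A is the user-item training matrix and W is the sparse aggregation coefficient matrix. That -lambda corresponds to $\lambda$ and -beta to $\beta$ is an assumption based on the option descriptions above, not something this manual states explicitly.

```latex
\begin{align*}
\min_{W}\quad & \frac{1}{2}\,\lVert A - AW \rVert_F^2
  + \frac{\beta}{2}\,\lVert W \rVert_F^2
  + \lambda\,\lVert W \rVert_1 \\
\text{subject to}\quad & W \ge 0,\qquad \operatorname{diag}(W) = 0 .
\end{align*}
% The problem decouples over the columns w_j of W (a_j is the j-th column of A):
\begin{align*}
\min_{w_j}\quad & \frac{1}{2}\,\lVert a_j - A w_j \rVert_2^2
  + \frac{\beta}{2}\,\lVert w_j \rVert_2^2
  + \lambda\,\lVert w_j \rVert_1 \\
\text{subject to}\quad & w_j \ge 0,\qquad w_{jj} = 0 .
\end{align*}
```

Because each column can be solved independently, restricting the computation to a range of columns, as -starti and -endi do, does not change the result for those columns.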

Input Files

slim_learn and slim_predict accept and produce files in a sparse matrix format (with extension .csr), which is specified as follows.

A sparse matrix A with n rows and m columns is stored in a plain text file that contains n lines, one for each row of A. In SLIM’s sparse matrix format only the non-zero entries of the matrix are stored. In particular, the i-th line of the file contains information about the non-zero entries of the i-th row of the matrix. The non-zero entries of each row are specified as a space-separated list of pairs. Each pair contains the column number followed by the value for that particular column. The column numbers are assumed to be integers and their corresponding values are assumed to be binary. Note that the columns are numbered starting from 1 (not from 0 as is often done in C). The following shows an example 7 × 8 matrix and its corresponding representation in SLIM’s matrix format.

matrix:

0 1 0 0 1 0 0 1
1 1 0 1 0 0 0 0
0 0 1 0 0 1 0 1
1 0 0 0 0 0 0 0
0 1 0 1 0 0 1 0
0 0 1 0 1 1 0 0 
0 1 0 1 1 0 1 1

matrix .csr file:
2 1 5 1 8 1
1 1 2 1 4 1
3 1 6 1 8 1
1 1
2 1 4 1 7 1
3 1 5 1 6 1
2 1 4 1 5 1 7 1 8 1

Output Files

slim_learn generates a model file in the .csr format specified above. The matrix it contains is the transpose of the aggregation coefficient matrix.

slim_predict generates a prediction file, if specified by -pred_file, in .csr format. In this file, each row corresponds to a testing user, the column values correspond to the items that have been recommended, and the corresponding values are the recommendation scores. The entries in each row are ordered by score in decreasing order.
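
As an illustration of what one row of the prediction file represents, the sketch below follows the scoring rule from the paper cited at the top of this manual: a user's score for an item is the product of the user's training row with the corresponding column of the aggregation coefficient matrix W, and items already present in the training row are skipped. This is a minimal pure-Python sketch with hypothetical function names and dense lists, not the code used by slim_predict. It takes W itself, not the transposed matrix stored in the model file.

```python
def top_n_row(user_row, W, n):
    """Return the n best (item_number, score) pairs for one user.

    user_row: dense list of the user's training values (0 = no entry).
    W: dense aggregation coefficient matrix as a list of rows (not transposed).
    Item numbers in the result start from 1, as in the .csr format.
    """
    n_items = len(user_row)
    scored = []
    for j in range(n_items):
        if user_row[j] != 0:
            continue  # skip items the user already has in the training data
        score = sum(user_row[k] * W[k][j] for k in range(n_items))
        scored.append((j + 1, score))
    # Highest score first; ties are broken by the smaller item number.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]


def format_csr_row(pairs):
    """Format (item_number, score) pairs as one line of a .csr file."""
    return " ".join("%d %g" % (item, score) for item, score in pairs)


# A made-up 4-item example; the values are illustrative only.
W_DEMO = [
    [0.0, 0.5, 0.25, 0.125],
    [0.25, 0.0, 0.5, 0.0],
    [0.25, 0.125, 0.0, 0.5],
    [0.0, 0.25, 0.75, 0.0],
]
USER_DEMO = [1, 0, 0, 1]  # the user has items 1 and 4 in the training data
```

With this made-up data, top_n_row(USER_DEMO, W_DEMO, 2) ranks item 3 ahead of item 2, and format_csr_row turns the result into a line in the same layout as a row of the prediction file.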

Examples

The following shows how to run slim_learn:

slim_learn -train_file=train.mat -model_file=model.mat -starti=0 -endi=1682 -lambda=2 -beta=5 -optTol=0.00001 -max_bcls_niters=10000

The model is printed into model.mat.

Note: The matrix output to model.mat is the transpose of the sparse aggregation coefficient matrix.

The following shows how to run slim_predict:

slim_predict -train_file=train.mat -test_file=test.mat -model_file=model.mat -pred_file=prediction.txt -topn=10

Note: If model.mat contains an aggregation coefficient matrix or an item-item similarity matrix (i.e., not the transposed one) computed by another method, it needs to be transposed first.
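
One way to perform such a transposition on a file in the .csr format described under Input Files is sketched below in Python. The function name is hypothetical and the sketch is not part of SLIM. The number of columns of the input matrix must be passed in, because the format does not store it, and value strings are copied through unchanged.

```python
def transpose_csr_text(text, n_cols):
    """Transpose a matrix given as the text of a .csr file.

    n_cols is the number of columns of the input matrix; it becomes the
    number of rows (lines) of the output.
    """
    out_rows = [[] for _ in range(n_cols)]
    for i, line in enumerate(text.splitlines(), start=1):
        tokens = line.split()
        if len(tokens) % 2 != 0:
            raise ValueError("line %d does not consist of pairs" % i)
        for k in range(0, len(tokens), 2):
            col = int(tokens[k])
            if not 1 <= col <= n_cols:
                raise ValueError("column number %d out of range" % col)
            # Entry (i, col) of the input becomes entry (col, i) of the output.
            out_rows[col - 1].append("%d %s" % (i, tokens[k + 1]))
    return "\n".join(" ".join(row) for row in out_rows) + "\n"


# A made-up 2 x 3 matrix and its 3 x 2 transpose.
SMALL = "1 0.5 3 2\n2 1.5\n"
SMALL_T = transpose_csr_text(SMALL, n_cols=3)
```

Since input rows are visited in increasing order, the column numbers in every output row are already sorted. Transposing SMALL_T again with n_cols=2 gives back SMALL.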

Running SLIM in parallel

You can run slim_learn to calculate only a chunk (i.e., a certain set of consecutive columns, specified by -starti and -endi) of the aggregation coefficient matrix. In this way, you can run multiple slim_learn programs in parallel (e.g., on a Hadoop cluster) to calculate different chunks of the aggregation coefficient matrix concurrently, and then collect all the outputs and concatenate them in the right order to get the entire aggregation coefficient matrix.
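
The column ranges for such a split can be derived mechanically. The Python sketch below is a hypothetical helper, not part of SLIM. It divides the columns into consecutive ranges that match the -starti (inclusive, counted from 0) and -endi (exclusive) conventions of the manpage and builds one slim_learn command line per range. The per-chunk output file names (model.partNNN.mat) are an assumption made for illustration.

```python
def chunk_ranges(n_cols, n_chunks):
    """Split columns 0..n_cols-1 into consecutive (starti, endi) ranges.

    starti is inclusive and endi is exclusive. Range sizes differ by at
    most one column, and empty ranges are dropped.
    """
    base, extra = divmod(n_cols, n_chunks)
    ranges = []
    start = 0
    for c in range(n_chunks):
        end = start + base + (1 if c < extra else 0)
        if end > start:
            ranges.append((start, end))
        start = end
    return ranges


def learn_commands(train_file, n_cols, n_chunks, extra_opts=()):
    """Build one slim_learn command line per chunk of columns."""
    commands = []
    for c, (start, end) in enumerate(chunk_ranges(n_cols, n_chunks)):
        parts = [
            "slim_learn",
            "-train_file=%s" % train_file,
            "-model_file=model.part%03d.mat" % c,  # hypothetical file name
            "-starti=%d" % start,
            "-endi=%d" % end,
        ]
        parts.extend(extra_opts)
        commands.append(" ".join(parts))
    return commands


# Four chunks over the 1682 columns used in the slim_learn example above.
COMMANDS = learn_commands("train.mat", 1682, 4, ["-lambda=2", "-beta=5"])
```

Each entry of COMMANDS can then be submitted as a separate job. This manual does not state whether a chunk's output file contains only the rows for its own columns or a full-size matrix with empty rows elsewhere, so inspect one output file before concatenating the parts in chunk order.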

Credits & Contact Information

SLIM was written by Xia Ning.

Thanks to Prof. Michael P. Friedlander for providing the BCLS library.

Thanks to Prof. George Karypis for providing the GKlib library.

If you encounter any problems or have any suggestions, please contact Xia Ning via email at xning@cs.umn.edu.

Copyright Information

Copyright and License Notice
----------------------------

The SLIM package is copyrighted by the Regents of the University of Minnesota. 
It can be freely used for educational and research purposes by non-profit 
institutions and US government agencies only. Other organizations are allowed 
to use SLIM only for evaluation purposes, and any further uses will require 
prior approval. The software may not be sold or redistributed without prior 
approval. One may make copies of the software for their use provided that the 
copies, are not sold or distributed, are used under the same terms and 
conditions. 

As unestablished research software, this code is provided on an ``as is'' basis
without warranty of any kind, either expressed or implied. The downloading, or
executing any part of this software constitutes an implicit agreement to these 
terms. These terms and conditions are subject to change at any time without 
prior notice.