The purpose of this work is to build a set of tools for solving large sparse linear systems on distributed-memory computers. These tools constitute P-SPARSLIB, a portable library of sparse iterative solvers intended for use in a distributed-memory environment. The library consists of four major parts: accelerators, preprocessing tools, preconditioning routines, and message-passing tools. The accelerators together with the preconditioners constitute the functional layer of the library. The accelerators are based on Krylov subspace methods. These methods, whether for standard or distributed matrices, often work poorly without preconditioning. The preconditioners provided with the library encompass a number of `standard' options for preconditioning distributed sparse matrices, such as overlapping block Jacobi (overlapping additive Schwarz), multicolor block SOR (overlapping multicolor multiplicative Schwarz), distributed ILU(0), and approximate inverse preconditioners. In order to make the library as flexible as possible, the functional modules were designed to be independent of data structures and of the specifics of message passing. We have employed a `reverse communication mechanism' to bypass the need for particular data structures in the functional routines. Whenever a matrix-by-vector product or a preconditioning operation is needed, a functional routine returns to the calling program to let it perform the desired operation. The calling program then calls the functional routine again to continue the iterative process. The bulk of the code was intended to be machine-independent. We have isolated the functions that are most critical for obtaining good performance so that they can be adjusted to a particular computer architecture and message-passing paradigm. We have not embraced any of the proposed message-passing standards such as PVM or MPI because of their current inability to deliver satisfactory performance on many architectures.
Instead, we have isolated the communication routines so that they can be ported with minor effort and can utilize native libraries and other vendor-supplied software and hardware solutions. Whenever possible, the communication routines exploit redundancy of communications and asynchronous message-passing capabilities to reduce latency and allow an overlap between computations and communications. We have assembled the communication routines in a toolkit which, together with the basic sparse linear algebra routines (BLAS-1), serves as the ground level of the library. Currently, the implemented message-passing toolkit mainly uses MPI. The package has been tested on the CM5, CRAY T3D, CRAY T3E, Convex Exemplar, IBM SP2, and IBM and SGI workstation clusters.
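
As a rough illustration of the reverse communication mechanism described above, the following Python sketch shows a preconditioned conjugate gradient accelerator that never touches the matrix or the preconditioner. Each call to `step` either finishes or returns a request code asking the caller to perform a matrix-by-vector product or a preconditioning operation. The class name `ReverseCommCG`, the request codes, the `vin`/`vout` fields, and the dense-matrix, Jacobi-preconditioned driver `solve` are all assumptions made for this illustration; they are not P-SPARSLIB's actual interface, and the sketch is serial rather than distributed.

```python
"""Reverse-communication preconditioned CG: an illustrative sketch only."""

MATVEC, PRECON, DONE = 1, 2, 0  # hypothetical request codes


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class ReverseCommCG:
    """CG for a symmetric positive definite system; the caller owns A and M."""

    def __init__(self, b, tol=1e-10, maxit=200):
        self.x = [0.0] * len(b)            # initial guess x0 = 0
        self.r = list(b)                   # residual b - A*x0 = b
        self.tol2 = tol * tol * dot(b, b)  # squared relative tolerance
        self.maxit = maxit
        self.its = 0
        self.stage = "start"
        self.vin = None    # vector the caller must operate on
        self.vout = None   # caller stores the result here before re-calling

    def _request(self, code, vec, next_stage):
        self.vin, self.vout, self.stage = vec, None, next_stage
        return code

    def step(self):
        """Advance until an external operation is needed, then return."""
        if self.stage == "start":
            if dot(self.r, self.r) <= self.tol2:
                self.stage = "done"
                return DONE
            return self._request(PRECON, self.r, "first_z")
        if self.stage == "first_z":        # vout holds z = M^{-1} r
            self.p = list(self.vout)
            self.rz = dot(self.r, self.vout)
            return self._request(MATVEC, self.p, "have_q")
        if self.stage == "have_q":         # vout holds q = A p
            q = self.vout
            alpha = self.rz / dot(self.p, q)
            self.x = [xi + alpha * pi for xi, pi in zip(self.x, self.p)]
            self.r = [ri - alpha * qi for ri, qi in zip(self.r, q)]
            self.its += 1
            if dot(self.r, self.r) <= self.tol2 or self.its >= self.maxit:
                self.stage = "done"
                return DONE
            return self._request(PRECON, self.r, "have_z")
        if self.stage == "have_z":         # vout holds z = M^{-1} r
            z = self.vout
            rz_new = dot(self.r, z)
            beta = rz_new / self.rz
            self.p = [zi + beta * pi for zi, pi in zip(z, self.p)]
            self.rz = rz_new
            return self._request(MATVEC, self.p, "have_q")
        return DONE


def solve(A, b, diag):
    """Caller-side loop: A is a dense list of rows, diag its diagonal."""
    solver = ReverseCommCG(b)
    while True:
        request = solver.step()
        if request == MATVEC:
            solver.vout = [dot(row, solver.vin) for row in A]
        elif request == PRECON:            # Jacobi preconditioning
            solver.vout = [v / d for v, d in zip(solver.vin, diag)]
        else:
            return solver.x, solver.its


if __name__ == "__main__":
    x, its = solve([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [4.0, 3.0])
    print(x, its)
```

Because the accelerator only exchanges vectors and request codes with the caller, the storage format of the matrix and the way a distributed product is communicated stay entirely on the caller's side. In this sketch, swapping the dense product in `solve` for a sparse or distributed one requires no change to `ReverseCommCG`.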