Existing partitioning algorithms provide limited support for load balancing simulations that are performed on heterogeneous parallel computing platforms. On such architectures, effective load balancing can only be achieved if the graph is distributed so that it properly takes into account the available resources (CPU speed, network bandwidth). We developed a graph partitioning algorithm that can address the partitioning requirements of the scientific computations, and can correctly model the architectural characteristics of emerging hardware platforms.