This paper explores the integration of load-balancing features into the data-parallel language High Performance Fortran (HPF), targeting semi-regular applications. We show that HPF virtual processors are good candidates for the unit of migration. We then compare three possible implementations and show that threads provide a good trade-off between efficiency and ease of implementation. Finally, we describe a preliminary implementation. The experimental results, obtained with Gaussian elimination with partial pivoting, are promising.