System-level design is increasingly turning to FPGAs to take advantage of their low cost and fast prototyping. In this paper we present a timing-driven partitioning approach for an architecturally constrained multi-FPGA system. The approach uses path-based clustering, following the work of \cite{alpert}, together with retiming \cite{saxe}. The board-level architecture is based on a PCB model consisting of four Xilinx 4013 FPGAs. The proposed algorithm has been tested on large-scale real designs.