In this paper, we present a networked virtual tennis game developed as a hybrid framework that combines head-mounted display and fishtank virtual reality systems. We report the findings of a hybrid collaboration task that compared the two systems based on their egocentric and exocentric features. The study focused on the strengths and weaknesses of each system for the given task: How do users perform in each system? And how might each system complement the other in teamwork? We report localization error and correct hit percentage results obtained during trials with the two types of systems. These results suggest that head-mounted displays, with their egocentric features, allow more accurate spatial localization, whereas fishtank displays, with their exocentric features, provide better cues for time synchronization events.