The Challenge on Liver Ultrasound Tracking (CLUST) was held in conjunction with the MICCAI 2014 conference to enable direct comparison of tracking methods for this application. This paper reports the outcome of the challenge, including its setup, methods, results and experiences. The database comprised 54 2D and 3D sequences of the liver of healthy volunteers and tumor patients under free breathing. Participants had to provide tracking results for 90% of the data (the test set), either for pre-defined point landmarks (healthy volunteers) or for tumor segmentations (patient data). In this paper we compare the six best methods that participated in the challenge. The organizers evaluated the results quantitatively against manual annotations. Across methods, the mean tracking error ranged from 1.4 mm to 2.1 mm for 2D points and from 2.6 mm to 4.6 mm for 3D points. Fusing all automatic results by taking the median of the tracking results improved the mean error to 1.2 mm (2D) and 2.5 mm (3D). For all methods, performance still falls short of human inter-rater variability, which has a mean tracking error of 0.5–0.6 mm (2D) and 1.2–1.8 mm (3D). Only one participant completed the segmentation task, achieving Dice coefficients ranging from 76.7% to 92.3%. The CLUST database remains available, and the online leaderboard will continue to be updated as an ongoing challenge.
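The evaluation measures named in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the tracking error is taken as the per-frame Euclidean distance between a tracked and an annotated point, the median fusion is taken coordinate-wise across methods, and segmentations are represented as sets of pixel indices. The function names (`tracking_error`, `fuse_median`, `dice`) and the toy coordinates are hypothetical and are not part of the challenge software or data.

```python
import math
from statistics import median


def tracking_error(pred, truth):
    """Per-frame Euclidean distance (mm) between tracked and annotated points."""
    return [math.dist(p, t) for p, t in zip(pred, truth)]


def fuse_median(tracks):
    """Coordinate-wise median over methods (an assumed fusion rule).

    tracks[m][f] is the point (x, y) or (x, y, z) of method m in frame f.
    """
    n_frames = len(tracks[0])
    return [
        tuple(median(coords) for coords in zip(*(track[f] for track in tracks)))
        for f in range(n_frames)
    ]


def dice(seg_a, seg_b):
    """Dice coefficient of two segmentations given as sets of pixel indices."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # two empty segmentations agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


# Toy data (made up for illustration): one 2D landmark over three frames.
truth = [(10.0, 20.0), (11.0, 21.0), (12.0, 22.0)]
tracks = [
    [(x + 1.0, y) for x, y in truth],  # method 1: 1 mm offset in x
    [(x - 1.0, y) for x, y in truth],  # method 2: 1 mm offset in -x
    [(x, y + 5.0) for x, y in truth],  # method 3: 5 mm outlier in y
]

fused = fuse_median(tracks)
errors = tracking_error(fused, truth)
mean_error = sum(errors) / len(errors)
```

In this toy case the median discards the outlying method in each coordinate, which is one plausible reason why a fused result can have a lower mean error than any individual method.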