Abstract: In 18F-FDG PET, tumors are often characterized by their metabolically active volume and standardized uptake value (SUV). However, many approaches have been proposed to estimate tumor volume and SUV from 18F-FDG PET images, and none is widely agreed upon. We assessed the accuracy and robustness of 5 methods for tumor volume estimation and 10 methods for SUV estimation in a large variety of configurations.

METHODS: PET acquisitions of an anthropomorphic phantom containing 17 spheres (volumes between 0.43 and 97 mL, sphere-to-surrounding activity concentration ratios between 2 and 68) were used. Forty-one nonspherical tumors (volumes between 0.6 and 92 mL; SUVs of 2, 4, and 8) were also simulated and inserted into a real patient 18F-FDG PET scan. Four threshold-based methods (including one, Tbgd, that accounts for background activity) and a model-based method (Fit) described in the literature were used for tumor volume measurements. The mean SUV in each resulting volume was calculated, without and with partial-volume effect (PVE) correction, as was the maximum SUV (SUVmax). The parameters involved in the tumor segmentation and SUV estimation methods were optimized using 3 approaches, corresponding either to getting the best of each method or to testing each method in more realistic situations in which the parameters cannot be perfectly optimized.

RESULTS: In the phantom and simulated data, the Tbgd and Fit methods yielded the most accurate volume estimates, with mean errors of 2% ± 11% and -8% ± 21%, respectively, in the most realistic situations. For the simulated data, all SUVs not corrected for PVE had a mean bias between -31% and -46%, much larger than the bias observed with SUVmax (-11% ± 23%) or with the PVE-corrected SUVs based on Tbgd and Fit (-2% ± 10% and 3% ± 24%, respectively).

CONCLUSION: The method used to estimate tumor volume and SUV greatly affects the reliability of the estimates. The Tbgd and Fit methods yielded low errors in volume estimates in a broad range of situations. The PVE-corrected SUVs based on Tbgd and Fit were more accurate and reproducible than SUVmax.
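
To make the quantities discussed above concrete, here is a minimal Python sketch of threshold-based segmentation with a background term, followed by the volume, mean SUV, maximum SUV, and percent-error calculations. The abstract does not give the formula behind the background-adapted threshold, so the rule used here (threshold = background + fraction × (peak − background)) is an assumption borrowed from a common form of such methods, not necessarily the one evaluated in the study. The function names, the `fraction` value of 0.4, the voxel size, and the synthetic image are all hypothetical. No scanner blurring is simulated, so the example shows the bookkeeping only and does not reproduce the partial-volume bias reported in the results.

```python
import numpy as np


def background_adapted_threshold(suv, background, fraction=0.4):
    """Assumed rule (not taken from the abstract):
    threshold = background + fraction * (peak - background)."""
    peak = float(suv.max())
    return background + fraction * (peak - background)


def segment(suv, threshold):
    """Boolean mask of voxels at or above the threshold."""
    return suv >= threshold


def volume_ml(mask, voxel_volume_ml):
    """Segmented volume in mL: voxel count times voxel volume."""
    return float(mask.sum()) * voxel_volume_ml


def percent_error(estimate, truth):
    """Signed relative error in percent; negative means underestimation."""
    return 100.0 * (estimate - truth) / truth


# Synthetic, unblurred image: uniform background (SUV 1) with a cubic
# "lesion" of 10 x 10 x 10 voxels at SUV 8. All values are hypothetical.
suv = np.full((20, 20, 20), 1.0)
suv[5:15, 5:15, 5:15] = 8.0
voxel_volume_ml = 0.064  # 4 mm isotropic voxels, assumed

threshold = background_adapted_threshold(suv, background=1.0, fraction=0.4)
mask = segment(suv, threshold)

estimated_volume = volume_ml(mask, voxel_volume_ml)
true_volume = 1000 * voxel_volume_ml
suv_mean = float(suv[mask].mean())  # mean SUV inside the segmented volume
suv_max = float(suv.max())          # SUVmax: hottest single voxel

volume_error = percent_error(estimated_volume, true_volume)
suv_mean_bias = percent_error(suv_mean, 8.0)
```

On real or simulated PET data, the finite spatial resolution spreads lesion activity into the background, which is why the uncorrected mean SUV tends to be biased low and why the choice of threshold rule affects the estimated volume.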