Defining tumour volume for treatment response assessment and radiotherapy planning is challenging and prone to inter- and intra-observer variability. Various automated tumour delineation methods have been proposed in the literature, each with its own strengths and limitations. There is therefore a need to provide clinicians with practical information on delineation method selection. Six automated positron emission tomography (PET) delineation methods were evaluated and compared using National Electrical Manufacturers Association image quality (NEMA IQ) phantom data and three in-house synthetic phantoms with clinically relevant lesion shapes, including spheres with a necrotic core and irregular shapes. The impact of different contrast ratios, emission counts, realisations and reconstruction algorithms on delineation performance was also studied, using similarity index (SI) and percentage volume error (%VE) as performance measures. With the NEMA IQ phantom, contrast thresholding (CT) performed best on average across all sphere sizes and parameter settings (SI = 0.83; %VE = 5.65% ± 24.34%). Adaptive thresholding at 40% (AT40) was the next best method and required no prior parameter tuning (SI = 0.78; %VE = 23.22% ± 70.83%). When standardised uptake value (SUV) harmonisation filtering (EQ.PET) was applied prior to delineation, AT40 remained the best method without prior parameter tuning (SI = 0.81; %VE = 11.39% ± 85.28%). For the necrotic core spheres and irregular shapes of the synthetic phantoms, CT remained the best performing method (SI = 0.83; %VE = 26.31% ± 38.26% and SI = 0.62; %VE = 24.52% ± 46.89%, respectively). The second best method was fuzzy locally adaptive Bayesian (FLAB) (SI = 0.83; %VE = 29.51% ± 81.79%) for necrotic core spheres and AT40 (SI = 0.58; %VE = 25.11% ± 32.41%) for irregular shapes. 
When EQ.PET was applied prior to delineation, AT40 was the best performing method without prior parameter tuning for both the necrotic core (SI = 0.83; %VE = 27.98% ± 59.58%) and complex shape phantoms (SI = 0.61; %VE = 14.83% ± 49.39%). CT and AT40/AT50 are recommended for all lesion sizes and contrasts. Overall, considering background uptake information improves PET delineation accuracy. Applying EQ.PET prior to delineation improves accuracy and reduces the coefficient of variation (CV) across different reconstructions and acquisitions.
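
To make the two performance measures concrete, the following is a minimal sketch of how SI and %VE could be computed from a delineated mask and a ground-truth mask. It rests on assumptions not stated in the abstract: SI is taken to be the Dice overlap, 2|A ∩ B| / (|A| + |B|), and %VE is taken to be the signed relative difference between delineated and true volume. The function names, the `voxel_volume_ml` parameter and the toy masks are hypothetical and serve only as illustration.

```python
import numpy as np


def similarity_index(seg, ref):
    """Overlap between two binary masks (assumed here to be the Dice form)."""
    seg = np.asarray(seg, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = seg.sum() + ref.sum()
    if denom == 0:
        # Both masks empty: treat as perfect agreement (a convention, not from the source).
        return 1.0
    return 2.0 * np.logical_and(seg, ref).sum() / denom


def percent_volume_error(seg, ref, voxel_volume_ml=1.0):
    """Signed volume error in percent, relative to the reference volume (assumed form)."""
    v_seg = np.asarray(seg, dtype=bool).sum() * voxel_volume_ml
    v_ref = np.asarray(ref, dtype=bool).sum() * voxel_volume_ml
    if v_ref == 0:
        raise ValueError("reference volume is zero; %VE is undefined")
    return 100.0 * (v_seg - v_ref) / v_ref


# Toy ground truth: a 4 x 4 x 4 cube (64 voxels) in a 10 x 10 x 10 grid.
ref = np.zeros((10, 10, 10), dtype=bool)
ref[3:7, 3:7, 3:7] = True

# Delineation with the correct volume but shifted by one voxel along the first axis.
shifted = np.zeros_like(ref)
shifted[4:8, 3:7, 3:7] = True

# Delineation that contains the truth but is one voxel layer too large (80 voxels).
grown = np.zeros_like(ref)
grown[3:8, 3:7, 3:7] = True

print(similarity_index(shifted, ref), percent_volume_error(shifted, ref))  # 0.75 0.0
print(similarity_index(grown, ref), percent_volume_error(grown, ref))
```

The shifted mask shows why the two measures are reported together: its volume error is zero although a quarter of its voxels lie outside the true lesion, which only the overlap measure reveals.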