Individual Head-Related Transfer Functions (HRTFs), necessary for the realistic audio rendering of a virtual scene, can be efficiently computed by applying numerical methods to three-dimensional scans of a subject's head and ears. However, accurate geometries are required to obtain perceptually valid HRTFs. The most precise scanning techniques generally require specialized equipment, making them impractical for widespread use. Photogrammetry, on the other hand, shows great potential in terms of speed and affordability, at the cost of generally lower geometrical accuracy. The limited precision of photogrammetric ear scans significantly hinders the validity of the computed HRTFs. This stems from the complexity of the ear shape: its occlusions limit visibility, degrading the photogrammetric reconstruction and producing incomplete and noisy point clouds. This paper discusses the use of deep neural networks for denoising and improving ear point clouds acquired through photogrammetry. Different neural network architectures are trained and tested on precise ear geometries corrupted with artificial noise that mimics real photogrammetric error. The results are compared to classical denoising approaches, and the validity of the HRTFs computed from the denoised scans is benchmarked against reference data.