Optimizing Apache Spark for Data Analytics at Audi
New functions and features in the automotive domain lead to a rapid increase in generated data. Especially during development, it is important to be able to evaluate the measurements of sensors and control units effectively. To replace outdated data administration, Audi is working on a project called Share42morrow. This project is intended to provide a central platform for sharing, administering, and analyzing data across many different departments. To put the data to use, the project relies on the Apache Spark framework. Apache Spark offers a simple development model for parallel applications. However, it also carries the risk of inefficient use with regard to configuration, data storage, and application development. Therefore, this work presents a comprehensive overview of possible optimizations for Apache Spark applications. A special focus is on optimization under the given specifications and restrictions of the cluster. In addition, three implementations for improving cluster utilization are presented. These or similar steps are highly recommended for improving the performance of the cluster as well as conserving resources.
Location: Room 04.137, Martensstr. 3, Erlangen