Job Description
This is a remote position.
We are seeking a skilled and motivated AWS Platform Engineer to join our team and contribute to the development and maintenance of scalable data solutions. The ideal candidate will have hands-on experience with EMR and Spark (PySpark), a strong understanding of data mesh principles, and a passion for building robust data pipelines and quality frameworks.
Location: Remote
Status: Contract
Responsibilities:
- Design, develop, and optimize data pipelines using EMR and Spark (PySpark)
- Implement and manage AWS Lake Formation for secure and governed data access
- Contribute to the development of data mesh solutions, enabling domain-oriented data ownership and interoperability
- Write and maintain data quality checks to ensure accuracy, completeness, and reliability of data
- Establish a standardized framework to promote consistency and ensure alignment with architectural standards
- Support light DevOps tasks, with a strong preference for experience using Terraform for infrastructure as code
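To give a flavor of the data quality checks mentioned above, here is a minimal sketch in plain Python (field names and schema are hypothetical; in practice these checks would typically run as PySpark jobs on EMR):

```python
# Hypothetical sketch of completeness/validity checks of the kind a data
# pipeline might run. Plain Python is used for brevity; a real pipeline
# would express these as PySpark transformations on EMR.

def check_quality(rows, required_fields):
    """Return counts of failed checks for a batch of records."""
    failures = {"missing_field": 0, "null_value": 0}
    for row in rows:
        for field in required_fields:
            if field not in row:
                failures["missing_field"] += 1   # completeness: field absent
            elif row[field] is None:
                failures["null_value"] += 1      # validity: field is null
    return failures

batch = [
    {"order_id": 1, "amount": 10.5},
    {"order_id": 2, "amount": None},   # null value -> flagged
    {"order_id": 3},                   # missing field -> flagged
]
print(check_quality(batch, ["order_id", "amount"]))
# {'missing_field': 1, 'null_value': 1}
```

In a production setting these counts would feed a standardized quality framework (thresholds, alerting, and reporting) rather than being printed.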
Requirements:
- Proven experience with Apache Spark and AWS EMR
- Familiarity with Lake Formation and AWS data lake architecture
- Exposure to data mesh concepts and implementation
- Experience writing data validation and quality checks
- Experience defining a best-practices framework to ensure consistency and compliance with architectural standards
- Experience with Terraform or similar tools for infrastructure automation