Requirements:
- 7+ years of experience in Amazon Web Services (AWS) cloud computing.
- 10+ years of experience in big data and distributed computing.
- Strong hands-on experience with PySpark, Apache Spark, and Python.
- Strong hands-on experience with SQL and NoSQL databases (DB2, PostgreSQL, Snowflake, etc.).
- Proficiency in data modeling and ETL workflows.
- Proficiency with workflow schedulers such as Airflow (see the example DAG sketch after this list).
- Hands-on experience with AWS cloud-based data platforms.
- Experience in DevOps, CI/CD pipelines, and containerization (Docker, Kubernetes) is a plus.
- Strong problem-solving skills and ability to lead a team.
- Experience with dbt and Astronomer (managed Airflow on AWS) is a plus.
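
As a rough illustration of the Airflow proficiency called for above, here is a minimal DAG sketch; the dag_id, schedule, and run_etl callable are hypothetical, and the schedule argument assumes Airflow 2.4+.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_etl(**context):
    # Placeholder ETL step; in a real pipeline this might submit a PySpark
    # job (e.g. via an EMR or Glue operator) rather than run inline Python.
    print(f"Running ETL for logical date {context['ds']}")


with DAG(
    dag_id="example_daily_etl",   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",            # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
) as dag:
    extract_transform_load = PythonOperator(
        task_id="extract_transform_load",
        python_callable=run_etl,
    )
```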

Responsibilities:
- Lead the design, development, and deployment of PySpark-based big data solutions.
- Architect and optimize ETL pipelines for structured and unstructured data.
- Collaborate with clients, data engineers, data scientists, and business teams to provide scalable solutions.
- Optimize Spark performance through partitioning, caching, and tuning (see the tuning sketch after this list).
- Implement best practices in data engineering (CI/CD, version control, unit testing; see the testing sketch after this list).
- Work with cloud platforms like AWS.
- Ensure data security, governance, and compliance.
- Mentor junior developers and review code for best practices and efficiency.
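
A rough sketch of the partitioning, caching, and tuning responsibility listed above; the paths, column names, and configuration values are hypothetical and would normally be tuned against real data volumes and cluster size.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical session configuration; shuffle partitions and AQE settings
# are examples only.
spark = (
    SparkSession.builder
    .appName("etl-tuning-sketch")
    .config("spark.sql.shuffle.partitions", "200")
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

# Hypothetical input location and columns.
events = spark.read.parquet("s3://example-bucket/events/")

# Repartition by a join/aggregation key to reduce shuffle skew, then cache
# because the DataFrame is reused by two downstream aggregations.
events = events.repartition("customer_id").cache()

daily_counts = events.groupBy("customer_id", "event_date").count()
totals = events.groupBy("customer_id").agg(F.sum("amount").alias("total_amount"))

# Write partitioned output so downstream readers can prune by date.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/output/daily_counts/"
)
totals.write.mode("overwrite").parquet("s3://example-bucket/output/totals/")
```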
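
And a minimal unit-testing sketch for the best-practices item, assuming pytest and a local SparkSession; the transformation under test and the test data are hypothetical.

```python
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def add_total_with_tax(df, tax_rate=0.1):
    # Hypothetical transformation under test.
    return df.withColumn("total_with_tax", F.col("amount") * (1 + tax_rate))


@pytest.fixture(scope="session")
def spark():
    # Local Spark session for fast, isolated tests.
    return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()


def test_add_total_with_tax(spark):
    df = spark.createDataFrame([(1, 100.0)], ["id", "amount"])
    result = add_total_with_tax(df, tax_rate=0.2).collect()[0]
    assert result["total_with_tax"] == pytest.approx(120.0)
```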