Senior Data Engineer - Python / Data Pipelines / Data Platform / AWS - is required by a fast-growing, highly successful and tech-focused organisation.

About the job

You will play a crucial role in designing, building, and maintaining their data platform, with a strong emphasis on streaming data, cloud infrastructure, and machine learning operations.

Key Responsibilities:

Architect and Implement Data Pipelines:
- Design, develop, and maintain scalable and efficient data pipelines
- Optimize ETL processes to ensure seamless data ingestion, processing, and integration across various systems
Streaming Data Platform Development:
- Lead the development and maintenance of a real-time data streaming platform using tools such as Apache Kafka, Databricks and Kinesis
- Ensure the integration of streaming data with batch processing systems for comprehensive data management
Cloud Infrastructure Management:
- Utilize AWS data engineering services (such as S3, Redshift, Glue, Kinesis and Lambda) to build and manage the data infrastructure
- Continuously optimize the platform for performance, scalability, and cost-effectiveness
Communications:
- Collaborate with cross-functional teams, including data scientists and BI developers, to understand data needs and deliver solutions
- Work with the project management team, which coordinates project requirements, timelines and deliverables, allowing you to concentrate on technical excellence
ML Ops and Advanced Data Engineering:
- Establish ML Ops practices within the data engineering framework, focusing on automation, monitoring, and optimization of machine learning pipelines
Data Quality and Governance:
- Implement and maintain data quality frameworks, ensuring the accuracy, consistency, and reliability of data across the platform
- Drive data governance initiatives, including data cataloguing, lineage tracking, and adherence to security and compliance standards

Requirements

Experience:

3+ years of experience in data engineering, with a proven track record in building and maintaining data platforms, preferably on AWS
Strong proficiency in Python and experience with SQL and PostgreSQL; PySpark, Scala or Java is a plus
Familiarity with Databricks and the Delta Lakehouse concept
Experience mentoring or leading junior engineers is highly desirable

Skills:

Deep understanding of cloud-based data architectures and best practices
Proficiency in designing, implementing, and optimizing ETL/ELT workflows
Strong database and data lake management skills
Familiarity with ML Ops practices and tools, with a desire to expand skills in this area
Excellent problem-solving abilities and a collaborative mindset

Nice to Have:

Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes)
Knowledge of machine learning pipelines and their integration with data platforms

Great training and career development opportunities exist for the right candidate.

Basic salary 60-65,000 + excellent benefits

Office based in Northumberland. Fully remote working available