Position Summary
We are seeking a highly skilled and experienced Senior Data Engineer with a consulting background to join our team. In this role, you will design, build, and optimise scalable data solutions that drive business insights for our clients. You will work with modern cloud technologies, specifically leveraging Microsoft Azure products, and apply your expertise in SQL, Python, Spark, and Terraform. The ideal candidate has a deep understanding of lakehouse and medallion architectures, data modelling, ETL/ELT pipeline creation and configuration, and securing networks for private connectivity.
This role requires a hands-on engineer who can lead data projects, consult with stakeholders, and deliver high-performance data platforms that meet security, compliance, and business requirements.
The Senior Data Engineer has responsibility for the following:
- Consulting & Stakeholder Collaboration: Work closely with clients and internal teams to gather requirements in technical and pre-sales engagements, define project scope, and deliver data solutions that align with business objectives. You may also be required to lead or support the delivery of technical workshops to customers.
- Data Pipeline Development & Optimisation: Design, implement, and optimise robust ETL/ELT workflows to ingest, transform, and process large volumes of data, utilising tools such as Azure Data Factory, Azure Synapse Analytics, and Databricks.
- Cloud Platform Expertise: Architect, deploy, and manage data solutions on Azure, including Azure Data Lake, Azure Databricks, Azure Synapse, and Azure Log Analytics.
- Data Modelling & Architecture: Implement scalable data models and architectures, with a strong focus on lakehouse and medallion architectures to enable efficient and structured data analysis.
- Efficient Data Storage & Lifecycle Management: Implement and manage data storage solutions that ensure efficient use of resources, incorporating data lifecycle management strategies to optimise storage costs, archive unused data, and comply with data retention policies.
- Job Orchestration: Configure and manage Databricks jobs, ensuring efficient execution and optimisation of data workflows.
- CI/CD & DevOps: Lead efforts to automate data pipeline deployments using Azure DevOps, CI/CD pipelines, and Terraform for infrastructure-as-code.
- Networking: Design and configure secure networking for data solutions, ensuring data protection, compliance, and best practices across the stack.
- Performance Tuning: Monitor and optimise system performance, ensuring high availability and efficient processing times for all data platforms.
- Cross-functional Collaboration: Work closely with other team members, such as data analysts, analytics engineers, and FinOps analysts, to ensure the delivery of high-quality, reliable, and scalable data products, facilitating seamless access to clean and well-modelled data for reporting and advanced analytics.
- Documentation & Best Practices: Ensure all solutions are well-documented, and actively contribute to the definition of data engineering best practices, coding standards, and processes.
Behavioural competencies - Organisational and Behavioural Fit
- You have a positive mindset: you're excited by unfamiliar challenges and learning new things
- You're an enthusiastic self-starter, eager to learn a wide variety of technologies and apply them to real-world customer problems
- You are comfortable and experienced in talking to and working with senior stakeholders
- You keep up to date on new technologies and trends
- You take a methodical and logical approach to problems
- You understand the ethics of gathering and working with data
Critical competencies - Technical Fit
- Extensive experience with Power BI, with an understanding of workspace administration, capacity management, and licensing
- Strong knowledge of SQL and of data warehouse and lakehouse design methodologies (e.g. star schema and medallion architecture), with experience applying these to real-world data modelling scenarios by transforming raw data into databases, tables, and views for analytics
- Background in programming languages (such as Python and/or R) and scripting languages (e.g. PowerShell) for statistical analysis, automation, and working with APIs
- Comfortable accessing, transforming, analysing, and modelling data using big data frameworks such as Spark, via e.g. PySpark or sparklyr
- Capable of researching and developing machine learning models and approaches for deploying models into production for real-world applications
- Understanding of a variety of approaches to data platforms (e.g. Data Lake, Data Mesh, Data Warehouse, streaming, batch processing)
- Knowledge of security, GDPR and PII data handling principles
The following experience is desirable:
- 5+ years of experience in data engineering or a related role, with consulting experience preferred.
- Expertise in SQL, Python, and Spark for data processing and analysis.
- Strong experience with Azure data services, including Azure Databricks, Azure Synapse Analytics, Azure Data Factory, Azure Data Lake Storage, and Azure Log Analytics.
- Deep understanding of data warehousing, data lakes, data mesh, lakehouse architecture (with medallion architecture design principles), and streaming and batch processing.
- Extensive experience with data modelling, ETL/ELT pipeline design, and pipeline optimisation.
- Proficiency with CI/CD pipelines, Azure DevOps, and infrastructure automation tools like Terraform.
- Knowledge of secure networking configurations and best practices in cloud data environments.
- Demonstrated ability to manage and optimise Databricks jobs and large-scale distributed systems.