TGS provides scientific data and intelligence to companies active in the energy sector. In addition to a global, extensive and diverse energy data library, TGS offers specialized services such as advanced processing and analytics alongside cloud-based data applications and solutions.
About the job
As a data engineer and scientist, you get the best of both worlds. You are passionate not only about data engineering, but also about analyzing the data you engineer.
In this role, you will be a key member of the Analytics and Data Engineering Science (ADES) group. You will design and build data marts that empower and accelerate our analytics offering. You bring expertise in architecting data lakes, designing and implementing the necessary data pipelines, and establishing governance and quality processes that guarantee data availability, usability, and accuracy, while also understanding the requirements and science of data analytics.
Responsibilities
- Conceptualize and own the build-out of problem-solving data marts for consumption by data science and analytics teams, evaluating design and operational cost-benefit tradeoffs within systems
- Design, develop, and maintain robust data pipelines and ETL processes using data platforms for Prediktor’s data warehouse
- Create or contribute to frameworks that improve the efficacy of logging data, while working to improve the data engineering infrastructure
- Validate data integrity throughout the collection process, performing data profiling to identify and understand data anomalies
- Influence product and cross-functional (engineering, data science, marketing, strategy) teams to identify data opportunities to drive impact
Qualifications
- Bachelor's degree in Computer Science, Mathematics, or a related field, or equivalent practical experience
- 3 years of experience coding with SQL or one or more programming languages (e.g., Python, R, C#) for data manipulation, analysis, and automation
- Experience with data transformation, modeling, and warehousing concepts and techniques, and database principles (e.g., SQL and NoSQL)
- Experience with cloud-based services relevant to data engineering: data storage, data processing, data warehousing, real-time streaming, and serverless computing
- Experience with data processing software (e.g., Python, Hadoop, Spark, Pig, Hive, Kafka, Airflow) and related frameworks and libraries (e.g., MapReduce, Flume, R, TensorFlow, PyTorch, Scikit-learn)
- Experience with data visualization tools (e.g., Power BI, Looker, Bold BI, Grafana)