**Position: Senior Data Engineer – Vendor**
**Experience: 6–9 Years**
**Role Summary**
We are seeking Senior Data Engineer resources to work on the migration of applications from our legacy Cloudera environment to the new Kubernetes-based data platform. The role requires strong hands-on development skills in data engineering, with the ability to deliver high-quality data pipelines under guidance from internal leads.
**Key Responsibilities**
- Develop and optimize data pipelines using Spark 3.5 and Python/Scala.
- Migrate existing Hive, Spark, and Control-M jobs to Airflow- and DBT-based workflows.
- Integrate data pipelines with messaging systems (Kafka, Solace) and object stores (S3, MinIO).
- Troubleshoot and tune distributed jobs running in Kubernetes environments.
- Collaborate closely with internal leads and architects to implement best practices.
- Design and implement a migration/acceleration framework to automate the end-to-end migration process.
- Continuously enhance the framework to ensure stability, scalability, and support for diverse use cases and scenarios.
- Work with various data applications to enable and support the migration process.
- Deliver assigned migration tasks within agreed timelines.
**Required Skills**
- 6–9 years of hands-on data engineering experience.
- Strong expertise in Apache Spark (batch and streaming) and Hive.
- Proficiency in Python, Scala, or Java.
- Knowledge of orchestration tools (Airflow / Control-M) and SQL transformation frameworks (DBT preferred).
- Experience working with Kafka, Solace, and object stores (S3, MinIO).
- Exposure to Docker/Kubernetes for deployment.
- Hands-on experience with data lakehouse formats (Iceberg, Delta Lake, Hudi).