Overview
We are seeking a Senior Big Data Engineer with a strong background in building and managing pipelines for structured and unstructured data, who thrives in a fast-paced, AI-focused environment. You will be instrumental in building and scaling our data lake architecture, which powers intelligent AI agents for data collection, labeling, and analytical reasoning. This includes integrating vector databases and optimizing retrieval-augmented generation (RAG) workflows deployed on AWS Bedrock and other AI stacks.
Key responsibilities
- Design and implement scalable ingestion pipelines for structured and unstructured data using AWS and Databricks Unity Catalog.
- Build and maintain high-throughput ETL/ELT pipelines with Apache Airflow and Databricks.
- Architect and manage data modeling, storage, and indexing strategies in PostgreSQL and Amazon RDS, ensuring compatibility with AI retrieval systems.
- Integrate and manage vector databases to support fast semantic and embedding-based search in RAG pipelines.
- Implement robust data validation, lineage, and governance systems using Unity Catalog.
- Optimize performance across distributed compute environments (Databricks, EC2).
- Deploy and maintain Lambda-based microservices for scalable, real-time data ingestion and enrichment.
Required experience
- 5+ years working with big data systems in production environments.
- Proven expertise with Databricks, Unity Catalog, and Apache Spark.
- Proficiency in Airflow, AWS stack (Lambda, EC2, RDS), and cloud-based data lake architectures.
- Strong SQL and database design skills (PostgreSQL preferred).
- Working knowledge of vector databases (Chroma, Pinecone, FAISS).
- Solid understanding of data lifecycle management in ML/AI contexts.
Bonus points
- Experience with AI agent pipelines or large-scale ML model support.
- Demonstrated attention to data observability, security, and lineage tracking.
- Hands-on with RAG architecture, including vector storage and semantic retrieval.
- Exposure to AWS Bedrock and model deployment orchestration.
- Familiarity with LangGraph, LangSmith, LangChain, or similar agent orchestration tools.
To apply
Send your CV, a snappy cover letter highlighting your expertise, skills, and experience, and any relevant links or attachments showcasing your work.
Apply here
Have questions? Write to us