Overview

We are looking for a Full Stack Developer with strong experience in building large-scale web scraping systems to design, implement, and manage an external data collection infrastructure for a financial platform. The primary responsibility of this role is to build and maintain a robust, scalable scraping architecture capable of handling high concurrency, anti-bot protections, and large volumes of data extraction. The developer will design the complete pipeline, from scraping orchestration and infrastructure to data normalization and monitoring. The role also requires hands-on experience using modern AI development tools to accelerate development, automate workflows, and improve system reliability.

Key responsibilities

  • Design and build scalable web scraping pipelines
  • Manage high-concurrency scraping workers
  • Implement queue orchestration (Redis, RabbitMQ, Kafka)
  • Handle rate limits, proxy rotation, sessions, and cookies
  • Manage CAPTCHA solving and anti-bot protections
  • Build distributed scraping architectures using Puppeteer / Headless Chrome / similar
  • Implement job scheduling, retry logic, and fault tolerance
  • Build data normalization and storage pipelines
  • Monitor scraping performance, success rates, and failures
  • Operate and scale Linux infrastructure on AWS
  • Use AI coding tools to assist with development, debugging, and system design

Required experience

  • Technical Stack: Languages: 3-5 years of full-stack proficiency (likely Node.js, Python, or Go, given the Puppeteer and infrastructure focus). Frameworks: Puppeteer, Playwright, Selenium, or Scrapy. Infrastructure: AWS, Docker/Kubernetes, and the Linux CLI.
  • Distributed Systems Experience: Proven track record of building systems that handle parallel processing and high-volume data collection.
  • Network & Security Knowledge: Deep understanding of HTTP headers, TLS fingerprinting, cookies, and how web application firewalls (WAFs) such as Cloudflare or Akamai work.
  • AI Fluency: Candidates must be "power users" of AI coding tools (Cursor, Copilot, LLM APIs) and use them to speed up the development of complex extraction logic.
  • Problem Solving: Ability to debug "silent failures" that occur when a site changes its layout or tightens its bot detection.

Bonus points

  • Deep practical experience with AI coding assistants such as Claude Code, Cursor, Copilot, etc.
  • Ability to use AI tools to generate, review, and optimize code
  • Leveraging LLMs to assist with scraping logic, debugging, and system design
  • Building workflows where AI tools are integrated into the development process
  • Using AI for log analysis, debugging complex scraping failures, and improving system reliability

To apply

Send your CV, a snappy cover letter that highlights your expertise, skills, and experience, and any relevant links or attachments to your work.
