Overview
We are looking for a Full Stack Developer with strong experience in building large-scale web scraping systems to design, implement, and manage an external data collection infrastructure for a financial platform. The primary responsibility of this role is to build and maintain a robust, scalable scraping architecture capable of handling high concurrency, anti-bot protections, and large volumes of data extraction. The developer will design the complete pipeline, from scraping orchestration and infrastructure to data normalization and monitoring. The role also requires hands-on experience using modern AI development tools to accelerate development, automate workflows, and improve system reliability.
Key responsibilities
- Design and build scalable web scraping pipelines
- Manage high-concurrency scraping workers
- Implement queue orchestration (Redis, RabbitMQ, Kafka)
- Handle rate limits, proxy rotation, sessions, and cookies
- Manage CAPTCHA solving and anti-bot protections
- Build distributed scraping architectures using Puppeteer / Headless Chrome / similar
- Implement job scheduling, retry logic, and fault tolerance
- Build data normalization and storage pipelines
- Monitor scraping performance, success rates, and failures
- Operate and scale Linux infrastructure on AWS
- Use AI coding tools to assist with development, debugging, and system design
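To illustrate the kind of fault tolerance this role involves, here is a minimal sketch of retry logic with exponential backoff and jitter (Python is used purely for illustration, and all names here are hypothetical, not part of our stack):

```python
import random
import time

def fetch_with_retry(fetch, url, max_attempts=4, base_delay=1.0):
    """Call fetch(url), retrying on failure with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_attempts:
                raise  # out of retries: surface the final failure to the caller
            # back off 1s, 2s, 4s, ... plus jitter so workers don't retry in lockstep
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5))
```

In production this pattern typically sits behind a job queue, with per-job attempt counts persisted so failures survive worker restarts.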
Required experience
- Technical Stack: 3-5 years of full-stack proficiency, ideally in Node.js, Python, or Go. Frameworks: Puppeteer, Playwright, Selenium, or Scrapy. Infrastructure: AWS, Docker/Kubernetes, and the Linux CLI.
- Distributed Systems Experience: Proven track record of building systems that handle parallel processing and high-volume data collection.
- Network & Security Knowledge: Deep understanding of HTTP headers, TLS fingerprinting, cookies, and how WAFs (Web Application Firewalls) like Cloudflare or Akamai work.
- AI Fluency: power-user command of AI coding tools (Cursor, Copilot, LLM APIs) to speed up the development of complex extraction logic.
- Problem Solving: the ability to debug "silent failures" when a site changes its layout or strengthens its bot detection.
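As a taste of the proxy and header rotation mentioned above, rotation often reduces to cycling a pool of request identities. A hedged sketch, assuming a static pool (the proxy addresses and user-agent strings below are made up; real systems load these from config or a proxy provider):

```python
import itertools

# Hypothetical pools for illustration only.
PROXIES = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

_proxy_cycle = itertools.cycle(PROXIES)
_ua_cycle = itertools.cycle(USER_AGENTS)

def next_request_profile():
    """Return the proxy and headers to use for the next outbound request."""
    return {
        "proxy": next(_proxy_cycle),
        "headers": {"User-Agent": next(_ua_cycle)},
    }
```

The round-robin here is the simplest possible policy; real deployments usually weight proxies by recent success rate and retire those that start failing.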
Bonus points
- Deep practical experience with AI coding assistants such as Claude Code, Cursor, or Copilot
- Using AI tools to generate, review, and optimize code
- Leveraging LLMs for scraping logic, debugging, and system design
- Building workflows that integrate AI tools into the development process
- Using AI for log analysis, debugging complex scraping failures, and improving system reliability
To apply
Send your CV, a snappy cover letter that highlights your expertise, skills, and experience, and any relevant links or attachments showcasing your work.
Have questions? Write to us.