Middle Big Data Engineer (AdTech)
- Remote (Ukraine only)
- Big Data
On behalf of Beeswax, Sigma Software is looking for a Middle Big Data Engineer to join the Userdata team. The team's mission is to build and operate a large-scale data platform that can ingest terabytes of data and serve millions of queries with single-millisecond tail latency.
Our customer is Beeswax, a rapidly growing US AdTech company. Founded by three ex-Googlers, it has a highly technical team and an excellent engineering culture.
Beeswax (https://www.beeswax.com/about/) provides extremely high-scale Bidder-as-a-Service solutions in advertising technology, works with global businesses, and has to date raised $28M (including its most recent $15M Series B).
Sigma Software works with Beeswax to deliver numerous key components of the platform and is looking for engineers to complement the Beeswax engineering team and drive further development of the platform.
The project is about building the next generation of real-time bidding software that enables sophisticated marketers to break free from the limitations and constraints of opaque, one-size-fits-all programmatic buying platforms.
The streaming applications are written in Java, use Kinesis, and handle various scaling and configuration challenges. Batch ETL is managed by Airflow DAGs. An excellent working knowledge of SQL is critical, as a number of the ETL steps run in Snowflake, where a poorly written query can have a significant performance and cost impact.
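To illustrate the Snowflake cost point, the sketch below contrasts a full-table scan with a query that projects only the needed columns and prunes by a date key. The table and column names (`events`, `event_date`, `user_id`, `campaign_id`) are invented for illustration, not taken from the actual platform:

```python
# Hypothetical illustration: in a columnar warehouse like Snowflake,
# projecting only the needed columns and filtering on a clustering key
# lets the engine prune micro-partitions, while SELECT * over a
# terabyte-scale table scans (and bills for) far more data.

def wasteful_query() -> str:
    # Reads every column of every partition.
    return "SELECT * FROM events"

def efficient_query(day: str) -> str:
    # Projects two columns and prunes partitions by the date key.
    # (Assumes events is clustered by event_date; names are invented.)
    return (
        "SELECT user_id, campaign_id "
        "FROM events "
        f"WHERE event_date = '{day}'"
    )

if __name__ == "__main__":
    print(efficient_query("2024-01-01"))
```

Both statements return the rows for one day, but only the second tells the warehouse which partitions and columns it can skip.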
You will join the Big Data team on the customer's side, cooperating closely to extend the platform with new features and improve its performance. The team currently works on several tools and applications, and by joining you would work with some or all of them:
- Manage and build on a high-scale event parsing and recording system. We use Kinesis to handle billions of events and ship them to S3, Snowflake, databases, and a variety of other logs, both internal and external
- Manage and build on a set of ETL pipelines that move terabytes of data through Snowflake
- Operate multiple services that provide real-time data flows to our internal systems (both the UI and the optimization engines)
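The event-recording work above amounts to parsing high-volume records and fanning them out to several sinks. A minimal, hypothetical sketch of that pattern, with in-memory sinks standing in for S3, Snowflake, and the internal logs (all names and the record format are invented):

```python
import json

class MemorySink:
    """Hypothetical stand-in for a real destination such as S3 or Snowflake."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def write(self, record):
        self.records.append(record)

def route_events(raw_records, sinks):
    """Parse JSON events and fan each valid one out to every sink.

    Returns the number of events shipped. Malformed records are dropped
    here; a production pipeline would send them to a dead-letter store.
    """
    shipped = 0
    for raw in raw_records:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed input instead of failing the batch
        for sink in sinks:
            sink.write(event)
        shipped += 1
    return shipped

if __name__ == "__main__":
    s3, warehouse = MemorySink("s3"), MemorySink("warehouse")
    count = route_events(['{"id": 1}', "not json", '{"id": 2}'], [s3, warehouse])
    print(count)  # 2 valid events, each delivered to both sinks
```

The real system applies the same shape at a very different scale: Kinesis consumers in Java doing the parsing, with batching, retries, and per-sink buffering around the fan-out.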