Big Data Architect - Data Infrastructure
1111 - Technical Operations | Redwood City, CA | Full Time
Amobee is a technology company that transforms the way brands and agencies make marketing decisions. The Amobee Marketing Platform enables marketers to plan and activate cross-channel, programmatic media campaigns using real-time market research, proprietary audience data, advanced analytics, and more than 150 integrated partners, including Facebook, Instagram, Pinterest, Snapchat and Twitter. Amobee is a wholly owned subsidiary of Singtel, one of the largest communications technology companies in the world, which reaches over 640 million mobile subscribers. The company operates across North America, Europe, the Middle East, Asia, and Australia. For more information, visit amobee.com or follow @amobee.
The Data Infrastructure team at Amobee is responsible for architecting, building, and managing big data ecosystem components, such as very large Hadoop clusters at a scale of tens of petabytes, Spark, extremely fast-changing RDBMS environments, and a variety of storage environments. This is an agile environment that provides an opportunity to solve complex problems and scaling challenges.
As a Big Data Architect, you will provide hands-on technical leadership for big data components. You will contribute to the architecture of big data systems and to the performance improvement and optimization of Amobee’s Hadoop ecosystem components. You will help improve the systems and keep them running smoothly by bringing in best practices on the development and operational fronts, including process improvements. You’ll get the chance to take on complex and interesting problems as part of a fast-paced, highly collaborative team. Additionally, you will help identify the best-fit technical solutions for different business problems. The demand on these systems is increasing rapidly as more data and new types of data are ingested, and as more functionality and products start using them.
The successful candidate for this position will be self-motivated and able to work independently with a get-things-done attitude. They will have a keen interest in finding answers to complex problems, and will be able to see the big picture while also diving deep into the details.
- Develop technical designs and architectures to process, extract, cleanse, integrate, organize, and present data from a variety of sources and formats for analysis across use cases.
- Design a secure, highly scalable, reliable, and performant big data platform to consume, integrate, and analyze complex data using a variety of best-in-class open-source platforms and tools.
- Perform hands-on performance tuning, optimization, and advanced troubleshooting to ensure an efficient production environment that meets SLAs.
- Perform workload profiling, data profiling, capacity planning, and analysis to recommend and implement solutions.
- Partner with engineering development teams to establish an architectural plan across multiple solution areas, and mentor members of the engineering development and operations teams.
- Develop and maintain operational best practices for smooth operation of large Hadoop clusters.
- 5+ years of experience deploying and administering multi-petabyte-scale Hadoop clusters, preferably on the Cloudera distribution.
- Strong understanding of data and information architecture, including experience with Big Data, relational databases, real-time streaming, and batch data processing.
- Proven hands-on experience with performance tuning and troubleshooting at scale on Hadoop, YARN, MapReduce, Spark, Kafka, HBase, and Pig.
- Expert-level understanding of Hadoop design principles and the factors that affect distributed system performance.
- Good understanding of Hadoop internals and ecosystem products such as HDFS, MapReduce, Spark, HBase, Pig, Oozie, ZooKeeper, and Cloudera Manager, as well as the Linux operating system.
- Well versed in performance tuning of Hadoop clusters and of workloads running on Hadoop, YARN, Spark, and HBase.
- Good scripting experience with at least two of the following: Python, Ruby, Perl or Shell.
- Good knowledge of open source tools like TSDB, Grafana, and Nagios.
- Good understanding of configuration management tools such as Puppet.
- BS/MS degree in computer science or related field.
- Hands-on development experience in Java, Scala, or Python is a big plus.
- Experience in developing real-world applications based on the Hadoop and Spark ecosystems is a big plus.
- Good knowledge of common ETL packages/libraries, data ingestion, and programming languages.
- Good exposure to Machine Learning techniques and practices.
- Working knowledge of open source projects such as Git, TSDB, Docker, and Kubernetes (K8s), and of public cloud platforms.
Location: Redwood City, CA
In addition to our great environment, we offer a competitive base salary, bonus program, stock options, employee development programs, and other comprehensive benefits. Please send a cover letter along with your resume when applying to the position of interest located at Amobee.com. We are an Equal Opportunity Employer. No phone calls and no recruiting agencies, please.