- Experience in building scalable near real-time data pipelines using Apache Spark, Kafka, RocksDB and HBase
- Good understanding of core Hadoop components (HDFS, MapReduce, YARN) and related frameworks such as Spark, Pig, Hive, Sqoop, Oozie, ZooKeeper, and HBase
- Experience with the Cloudera distribution of Apache Hadoop (CDH) and Cloudera Manager to install, configure, and monitor Hadoop clusters
- Experience in installing, configuring, and administering Kafka clusters
- Working knowledge of Apache Maven and Git for builds, deployment, and continuous integration
- Experience in ingestion, storage, querying, processing and analysis of big data