Build distributed, scalable, and reliable frameworks and data pipelines that ingest, egress, and process data at scale, often in real time.
Collaborate with other teams to design, develop and deploy data tools that support both operations and product use cases.
Triage and/or perform offline analysis of large data sets using components from the Hadoop ecosystem or relational technologies.
Evaluate and advise on technical aspects, timelines and estimation of open work requests in the product backlog.
Own product features from development through production deployment and ongoing support.
Evaluate big data technologies and prototype solutions to improve our data processing architecture.
Interface with business, architecture, and technology partners to create holistic information business solutions and capabilities.
Become a Subject Matter Expert on integration processes and data content related to the Customer Domain, including the source, internal, and destination systems.
Ability to balance several efforts in parallel, look for improvement opportunities, and communicate status, challenges, and successes clearly and precisely.
Be able to work in an agile environment, embracing collaboration within and across teams.
BS in Computer Science or related area
At least 5 years of overall software development experience
Minimum 3 years' experience on a Big Data platform such as Cloudera or AWS
Current experience with Scala, Java, Oracle, HBase, Hive, Kafka, Spark, shell scripting, and SQL; Python experience desired
Experience with and passion for data, including data design, data architectures, the pros and cons of relational and non-relational solutions and technologies, and how to build efficient big data pipelines
Understanding of automated QA needs related to Big Data
Understanding of various Visualization platforms
Experience in Cloud providers - AWS preferable
Proficiency with Agile or Lean development practices
Strong object-oriented design and analysis skills
Excellent written and verbal communication skills
Top skill sets/technologies in the ideal candidate:
* Programming languages -- Java (must), Python, Scala, shell scripts
* Database/Search - Oracle, complex SQL queries, stored procedures, performance tuning concepts, SOLR, AWS RDS, AWS Aurora
* Batch processing -- Hadoop MapReduce, Cascading/Scalding, Apache Spark, AWS EMR
* Stream processing -- Spark streaming, Kafka, Apache Storm, Flink
* NoSQL -- HBase, Cassandra, MongoDB, DynamoDB, Hive, Impala
* ETL/ELT frameworks, including monitoring, alerting, and restart
* Code/Build/Deployment -- git, svn, maven, sbt, jenkins, bamboo
* Excellent communication up, down, and outward.
* Strong analytical, problem-solving, and decision-making skills.
* Zeal to learn new technologies and frameworks and to grow technically and professionally.
* Identify project risks and recommend mitigation efforts.
* Identify project issues, communicate them and assist in their resolution.
* Assist in continuous improvement efforts in enhancing project team methodology and performance.
* Cooperative, team-focused attitude.