- Develops distributed applications to solve large-scale processing problems, utilizing languages such as Java, Scala, and shell scripting
- Implements, troubleshoots, and optimizes solutions based on modern big data technologies like Hadoop, Spark, Elasticsearch, Storm, Kafka, etc., in both on-premises and cloud deployment models
- Implements data architecture, including data ingress in batch and real time from a broad variety of external systems; data transformations to prepare data for analytics processing; and data egress to make analytics results available to visualization systems, applications, or external data stores
- Supports documentation, change control, and QA processes consistent with enterprise requirements
- Establishes strong teamwork with client technical resources, and effectively communicates project status, technical issues and their resolution options, and operational requirements to client stakeholders
- Very strong server-side Java experience, especially in open-source, data-intensive, distributed environments
- 5-7 years of overall experience, with a minimum of 2 years in Hadoop
- Expert in the Hadoop ecosystem and programming (Spark, MapReduce, Pig, Hive, Kafka, Storm, etc.), including performance tuning
- Has implemented complex, high-difficulty projects handling considerable data volumes (TB/PB scale)
- Good understanding of algorithms, data structures, and performance optimization techniques
- Experience with agile development methodologies like Scrum
- Self-motivated, with the ability to drive technical discussions; organized, detail-oriented, and able to work both independently and in a team
- Excellent problem-solver, analytical thinker, and quick learner
- Strong verbal and written communication skills
- Broad understanding of all of the following, with depth of expertise and experience in at least three:
  - Hadoop security (Kerberos, Ranger, Knox)
  - Amazon EMR and related technologies (e.g., DynamoDB, Kinesis, S3)
  - Data mining, statistical modeling techniques, and quantitative analyses
  - Data architecture, master data management, and governance
  - Kafka
  - Search capabilities such as Elasticsearch
  - NoSQL databases such as Cassandra and MongoDB
- Certifications a plus: Amazon, Cloudera, Spark
- Master's or Bachelor's degree in Computer Science with a focus on distributed computing
Kogentix is an artificial intelligence and big data software and services firm based outside of Chicago, with offices in Hyderabad, India; Silicon Valley; Singapore; Jakarta, Indonesia; and locations across the U.S. Kogentix delivers practical AI fueled by big data. Our flagship product, the Kogentix Automated Machine Learning Platform, or AMP, enables organizations to rapidly innovate machine learning applications. Kogentix software and services are used by leaders in a range of vertical markets, including financial services, consumer goods, healthcare, telecommunications, and industrial equipment.

Kogentix is a great place to work. Our staff is extraordinarily talented, creating an intellectually invigorating environment. The leadership team is focused on the long-term growth of our employees, both personally and professionally. Our clients are tough, but the challenges they give us keep a sharp edge on our technical acumen. Our culture is open, honest, and empowering. If these attributes interest you, check out our job openings and drop us a line.
Want to learn more about Kogentix? Visit Kogentix's website.