Software Engineer (Data Platform)
Redwood City, CA 94065
This is a ground-floor opportunity in big data security analytics with a substantially funded, fast-growing startup that enables enterprises to defend against sophisticated cyber threats.
You will participate in building a breakthrough AI/machine learning-based platform that can ingest, model and analyze massive flows of machine-generated security data. You will help build the next-generation infrastructure and security platform, which includes an application and service delivery platform, massively scalable distributed data storage and replication systems, and a cutting-edge search and behavior visualization system.
Key Drivers Are:
- You will be responsible for the scope, design and development of critical backend components in the product.
- Lead the design discussions with key stakeholders; drive the discussion to conclusions and actions.
- You will be responsible for developing key platform features that support large-scale data ingestion, feature extraction, model development and scoring systems built on Apache Spark/Hadoop.
- Participate in design discussions and code reviews
- You will write new design documents and/or amend current code to reflect the latest implementations.
- You will be responsible for writing unit tests and performing developer-level testing before handing the code over to QA.
- Work with agile methodology in a fast-paced work environment.
- Provide prompt resolutions to customer-reported issues.
To Succeed You Must Have:
Experience in building large data pipelines is required.
Experience coding in Scala and with Spark performance tuning is required.
· 7 or more years of experience in building enterprise software using Java with excellent design skills
· Experience with big data and distributed system technologies including Hadoop, HBase, Elasticsearch, Spark and Impala.
· Strong hands-on development experience with Spark transformations such as map, filter, groupByKey and reduceByKey.
· Strong knowledge of Spark RDD/DataFrame join performance optimization and shuffle tuning, broadcast variables and accumulators, and Kryo serialization.
· Experience with design, development and delivery of mission critical software products for large enterprises.
· Strong experience in Java, Python, Spark and Scala.
· Experience with SQL and NoSQL databases, specifically MongoDB, Cassandra, RDBMS (Oracle, Postgres) and Redshift.
· Experience with identity and access management solutions, LDAP, OAuth, OpenID, SAML, JWT, MFA technologies.
· Experience with Agile development methodology, test-driven development, Git and CI/CD processes.
· Solid knowledge and understanding of object-oriented programming, data structures, algorithms, software design. Rigor in high code quality, automated testing, and other engineering best practices.
· Deep knowledge of the Amazon Web Services ecosystem.
· Knowledge of Kubernetes/Docker, Spark, Kafka, ZooKeeper and message queue technology.
· Knowledge of monitoring, benchmarking, diagnostic and performance optimization tools.
· Knowledge of distributed systems (design and development).
· Experience working with data at petabyte scale.
· Security background is a big plus.
· Strong verbal and written communication skills.
· Commitment to proper software engineering: test-driven development, documentation, code reviews, etc.