Senior Software Engineer (Data Platform)
Redwood City, CA 94065 | Direct
This is a ground-floor opportunity in big data security analytics with a substantially funded, fast-growing startup that enables enterprises to defend against sophisticated cyber threats.
You will participate in building a breakthrough AI/machine learning-based platform that can ingest, model, and analyze massive flows of machine-generated security data. You will help build the next-generation infrastructure and security platform, which includes an application and service delivery platform, massively scalable distributed data storage and replication systems, and a cutting-edge search and behavior visualization system.
Key Drivers Are:
- You will be responsible for the scope, design and development of critical backend components in the product.
- Lead the design discussions with key stakeholders; drive the discussion to conclusions and actions.
- You will be responsible for developing key platform features that support large-scale data ingestion, feature extraction, model development, and scoring systems built on Apache Spark/Hadoop.
- Participate in design discussions and code reviews.
- You will write new design documents and/or append to current code to reflect the latest implementations.
- You will be responsible for writing unit tests and carrying out developer-level testing before handing the code over to QA.
- Work with agile methodology in a fast-paced work environment.
- Provide prompt resolution of customer-reported issues.
To Succeed You Must Have:
Experience in building large data pipelines is required.
Experience coding in Scala and with Spark performance tuning is required.
10 or more years of experience building enterprise software in Java, with excellent design skills.
Experience with big data and distributed systems technologies including Hadoop, HBase, Elasticsearch, Spark, and Impala.
Strong hands-on development experience with Spark transformations such as map, filter, groupByKey and reduceByKey.
Strong knowledge of Spark RDD/DataFrame join performance optimization, shuffle tuning, broadcast variables, accumulators, and Kryo serialization.
Experience with design, development and delivery of mission critical software products for large enterprises.
Strong experience in Java, Python, Spark and Scala.
Experience with SQL and NoSQL databases, specifically MongoDB, Cassandra, relational databases (Oracle, Postgres), and Redshift.
Experience with identity and access management solutions and technologies such as LDAP, OAuth, OpenID, SAML, JWT, and MFA.
Experience with Agile development methodology, test-driven development, Git, and CI/CD processes.
Solid knowledge and understanding of object-oriented programming, data structures, algorithms, and software design. Rigor in maintaining high code quality, automated testing, and other engineering best practices.
Deep knowledge of the Amazon Web Services ecosystem.
Knowledge of Kubernetes/Docker, Spark, Kafka, ZooKeeper, and message queue technology.
Knowledge of monitoring, benchmarking, diagnostic and performance optimization tools.
Knowledge of distributed systems (design and development).
Experience working with data at petabyte scale.
Security background is a big plus.
Strong verbal and written communication skills.
Commitment to proper software engineering: test-driven development, documentation, code reviews, etc.