a*****8 Posts: 26
Contact me at [email protected] if interested.
This position is in the same group as mine. Location: Burlington, MA
The data engineer will work within the Data Management Platform (DMP) team
and is responsible for developing and maintaining data flows from
Endurance, Constant Contact, and third-party sources, in both batch and near
real time, to analytic systems and customer touchpoint systems such as
Genesys and Salesforce. The DMP is built on the Hortonworks Data Platform
distribution of the Hadoop file system and uses the Hortonworks DataFlow
distribution, featuring Apache NiFi, to move data from enterprise sources
and to manage data in motion securely and efficiently. The senior engineer
will design the extract layer as well as the target API to move data in near
real time and/or in batches from brand operational systems. We are
looking for a Big Data engineer who will work on collecting, storing,
processing, and analyzing huge datasets. The primary focus will be on
choosing optimal solutions for these purposes, then implementing,
maintaining, and monitoring them. You may also be responsible for
integrating them with the architecture used across the company.
The candidate will also support or operationalize the Data Science team's
work. The candidate must have previous experience with database development
and data analysis. Extensive experience with Structured Query Language (SQL)
or Hive Query Language (Stinger 2.0 or better) is a must. In addition, a
strong understanding of NiFi and Spark is essential. A bachelor's degree
in computer science or a related field is expected, though an equivalent
level of work experience is an appropriate substitute for a degree.
Specifically, the individual should be familiar with the Apache Hadoop
stack, particularly but not limited to Oozie, Pig, Zeppelin, and Ambari.
Knowledge of sed/awk shell scripting as well as Scala or Java is highly
recommended.
The engineer should be highly organized with skills that include excellent
written and verbal communication, problem solving, data analysis and the
ability to work alone or with others as needed. Knowledge of how to work
with multi-database environments is integral to this database development
position, so prior experience is a plus. The DMP team works with outside
technologies and development programs, so the ability to pick up technical
skills quickly and adapt to new technologies as they are introduced is
imperative.
Database Engineer Tasks:
Proficient understanding of distributed computing principles
Proficiency with Hadoop v2, MapReduce, HDFS.
Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
Experience with Spark is highly recommended
Experience with integration of data from multiple data sources
Document operational work and responsibilities
Create UNIX shell scripts
Perform Quality Assurance and Analysis
Develop Java/Python/Scala code
Develop efficient code in terms of performance and resource utilization
Implement ETL processes
Select and integrate any Big Data tools and frameworks required to
provide requested capabilities
Basic Non-functional Skills:
Familiar with Kanban Development
Ability to quickly adapt to the changing Apache landscape
Experienced in leveraging Open Source
Familiar with Source Control Management