PySpark Engineers - Full-Time



Other, Work From Home (100)

Job Description

We are looking for an Associate Consultant to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate has exposure to data pipeline building and data wrangling and enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, data analysts and data scientists on data initiatives and will ensure that an optimal data delivery architecture is applied consistently across ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.

Job Responsibilities:

  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional and non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and big data technologies (a minimal PySpark sketch of this kind of work follows this description).
  • Build analytics tools that use the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Create data tools for analytics and data science team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.

Desired Candidate Skills:

  • Hands-on experience with distributed computing frameworks such as Hadoop, Hive, and the Spark ecosystem (Spark Core, PySpark, Spark Streaming).
  • Willingness to work with product teams to best optimize product features/functions (a MUST).
  • Professional curiosity and the ability to get up to speed on new technologies and tasks.
  • Good understanding of SQL and a good grasp of relational and analytical database management theory and practice.
  • Familiarity with cloud platforms (AWS, Azure, GCP).
  • General programming skills (Python, Scala).
  • Basic knowledge of data warehouses is a nice-to-have.

Key Skills: Python, SQL, Hadoop, Spark ecosystem, Databricks and Azure Data Factory (nice to have).
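As a rough illustration of the extraction, transformation, and loading work described above, the following minimal PySpark sketch reads raw events, cleans and aggregates them, and writes a curated table. All paths, column names, and the aggregation logic are hypothetical placeholders, not part of the role's actual stack.

    # Minimal illustrative PySpark ETL job: read raw events, clean them,
    # aggregate, and write a curated table. Paths and columns are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("example-etl").getOrCreate()

    # Extract: load raw CSV events (schema inferred for brevity).
    raw = (spark.read
           .option("header", True)
           .option("inferSchema", True)
           .csv("s3://example-bucket/raw/events/"))

    # Transform: drop rows missing a customer id and derive an event date.
    clean = (raw
             .filter(F.col("customer_id").isNotNull())
             .withColumn("event_date", F.to_date("event_timestamp")))

    # Aggregate: daily customer activity, a typical input to acquisition
    # and operational-efficiency metrics.
    daily = (clean
             .groupBy("event_date", "customer_id")
             .agg(F.count("*").alias("events"),
                  F.sum("purchase_amount").alias("revenue")))

    # Load: write partitioned Parquet for downstream analytics tools.
    (daily.write
     .mode("overwrite")
     .partitionBy("event_date")
     .parquet("s3://example-bucket/curated/daily_customer_activity/"))

    spark.stop()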

  • Active Listening
  • Category Flexibility
  • Complex Problem Solving
  • Computers and Electronics
  • Critical Thinking
  • Customer and Personal Service
  • Deductive Reasoning
  • Design
  • Engineering and Technology
  • English Language
  • Flexibility of Closure
  • Fluency of Ideas
  • Inductive Reasoning
  • Information Ordering
  • Judgment and Decision Making
  • Mathematical Reasoning
  • Mathematics
  • Near Vision
  • Operations Analysis
  • Oral Comprehension
  • Oral Expression
  • Originality
  • Problem Sensitivity
  • Programming
  • Reading Comprehension
  • Selective Attention
  • Speaking
  • Speech Clarity
  • Speech Recognition
  • Systems Analysis
  • Systems Evaluation
  • Written Comprehension
  • Written Expression

Bachelor of Technology (B.Tech) - Computer Science & Engineering

Bachelor of Technology (B.Tech) - Data Science