
Data Engineer: W2 Onsite Role

Information Technology | Jersey City, NJ | Contract

Job Description

Note: This is a W2 role; C2C is not possible.

Role: Data Engineer

Location: Jersey City, NJ (onsite)

Long-term contract


About the Role

We are looking for a highly skilled Senior Data Engineer with strong expertise in PySpark, Python, SQL, and database technologies, along with exposure to Data Science and AI/ML techniques. The ideal candidate will design and optimize scalable data pipelines, collaborate with cross-functional teams, and contribute to the development of analytical and machine learning-driven solutions.

Key Responsibilities

Data Engineering & Pipeline Development

• Design, develop, and optimize large-scale ETL/ELT pipelines using PySpark and distributed data processing frameworks.

• Build high-performance data ingestion workflows from diverse structured and unstructured sources.

• Implement scalable data models, data marts, and warehousing solutions.
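To make the pipeline work above concrete, here is a minimal extract-transform-load sketch. It uses only plain-Python stdlib for illustration; the table name, columns, and aggregation are hypothetical examples, and at the scale this role describes the same shape would map onto PySpark DataFrame transformations (e.g. `groupBy(...).sum(...)`).

```python
import csv
import io

# Hypothetical raw input; in practice this would be files or tables read
# into a Spark DataFrame rather than an inline string.
RAW = """order_id,amount,region
1,120.50,NE
2,80.00,SE
3,99.99,NE
"""

def extract(raw_csv: str) -> list[dict]:
    """Extract: parse raw CSV rows into one dict per record."""
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows: list[dict]) -> dict[str, float]:
    """Transform: cast types and aggregate amount per region
    (analogous to df.groupBy("region").sum("amount") in PySpark)."""
    totals: dict[str, float] = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + float(row["amount"])
    return totals

def load(totals: dict[str, float]) -> list[tuple[str, float]]:
    """Load: emit sorted (region, total) rows ready for a warehouse table."""
    return sorted(totals.items())

result = load(transform(extract(RAW)))
```

The three stages are kept as separate pure functions so each can be unit-tested and swapped independently, which is the same decomposition a production PySpark job would use.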


Programming & Database Expertise

• Write clean, modular, and optimized code using Python for data processing and automation.

• Develop complex SQL queries, stored procedures, and performance-tuned database operations.

• Work with relational and NoSQL databases (e.g., MySQL, PostgreSQL, SQL Server, MongoDB).
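As an illustration of the query performance tuning mentioned above, the sketch below shows how adding an index changes a filter from a full table scan to an index search. It uses the stdlib `sqlite3` driver purely for portability; the table and column names are hypothetical, and the same planner-driven reasoning applies in MySQL, PostgreSQL, or SQL Server via their own `EXPLAIN` facilities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (id INTEGER PRIMARY KEY, member_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO claims (member_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1_000)],
)

QUERY = "SELECT SUM(amount) FROM claims WHERE member_id = 7"

# Without an index on member_id, the planner must scan every row.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()

# An index on the filtered column lets the planner seek instead of scan.
conn.execute("CREATE INDEX idx_claims_member ON claims(member_id)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()

total = conn.execute(QUERY).fetchone()[0]
```

Inspecting `plan_before` versus `plan_after` (the `detail` column of each plan row) shows `SCAN` replaced by a `SEARCH ... USING INDEX`, which is the kind of before/after evidence that guides index design on large tables.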


Data Science + AI/ML Collaboration

• Partner with Data Science teams to productionize ML models and enable ML-driven pipelines.

• Contribute to model deployment, feature engineering, and ML workflow optimization.

• Integrate ML models into scalable data platforms.


Architecture & Best Practices

• Ensure data quality, reliability, lineage, and governance across data workflows.

• Drive best practices in coding, testing, CI/CD, and cloud-based deployments.

• Work with cross-functional teams to translate business requirements into robust data solutions.


Required Skills & Qualifications

• 5+ years of experience in Data Engineering with strong hands-on work in PySpark.

• Strong proficiency in Python, including libraries for data processing.

• Advanced knowledge of SQL and performance optimization techniques.

• Experience with distributed data systems (Spark, Databricks, Hive, or similar).

• Exposure to AI/ML workflows, including model deployment or MLOps.

• Solid understanding of data modeling, warehousing concepts, and ETL/ELT architectures.


Good to Have

• US Healthcare domain experience (HIPAA, claims data, EHR/EMR, HL7, FHIR, etc.).

• Experience with cloud platforms (Azure, AWS, GCP).

• Knowledge of MLflow, Airflow, or similar tools.