Data Platform Engineer - Princeton NJ

  • New Jersey
  • Georgia It Inc
Data Platform Engineer
Location: Princeton NJ (prefer onsite)
Duration: 6 months

No third party C2C.

  • Design and implement an Azure cloud-based Data Warehousing and Governance architecture on the Lakehouse paradigm, integrating technical functionality and ensuring data accessibility, accuracy, and security.
  • Architect the Unity Catalog to provide centralized access control, auditing, lineage, and data discovery capabilities across Databricks workspaces.
  • Define and organize data assets (structured and unstructured) within the Unity Catalog.
  • Enable data analysts and ETL engineers to discover and classify data, notebooks, dashboards, and files across clouds and platforms.
  • Implement a single permission model for data and AI assets.
  • Define access policies at a granular level (rows, columns, features) to ensure secure and consistent access management across workspaces and platforms.
  • Leverage Delta Sharing to enable easy data sharing across regions and platforms.
  • Ensure that data and AI assets can be securely shared with minimal replication, maintaining a unified experience for users.
  • Monitoring and observability: use AI to automate monitoring, diagnose errors, and maintain data quality.
  • Set up alerts for personally identifiable information (PII) detection and operational intelligence.
  • Work closely with data scientists, analysts, and engineers to promote adoption of the Unity Catalog.
  • Provide training and documentation to ensure effective usage and compliance with governance policies.

Skills:

  • Designed data warehouse and data lake solutions, along with data processing pipelines, using PySpark on Databricks.
  • Performed data modelling on Databricks (Delta tables) for transactional and analytical needs.
  • Designed and developed pipelines to load data into the Data Lake.
  • Databricks platform proficiency, including components such as Databricks SQL, Delta Live Tables, Databricks Repos, and Task Orchestration.
  • Deep understanding of data governance principles, especially data cataloging, access control, lineage, and metadata management.
  • Strong SQL skills for querying and managing data.
  • Ability to design and optimize data models for structured and unstructured data.
  • Understanding of how to manage compute resources, including clusters and workspaces.
  • Ability to adapt to changes and emerging trends in data engineering and governance.
  • Hands-on development and configuration of Unity Catalog.