Job Description
Design and architect scalable data products using Apache Spark and Databricks. Specific duties include, but are not limited to, the following:

Architect and lead the design of scalable, secure, and high-performance data platforms using Databricks, Apache Spark (PySpark), Delta Lake, and Azure Data Lake Storage.
Build and manage robust, real-time data ingestion and processing pipelines using Apache Kafka, enabling low-latency analytics and event-driven applications.
Develop complex ETL/ELT workflows using Azure Data Factory, ensuring reliable data transformation, enrichment, and orchestration across cloud-native data services.
Integrate diverse internal and external data sources via REST APIs into Azure Data Lake, Databricks, and Snowflake, maintaining high data quality, consistency, and lineage.
Implement and optimize Delta Lake architecture for versioned, ACID-compliant data storage within modern lakehouse environments.
Lead performance tuning efforts for large-scale distributed computing jobs in Databricks, Spark SQL queries, and Snowflake data warehousing operations.
Design and implement data modeling strategies and data governance policies to support enterprise analytics, operational reporting, and advanced AI/ML use cases.
Use SQL and Python extensively for data analysis, transformation logic, and automation scripts in a unified data engineering workflow.
Apply Git-based version control and CI/CD practices to maintain codebase integrity, promote collaboration, and support automated deployment of data pipelines.
Serve as a strategic advisor on enterprise data architecture, guiding teams on platform scalability, modernization, and best practices in cloud-based data engineering.
Work with junior and senior data engineers, facilitate cross-functional collaboration, and stay current on emerging technologies relevant to big data, real-time streaming, and cloud computing.

Job requirements: Bachelor's degree in Computer Science, Computer Information Systems, Data Engineering, or an Engineering-related or technical field, plus 5 years of progressively responsible post-baccalaureate experience. A foreign degree equivalent is acceptable. In lieu of the above, we will also accept a Master's degree in Computer Science, Computer Information Systems, Data Engineering, or an Engineering-related or technical field, plus 2 years of experience. We will also accept any suitable combination of education, training, and/or experience. Experience should include a minimum of 2 years working with Databricks, Snowflake, Apache Spark (PySpark), Delta Lake, Apache Kafka, Azure Data Factory, Azure Data Lake Storage, SQL, Python, Git, and REST APIs, as well as data modeling and data governance.

HOURS: M-F, 8:00 a.m. to 5:00 p.m.

JOB LOCATION: Dallas, Texas. Travel is not required, but candidates must be willing to relocate to unanticipated locations across the country per company contract demand.

CONTACT: Email resume referencing job code# PDEANB to Maruthi Technologies Inc. DBA Anblicks at