Developed a data pipeline to ingest, process, and analyze customer behavior data for targeted marketing campaigns.
Designed and implemented ETL processes to extract data from various sources
Utilized Apache Spark for data processing and analysis
Built machine learning models to predict customer behavior
Collaborated with marketing team to optimize campaign strategies
Snowflake is a cloud-based data warehousing platform known for its scalability, performance, and ease of use.
Snowflake uses a multi-cluster, shared-data architecture that separates storage from compute, so each can scale independently for better performance.
It supports both structured and semi-structured data, allowing users to work with various data types.
Snowflake offers features like automatic scaling, data sharing,...
Group by is used to group rows that have the same values into summary rows, while distinct is used to remove duplicate rows from a result set.
Group by is used with aggregate functions like COUNT, SUM, AVG, etc.
Distinct is used to retrieve unique values from a column or set of columns.
Group by is used to perform operations on groups of rows, while distinct is used to filter out duplicate rows.
Group by is used in co...
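The distinction above can be made concrete with a small, hypothetical `orders` table (table and column names are illustrative, not from the interview), using Python's built-in sqlite3:

```python
import sqlite3

# Hypothetical orders table to contrast GROUP BY and DISTINCT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("alice", 10), ("alice", 20), ("bob", 5)])

# DISTINCT: removes duplicate rows, no aggregation involved.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer FROM orders ORDER BY customer").fetchall()

# GROUP BY: one summary row per group, paired with an aggregate.
grouped_rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer").fetchall()

print(distinct_rows)  # [('alice',), ('bob',)]
print(grouped_rows)   # [('alice', 30), ('bob', 5)]
```

Note how DISTINCT only deduplicates, while GROUP BY collapses each group into a summary row that an aggregate like SUM can describe.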
Window functions in SQL are used to perform calculations across a set of table rows related to the current row.
Window functions are used to calculate values based on a set of rows related to the current row.
They allow you to perform calculations without grouping the rows into a single output row.
Examples of window functions include ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE().
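A quick sketch of how ROW_NUMBER, RANK, and DENSE_RANK differ on ties, again with a made-up `scores` table (SQLite 3.25+ supports window functions, which modern Python builds bundle):

```python
import sqlite3

# Toy scores table; 'a' and 'b' tie so the three functions diverge.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("a", 90), ("b", 90), ("c", 80)])

rows = conn.execute("""
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS rn,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS drnk
    FROM scores ORDER BY rn
""").fetchall()

# ROW_NUMBER is always unique; RANK skips after ties; DENSE_RANK does not.
print(rows)  # [('a', 1, 1, 1), ('b', 2, 1, 1), ('c', 3, 3, 2)]
```

Each row keeps its identity in the output, which is exactly what distinguishes window functions from GROUP BY aggregation.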
Explanation of cumulative sum and rank functions in Spark
Cumulative sum function calculates the running total of a column
Rank function assigns a rank to each row based on the order of values in a column
Both functions can be used with window functions in Spark
Example (assuming `from pyspark.sql import functions as F, Window`): df.withColumn('cumulative_sum', F.sum('column').over(Window.orderBy('order_column').rowsBetween(Window.unboundedPreceding, Window.currentRow)))
Exampl...
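To show what those two window computations produce without a Spark cluster, here is the same logic sketched in plain Python (the data and column roles are illustrative):

```python
# Plain-Python sketch of Spark's windowed cumulative sum and RANK.
rows = [("a", 10), ("b", 30), ("c", 20)]

# Order rows the way Window.orderBy('order_column') would.
ordered = sorted(rows, key=lambda r: r[1])

# Running total over an unbounded-preceding-to-current-row frame.
cumulative, total = [], 0
for _, value in ordered:
    total += value
    cumulative.append(total)

# RANK(): 1 + number of strictly smaller values; ties share a rank
# and the following rank is skipped.
values = [v for _, v in ordered]
ranks = [1 + sum(1 for other in values if other < v) for v in values]

print(cumulative)  # [10, 30, 60]
print(ranks)       # [1, 2, 3]
```

The frame `rowsBetween(unboundedPreceding, currentRow)` is what makes the sum cumulative rather than a total over the whole partition.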
Commands that run on driver and executor in a word count Spark program.
The call that reads the input file and defines the RDD is issued on the driver, but the actual file read is performed by executors.
Transformations that split the lines and map words to counts execute on executors.
The shuffle-based aggregation of word counts (e.g. reduceByKey) also runs on executors; the driver triggers the action and coordinates writing the output.
Driver sends tasks to executors and coordinates the overall job.
Executor processes the tasks assigned by the driver.
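The executor-side map and reduce phases of that word count can be mimicked in plain Python (the input lines here are made up for illustration):

```python
from collections import Counter
from functools import reduce

# Driver-side in Spark: defining the job over the input lines.
lines = ["spark makes word count easy", "word count with spark"]

# Executor-side in Spark: flatMap(split) then reduceByKey(add),
# mimicked here by per-line Counters merged into one result.
per_line = [Counter(line.split()) for line in lines]      # map phase
counts = reduce(lambda a, b: a + b, per_line, Counter())  # reduce phase

print(counts["spark"])  # 2
print(counts["easy"])   # 1
```

In real Spark the merge step happens after a shuffle that brings all counts for the same word onto one executor.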
Identify the number of stages in a given program, focusing on its structure and flow.
Stages represent distinct phases in a program's execution; in Spark specifically, a job is split into stages at shuffle boundaries.
Narrow transformations (map, filter) are pipelined into a single stage, while wide transformations (reduceByKey, join) start a new stage.
Example: In a data pipeline, stages might be data extraction, transformation, and loading (ETL).
Each stage may have specific tasks and dependencies.
Sorting algorithms are methods used to arrange elements in a specific order, such as numerical or alphabetical.
Common sorting algorithms include Bubble Sort, Selection Sort, Insertion Sort, Merge Sort, Quick Sort, and Heap Sort.
Each sorting algorithm has its own time complexity and efficiency based on the size of the input data.
Sorting algorithm...
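As one concrete instance of the algorithms listed above, here is a minimal merge sort, which achieves the O(n log n) comparison-sort bound:

```python
def merge_sort(items):
    """Recursive merge sort: O(n log n) time, O(n) extra space."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves, preserving order of equal keys.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5]))  # [1, 2, 5, 5, 9]
```

By contrast, Bubble/Selection/Insertion Sort are O(n^2) in the worst case, and Quick Sort averages O(n log n) but can degrade to O(n^2) on adversarial input.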
Slowly changing data handling in Spark involves updating data over time.
Slowly changing dimensions (SCD) are used to track changes in data over time.
SCD Type 1 updates the data in place, overwriting the old values.
SCD Type 2 creates a new record for each change, with a start and end date.
SCD Type 3 adds a new column to the existing record to track changes.
Spark provides functions like `from_unixtime` and `unix_tim...
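The SCD Type 2 pattern described above (new record per change, with effective dates) can be sketched in plain Python; the table shape, keys, and dates here are illustrative assumptions, with the current version marked by `end = None`:

```python
from datetime import date

# Minimal SCD Type 2 sketch: one dict per dimension-row version.
dimension = [
    {"id": 1, "city": "Pune", "start": date(2023, 1, 1), "end": None},
]

def apply_scd2(dim, record_id, new_city, change_date):
    """Expire the current version of a record and append a new one."""
    for row in dim:
        if row["id"] == record_id and row["end"] is None:
            row["end"] = change_date  # close out the old version
    dim.append({"id": record_id, "city": new_city,
                "start": change_date, "end": None})

apply_scd2(dimension, 1, "Mumbai", date(2024, 6, 1))

current = [r for r in dimension if r["end"] is None]
print(len(dimension), current[0]["city"])  # 2 Mumbai
```

In Spark this is typically implemented as a join between the incoming batch and the existing dimension, or with a MERGE statement on table formats that support it.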
Answering about joins in SQL and modeling/visualization in PowerBI
Joins in SQL are used to combine data from two or more tables based on a related column
There are different types of joins such as inner join, left join, right join, and full outer join
PowerBI is a data visualization tool that allows users to create interactive reports and dashboards
Data modeling in PowerBI involves creating relationships between tab...
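The join types mentioned above behave differently on unmatched rows, which a two-table toy example (illustrative names, via sqlite3) makes visible:

```python
import sqlite3

# Toy tables: bob has no orders, so INNER and LEFT joins differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 100);
""")

# INNER JOIN: only rows with a match on both sides.
inner = conn.execute("""
    SELECT c.name, o.amount FROM customers c
    JOIN orders o ON o.customer_id = c.id
""").fetchall()

# LEFT JOIN: every customer, with NULL where no order matches.
left = conn.execute("""
    SELECT c.name, o.amount FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(inner)  # [('alice', 100)]
print(left)   # [('alice', 100), ('bob', None)]
```

A RIGHT join mirrors the LEFT case, and a FULL OUTER join keeps unmatched rows from both sides; the same matching semantics underlie relationships between tables in a Power BI model.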
I appeared for an interview in Jan 2025.
Basic PySpark/Python questions with some MCQs
I am passionate about data engineering and believe this company offers unique opportunities for growth and learning.
Exciting projects and challenges at this company
Opportunities for growth and learning in data engineering
Alignment with company values and culture
Potential for career advancement and development
Interest in the industry or specific technologies used by the company
I was approached by the company and interviewed in Jun 2024. There were 4 interview rounds.
Jinja in dbt allows for dynamic SQL generation using templating syntax.
Use `{{ }}` for expressions, e.g., `{{ ref('my_model') }}` to reference another model.
Use `{% %}` for control flow, e.g., `{% if condition %} ... {% endif %}` for conditional logic.
Loop through lists with `{% for item in list %} ... {% endfor %}`.
Define variables with `{% set var_name = value %}` and use them with `{{ var_name }}`.
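Putting those Jinja constructs together, a hypothetical dbt model (model and column names are made up for illustration) might look like:

```sql
-- Hypothetical dbt model combining set, for, if, and ref.
{% set statuses = ['open', 'closed'] %}

select
    id,
    {% for s in statuses %}
    sum(case when status = '{{ s }}' then 1 else 0 end) as {{ s }}_count
    {%- if not loop.last %},{% endif %}
    {% endfor %}
from {{ ref('my_model') }}
group by id
```

At compile time dbt renders the Jinja away, leaving plain SQL with one `sum(case ...)` column per status and the `ref` resolved to the upstream model's relation.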
I applied via Walk-in
A general round with a mix of maths, aptitude, English, puzzles, and quizzes. The level was easy.
I would analyze the situation, identify the root cause, and propose a solution based on data-driven insights.
Analyze the data to understand the problem
Identify the root cause of the issue
Propose a solution based on data-driven insights
I applied via a recruitment consultant and was interviewed before Dec 2023. There were 4 interview rounds.
I cannot recall the platform, but it featured a combination of Python and SQL questions, with the majority being focused on SQL.
I applied via Naukri.com and was interviewed in Feb 2023. There were 4 interview rounds.
SQL and Python coding rounds were conducted.
I applied via LinkedIn and was interviewed before Jan 2024. There were 3 interview rounds.
Basic SQL online assessment with some queries and MCQs
I applied via Naukri.com and was interviewed before Aug 2023. There were 4 interview rounds.
Quantitative Analysis, Data Interpretation, Logical Reasoning
I appeared for an interview before Jun 2024, where I was asked the following questions.
Some of the top questions asked at the Fractal Analytics Data Engineer interview -
The duration of the Fractal Analytics Data Engineer interview process can vary, but it typically takes less than 2 weeks to complete.
based on 20 interview experiences
Role | No. of salaries | Salary range
Consultant | 1.2k | ₹6 L/yr - ₹24.8 L/yr
Data Engineer | 924 | ₹7.8 L/yr - ₹26 L/yr
Senior Consultant | 735 | ₹11 L/yr - ₹40 L/yr
Data Scientist | 561 | ₹8 L/yr - ₹31 L/yr
Senior Data Scientist | 323 | ₹12 L/yr - ₹42 L/yr