Daily Practice | Data Scientist & Business Analyst & Leetcode Interview Questions 288
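Since June 15, 2017, 数据应用学院 (Data Application Lab) has been reviewing common interview questions in data science (DS) and business analytics (BA) with you. Since October 4, 2017, we have also shared one Leetcode algorithm problem every day.
If you are actively looking for work in these fields, we hope you will follow our daily questions and think them through with us. We will post the answers the next day.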
DS Interview Questions
What are the assumptions of linear regression?
BA Interview Questions
What is the procedure to check the cumulative frequency distribution of any categorical variable in R?
LeetCode Questions
Description:
Given two binary trees, write a function to check if they are equal or not.
Two binary trees are considered equal if they are structurally identical and the nodes have the same value.
Input: two identical trees
Output: true
Want to know the answers? See the next issue!
Day 187 Answers Revealed
DS Interview Questions
How can you ensure that you don’t analyse something that ends up producing meaningless results?
Check whether the chosen model is appropriate. Start from the point where you did the univariate or bivariate analysis, examined the distribution of the data and the correlation of the variables, and built the linear model. Classical linear regression inference assumes that the errors (residuals) are approximately normally distributed with constant variance; the predictors themselves do not have to be normal. If the residuals clearly violate these assumptions, the model's p-values and confidence intervals are unreliable, and a transformation or a different model may be needed. This is an inductive approach to finding out whether an analysis using linear regression will yield meaningless results.
Another way is to split the data into training and test sets multiple times by resampling. Fit a model on each training set and predict on the matching test set to find out whether the resulting models are similar and perform well.
By looking at the p-values, the R-squared value, and the fit of the function, and by analysing how the treatment of missing values could have affected the results, data scientists can judge whether an analysis will produce meaningless results.
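The repeated train/test resampling check described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a prescribed procedure: the helper names (`fit_line`, `r_squared`, `repeated_split_check`), the single-predictor model, the 30% test share, and the synthetic data are all hypothetical choices made for the example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = statistics.fmean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def repeated_split_check(xs, ys, n_rounds=20, test_share=0.3, seed=0):
    """Fit on a fresh random training split each round.

    Returns the fitted slopes and the R^2 values measured on the held-out data.
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(2, int(len(xs) * test_share))
    slopes, scores = [], []
    for _ in range(n_rounds):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        slopes.append(slope)
        scores.append(
            r_squared([xs[i] for i in test], [ys[i] for i in test], intercept, slope)
        )
    return slopes, scores


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
    rng = random.Random(42)
    xs = [i / 10 for i in range(200)]
    ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]
    slopes, scores = repeated_split_check(xs, ys)
    print("spread of fitted slopes:", statistics.pstdev(slopes))
    print("mean held-out R^2:", statistics.fmean(scores))
```

If the fitted slopes vary widely from split to split, or the held-out R^2 is much lower than the in-sample fit, that is a sign the analysis may not be producing a meaningful result.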
BA Interview Questions
How to check the frequency distribution of a categorical variable in R?
The frequency distribution of a categorical variable can be checked using the table function in R. The table() function counts the occurrences of each category of a categorical variable.
gender = factor(c("M", "F", "M", "F", "F", "F"))
table(gender)
Output of the above R code:
gender
F M
4 2
Programmers can also calculate the percentage of values in each category by storing the output in a data frame and dividing each count by the total, as shown below:
t = data.frame(table(gender))
t$percent = round(t$Freq / sum(t$Freq) * 100, 2)
Leetcode Questions
Description:
Given a string, find the length of the longest substring without repeating characters.
Input: “abcabcbb”
Output: 3
Solution:
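This is a classic interval problem solved with two pointers moving in the same direction (a sliding window).
Maintain a window that satisfies the condition (no repeated characters) and update the maximum length at each step.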
Code:
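A minimal Python sketch of the same-direction two-pointer approach described above, not the original solution code. The function name `length_of_longest_substring` and the `last_seen` dictionary are illustrative choices.

```python
def length_of_longest_substring(s: str) -> int:
    """Return the length of the longest substring of s with no repeated characters."""
    last_seen = {}  # character -> index of its most recent occurrence
    left = 0        # left edge of the current window (inclusive)
    best = 0
    for right, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge just past it.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        # The window s[left..right] now has no repeats; record its length.
        best = max(best, right - left + 1)
    return best


if __name__ == "__main__":
    print(length_of_longest_substring("abcabcbb"))  # 3
```

Both pointers only move forward, so each character is visited once. The dictionary holds at most one entry per distinct character, so its size is bounded by the alphabet size.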
Time Complexity: O(n)
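Space Complexity: O(1) for a fixed-size alphabet (for example, a 256-entry table for ASCII); O(min(n, k)) in general for an alphabet of size k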