
Daily Practice | Data Scientist & Business Analyst & Leetcode Interview Questions 288

2018-02-03 数据应用学院 大数据应用

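Since June 15, 2017, 数据应用学院 has been reviewing common interview questions in data science (DS) and business analytics (BA) together with you. Starting October 4, 2017, we also share one Leetcode algorithm problem every day.

If you are actively looking for work in these fields, we hope you will follow our daily questions and think them through with us. We post the answers the following day.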

Day 188 

DS Interview Questions

What are the assumptions of linear regression?

BA Interview Questions

What is the procedure to check the cumulative frequency distribution of any categorical variable in R?

LeetCode Questions


Description:

  • Given two binary trees, write a function to check if they are equal or not.

  • Two binary trees are considered equal if they are structurally identical and the nodes have the same value.

Input: two identical trees

Output: true

Curious about the answers? Find out in the next issue!

Day 187 Answers Revealed

DS Interview Questions

How can you ensure that you don’t analyse something that ends up producing meaningless results?


Start by checking whether the chosen model is appropriate. Retrace the analysis from the point where you did the univariate or bivariate analysis, examined the distribution of the data and the correlation between variables, and built the linear model. Classical linear regression inference assumes that the errors (residuals) are approximately normally distributed with constant variance; the predictors themselves do not need to be normal. If the residuals clearly violate this, the p-values and confidence intervals from the linear model are unreliable, and a transformation or a different model may be needed. Checking these assumptions is one way to find out whether an analysis based on linear regression will yield meaningless results.

Another way is to draw training and test sets from the data several times by resampling. Fit a model on each training set and predict on the matching test set to check whether the resulting models are similar to one another and all perform well.
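The resampling check described above can be sketched in Python as follows. This is a minimal illustration, not the original author's code: the helper names (`fit_line`, `r_squared`, `repeated_splits`), the 70/30 split, the number of splits, and the synthetic data are all assumptions made for the example. It fits a one-predictor least-squares line on several random splits and reports how much the slope and the test-set R-squared vary.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def repeated_splits(xs, ys, n_splits=20, test_fraction=0.3, seed=0):
    """Fit on several random train/test splits; return (slope, test R^2) pairs."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = int(len(xs) * test_fraction)
    results = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        score = r_squared([xs[i] for i in test], [ys[i] for i in test], intercept, slope)
        results.append((slope, score))
    return results


# Synthetic data, made up for this sketch: y = 2 + 3x plus Gaussian noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + data_rng.gauss(0, 1) for x in xs]

results = repeated_splits(xs, ys)
slopes = [slope for slope, _ in results]
scores = [score for _, score in results]
print(f"slope range across splits: {min(slopes):.2f} to {max(slopes):.2f}")
print(f"test R^2 range across splits: {min(scores):.2f} to {max(scores):.2f}")
```

If the slopes or test scores swing widely from split to split, that is a hint that the model is unstable on this data and its conclusions may not be meaningful.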

By looking at the p-values, the R-squared value, the fit of the function, and how the treatment of missing values may have affected the data, a data scientist can judge whether an analysis is likely to produce meaningless results.

BA Interview Questions

How to check the frequency distribution of a categorical variable in R?

The frequency distribution of a categorical variable can be checked using the table() function in R. It counts the observations in each category of a categorical variable.

gender = factor(c("M", "F", "M", "F", "F", "F"))

table(gender)

Output of the above R code:

gender

F  M

4  2


The percentage of values in each category can also be calculated by storing the output in a data frame and adding a percent column, as shown below:

t = data.frame(table(gender))

t$percent= round(t$Freq / sum(t$Freq)*100,2)


Leetcode Questions

Description:

  • Given a string, find the length of the longest substring without repeating characters.

Input: “abcabcbb”

Output: 3


Solution:

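A classic interval problem solved with two pointers moving in the same direction (a sliding window).

Maintain a window that satisfies the condition (no repeated characters) and update the maximum length at each step.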

Code:

  • Time Complexity: O(n)

  • Space Complexity: O(1), assuming a fixed-size character set
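The original code listing did not survive in this copy of the post. As a stand-in, here is a minimal Python sketch of the same-direction two-pointer (sliding window) idea described in the solution. The function name `length_of_longest_substring` and the use of a dictionary of last-seen positions are choices made for this example, not necessarily what the original solution used.

```python
def length_of_longest_substring(s: str) -> int:
    """Length of the longest substring of s with no repeated character."""
    last_seen = {}  # character -> index of its most recent occurrence
    left = 0        # left edge of the current window
    best = 0
    for right, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge just past it.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best


print(length_of_longest_substring("abcabcbb"))  # 3
```

Both pointers only move forward, so each character is visited once, which gives the O(n) time bound. The dictionary holds at most one entry per distinct character, so the extra space is bounded by the size of the character set.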


