Daily Practice | Data Scientist & Business Analyst & LeetCode Interview Questions 313

数据应用学院 大数据应用 2018-07-14

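Since June 15, 2017, 数据应用学院 has been reviewing common interview questions in data science (DS) and business analytics (BA) with you. Since October 4, 2017, we have also shared one LeetCode algorithm problem every day.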

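We hope that those of you actively looking for work in these fields will follow our questions every day and think them through with us. We post the answers the following day.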

Day 213 

DS Interview Questions

Is orthogonal rotation necessary in PCA? If yes, why? What will happen if you don’t rotate the components?

BA Interview Questions

R language

Write a for() loop that uses next to print all values except “3” in the following variable:

i <- 1:5

LeetCode Questions

Description:

  • Given a digit string, return all possible letter combinations that the number could represent.

  • The mapping of digits to letters (as on telephone buttons) is: 2 → abc, 3 → def, 4 → ghi, 5 → jkl, 6 → mno, 7 → pqrs, 8 → tuv, 9 → wxyz.

  • Input: digit string “23”

  • Output: [“ad”, “ae”, “af”, “bd”, “be”, “bf”, “cd”, “ce”, “cf”]

Assumptions:

  • The digits are between 2 and 9


Want to know the answers? See the next installment!

Day 212 Answers Revealed

DS Interview Questions

What is Principal Component Analysis? What are its applications and limitations?

Suppose we want to analyze a dataset with n observations on a set of p features. When p is very large, no single feature is likely to be informative on its own, since each contains only a small fraction of the total information. Each of the n observations lives in p-dimensional space, but these p dimensions are not equally interesting. PCA seeks a small number of dimensions that are as interesting as possible, where “interesting” is measured by how much the observations vary along each dimension.


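Each of the dimensions found by PCA is a linear combination of the p features. For instance, the first principal component is the linear combination

$$Z_1 = \phi_{11} X_1 + \phi_{21} X_2 + \cdots + \phi_{p1} X_p$$

with the largest variance, subject to

$$\sum_{j=1}^{p} \phi_{j1}^2 = 1.$$

The vector $\phi_1 = (\phi_{11}, \phi_{21}, \ldots, \phi_{p1})^T$ defines a direction in feature space along which the data vary the most. If we project the n data points onto this direction, the projected values are the principal component scores.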
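
As a rough illustration of the first principal component described above, the following Python sketch centers the data, finds the unit-length direction of greatest variance by power iteration on the sample covariance matrix, and projects the observations onto it to obtain the scores. The function name first_principal_component, the iteration count, and the toy data are assumptions made for this example and are not part of the original answer. In practice a linear-algebra library would normally do this work.

```python
import math

def first_principal_component(rows, iterations=200):
    """Approximate the first loading vector and the first-component scores.

    rows: list of n observations, each a list of p numbers.
    Returns (phi, scores) where phi has unit length.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]

    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]

    # Power iteration: apply cov repeatedly and renormalize, so the
    # unit-length constraint on the loadings holds at every step.
    # Caveat: this fixed start vector fails if it happens to be exactly
    # orthogonal to the leading direction; a random start avoids that.
    phi = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * phi[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:  # every feature is constant: nothing varies
            break
        phi = [v / norm for v in nxt]

    # Scores: projection of each centered observation onto phi.
    scores = [sum(c[j] * phi[j] for j in range(p)) for c in centered]
    return phi, scores

# Hypothetical toy data: two strongly correlated features.
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]]
phi1, z1 = first_principal_component(data)
print("loadings:", phi1)
print("scores:", z1)
```

Because the two toy features rise and fall together, both loadings come out positive, and the variance of the scores is at least as large as the variance of either original feature.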


Applications: we can adapt regression, classification, and clustering methods to use the first K ≪ p principal component score vectors as features, which often gives less noisy results because most of the signal is concentrated in the first few components. Another application is data compression: for example, we can keep only the first few principal components of image data to compress image files.


Limitations:

Sometimes the variance of the data is not a good measure of what we are interested in. For instance, suppose we want to separate the following data by color:

[Figure: scatter plot of color-coded points with a purple PCA projection line and a green LDA projection line]

PCA yields the purple line as the projection line, while reduced-rank LDA yields the green line. If we project the data according to PCA, the colors are not well separated at all, whereas if we project according to LDA, the groups are clearly separated.
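
The contrast above can be reproduced on a small synthetic example. The sketch below is an illustration under stated assumptions, not the data behind the original figure: it builds two hypothetical classes that are spread widely along x but differ only along y, computes the PCA direction from the pooled data, and uses Fisher's two-class discriminant direction as a stand-in for reduced-rank LDA. All function names and data are made up for this example.

```python
import math

def mean(values):
    return sum(values) / len(values)

def scatter(points):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about the mean."""
    mx = mean([p[0] for p in points])
    my = mean([p[1] for p in points])
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy

def pca_direction(points):
    """Unit vector of largest variance; class labels are ignored."""
    sxx, sxy, syy = scatter(points)
    # Principal-axis angle of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)

def lda_direction(class_a, class_b):
    """Fisher direction, proportional to Sw^{-1} (mean_a - mean_b)."""
    axx, axy, ayy = scatter(class_a)
    bxx, bxy, byy = scatter(class_b)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # within-class scatter
    dx = mean([p[0] for p in class_a]) - mean([p[0] for p in class_b])
    dy = mean([p[1] for p in class_a]) - mean([p[1] for p in class_b])
    det = sxx * syy - sxy * sxy
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return wx / norm, wy / norm

def mean_gap(class_a, class_b, direction):
    """Distance between the projected class means along a direction."""
    def projected_mean(points):
        return mean([x * direction[0] + y * direction[1] for x, y in points])
    return abs(projected_mean(class_a) - projected_mean(class_b))

# Hypothetical toy data: wide spread in x, classes differ only in y.
xs = range(-10, 11, 2)
class_a = [(x, 1.0 + 0.2 * (-1) ** k) for k, x in enumerate(xs)]
class_b = [(x, -1.0 + 0.2 * (-1) ** k) for k, x in enumerate(xs)]

pca = pca_direction(class_a + class_b)
lda = lda_direction(class_a, class_b)
print("PCA direction:", pca, "gap:", mean_gap(class_a, class_b, pca))
print("LDA direction:", lda, "gap:", mean_gap(class_a, class_b, lda))
```

On this toy data the PCA direction is approximately the x-axis, along which the two class means coincide, while the Fisher direction is approximately the y-axis, along which the class means sit about 2 units apart.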


BA Interview Questions

R language:

Write a while() loop that increments the variable “i” from 2 to 5 and prints it, using the next statement to skip printing the number 3.

Note: the next statement is used within a loop to skip the rest of the current iteration and proceed to the next one.


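i <- 1
while (i < 5) {
  i <- i + 1        # increment first, so that next can never cause an infinite loop
  if (i == 3) next  # skip the rest of this iteration when i is 3
  print(i)          # prints 2, 4 and 5
}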


LeetCode Questions

Description:
Write a function to find the longest common prefix string amongst an array of strings.
Input: [“aasdfgas”, “aaasafda”]
Output: “aa”


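Solution:
There are several ways to solve this.
The neatest is to sort the strings and then compare only the first and the last.
A binary search on the prefix length also passes the tests.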

Code:
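
A minimal sketch of the sort-and-compare approach described above, written in Python (the choice of language and the function name longest_common_prefix are assumptions for this example). After a lexicographic sort, every string lies between the first and the last, so any prefix shared by those two is shared by all of them.

```python
def longest_common_prefix(strs):
    """Return the longest prefix shared by every string in strs."""
    if not strs:
        return ""
    ordered = sorted(strs)              # lexicographic order
    first, last = ordered[0], ordered[-1]
    i = 0
    # Only the two extremes need to be compared character by character.
    while i < len(first) and i < len(last) and first[i] == last[i]:
        i += 1
    return first[:i]

print(longest_common_prefix(["aasdfgas", "aaasafda"]))  # aa
```

Since only the smallest and largest strings matter, min(strs) and max(strs) would give the same result without a full sort.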

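Time Complexity: O(n log n) string comparisons for the sort, where n is the number of strings. Each comparison can cost up to the string length m, so the worst case is O(m · n log n) character comparisons.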





