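Daily Practice | Data Scientist & Business Analyst & Leetcode Interview Questions 313
Since June 15, 2017, 数据应用学院 has been reviewing common interview questions in data science (DS) and business analytics (BA) together with you. Since October 4, 2017, we have also shared one Leetcode algorithm problem every day.
If you are actively looking for work in these fields, we hope you will follow our daily questions and think them through with us. We post the answers the following day.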
DS Interview Questions
Is orthogonal rotation necessary in PCA? If yes, why? What will happen if you don't rotate the components?
BA Interview Questions
R language
Write a for() loop that uses next to print all values except "3" in the following variable:
i <- 1:5
LeetCode Questions
Description:
Given a digit string, return all possible letter combinations that the number could represent.
A mapping of digits to letters (just like on telephone buttons) is given below.
2: abc, 3: def, 4: ghi, 5: jkl, 6: mno, 7: pqrs, 8: tuv, 9: wxyz
Input: Digit string “23”
Output: [“ad”, “ae”, “af”, “bd”, “be”, “bf”, “cd”, “ce”, “cf”]
Assumptions:
The digits are between 2 and 9
Want to know the answers? Find out in the next issue!
Day 212 Answers Revealed
DS Interview Questions
What is Principal Component Analysis? What are its applications and limitations?
Suppose we want to analyze a dataset with n observations on a set of p features. When p is very large, it is likely that none of the features alone will be informative, since each contains only a very small fraction of the total information. Each of the n observations lives in p-dimensional space, but these p dimensions are not equally interesting. PCA seeks to find a small number of dimensions that are as interesting as possible, where the concept of 'interesting' is measured by the amount that the observations vary along each dimension.
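Each of the dimensions found by PCA is a linear combination of the p features. For instance, the first principal component is:
$$Z_1 = \phi_{11} X_1 + \phi_{21} X_2 + \dots + \phi_{p1} X_p$$
subject to
$$\sum_{j=1}^{p} \phi_{j1}^2 = 1,$$
where the loadings $\phi_{11}, \dots, \phi_{p1}$ are chosen to maximize the variance of $Z_1$.
The vector $\phi_1 = (\phi_{11}, \phi_{21}, \dots, \phi_{p1})^T$ defines a direction in feature space along which the data vary the most. If we project the n data points onto this direction, the projected values are the principal component scores.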
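As a rough illustration of the idea above, the following Python sketch finds the first loading vector by power iteration on the sample covariance matrix and then projects the centered data onto it to get the scores. The language, the function name `first_principal_component`, the iteration count, and the starting vector are all assumptions made for this example. In practice one would normally use an eigendecomposition or SVD routine from a numerical library.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loading_vector, scores) for the first principal component.

    rows: list of n observations, each a list of p numeric feature values.
    """
    n, p = len(rows), len(rows[0])

    # Center each feature so that variance is measured around the mean.
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]

    # Sample covariance matrix (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]

    # Power iteration: repeatedly apply cov and rescale to unit length.
    # Assumption: the starting vector is not orthogonal to the leading
    # direction; if it is, the loop stops early and the result is not valid.
    start = [float(j + 1) for j in range(p)]
    start_norm = math.sqrt(sum(v * v for v in start))
    phi = [v / start_norm for v in start]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * phi[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        phi = [v / norm for v in nxt]

    # Scores: projection of each centered observation onto the direction.
    scores = [sum(phi[j] * r[j] for j in range(p)) for r in centered]
    return phi, scores


# Toy data lying exactly on the line y = x.
loadings, scores = first_principal_component([[1, 1], [2, 2], [3, 3], [4, 4]])
```

For this toy data the two loadings come out equal and their squares sum to 1, which matches the unit-norm constraint on the loading vector.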
Applications: we can adapt regression, classification, and clustering methods by using the first K << p principal component score vectors as features, which often leads to less noisy results. Other applications include data compression: for example, we can keep only the first few principal components of image data to compress image files.
Limitations:
Sometimes the variance of the data is not a good measure of what we are interested in. For instance, suppose we want to cluster the following data by color:
PCA will result in the purple line as the projection line while reduced-rank LDA will result in the green line as the projection line. It is easy to see that if we project our data according to PCA, the data will not be well separated at all, while if we project our data by LDA, clustering is clear.
BA Interview Questions
R language:
Write a while() loop that prints the variable "i" as it is incremented from 2 to 5, and uses the next statement to skip printing the number 3.
** The next statement is used within a loop to skip the rest of the current iteration and proceed to the next one.
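i <- 1
while (i < 5) {
  i <- i + 1         # increment before next, otherwise the loop never ends
  if (i == 3) next   # skip printing 3
  print(i)           # prints 2, 4, 5
}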
Leetcode Questions
Description:
Write a function to find the longest common prefix string amongst an array of strings.
Input:[“aasdfgas”, “aaasafda”]
Output:“aa”
Solution:
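There are several ways to solve this.
The neatest is to sort the strings and then compare only the first and the last one.
A binary-search solution also passes the tests.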
Code:
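A minimal Python sketch of the sort-then-compare approach described above. The language and the function name `longest_common_prefix` are assumptions for illustration, since the source does not say which language its solution used. After a lexicographic sort, every string lies between the first and the last, so the prefix those two share is shared by all of them.

```python
def longest_common_prefix(strs):
    """Return the longest prefix shared by every string in strs."""
    if not strs:
        return ""

    # After sorting, the first and last strings differ the most,
    # so only these two need to be compared.
    ordered = sorted(strs)
    first, last = ordered[0], ordered[-1]

    i = 0
    while i < len(first) and i < len(last) and first[i] == last[i]:
        i += 1
    return first[:i]


result = longest_common_prefix(["aasdfgas", "aaasafda"])
print(result)  # prints "aa", as in the example above
```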
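Time Complexity: O(n log n) string comparisons for the sorting approach, where n is the number of strings. Each comparison can cost up to the length of the strings being compared.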