RSS Feed
  1. 2015年总结

    December 22, 2015 by xudifsd

    2015 年就要过去了,差不多又要写点总结了。

    专业

    自从高中那时候开始就非常喜欢编程,而且很幸运的这兴趣能一直持续到现在并很有可能继续下去。

    今年专业上一个比较大的变化就是开始经常性地主动找些论文读了。本科的时候读了下 Google 那三篇著名的论文,但是并没有很深入的读。年初在刷题的时候看到一个 FB 工程师的关于最近读了哪些论文的总结,当时就想确实可以通过坚持读些论文来了解下业界的情况。由于喜欢分布式,而且准备以后做分布式系统,所以之后就开始培养自己读论文的习惯:专门收集一些分布式相关的论文在闲暇时读下,虽然没法实际实现,但是可以知道一些很有意思的方法以及抽象。

    五月的时候去阿里云实习了一段时间,做的是类 GFS 系统,还是第一次接触真正的分布式系统,看着程序运行在几千台机器上还是觉得很兴奋的。由于是用 C++ 写的,所以就利用带薪编译时间(笑)读些之前收集的论文。

    今年毕业,要写毕业论文,和本科毕业一样,还是想做自己喜欢做的事。本科时绝大多数人都是做 “XX管理系统”,而我由于那时候非常喜欢 Lisp,所以做了个很简单的 Lisp 解释器。今年的毕业论文也写编译相关的符号执行引擎,而且毕业论文又提到了 Lisp (是真爱)。

    健康

    回学校做毕业答辩的时候和一个同学聊天,说他有次低头太久,晕了,然后在医院做了一个月的颈椎牵引治疗,医生还跟他说换个工作吧。我听到之后的唯一感觉就是 panic ,我那时候我也开始感觉自己的颈椎有点痛,而且几年前在豆瓣认识一个程序员,说自己做了两年 iOS 开发,然后因为腰的问题医生让他换个职业。听我同学这么一说,觉得必须要定期锻炼。这一年因为毕业、刷题、找工作等,很多锻炼都被忽略,经常是除了睡觉就看着电脑。很多人说程序员命短,之前也就是笑笑,觉得离自己很遥远,但是现在必须开始注意。

    人在 23 岁的时候智商、体能都到达顶峰,之后的日子则开始下降,所以之后自己的健康完全没法依赖生长来保证。我可不希望听医生说“换个工作吧”,所以以后不管再忙都需要保证一周至少10小时的运动量,而且除了工作、学习之外不要能再坐在电脑前,少用手机,尽可能多去户外,远离网络生活。

    总结

    总的来说今年有些事情过得不顺心,有时觉得还挺艰苦,能回想起的最开心的事情竟然就只是去美国玩了一趟,除此之外再也想不起其他有意思的事了。今年还是第一年让我感觉这么不舒服,有时甚至觉得有些抑郁,可能也是这个原因让我越来越喜欢安静的地方和歌曲,比如已经听了好几年的《Wake Me Up When September Ends》和最近喜欢上的《True Colors》和《Hello Alone》。

    虽然有些不顺和抑郁,但是现在回想起来并不算太坏,因为还是很害怕那种一直一帆风顺的人生,还是需要偶尔来些棘手的问题经常锻炼才能培养处世不惊和调节心态的能力,而且我相信只有在走下坡路的时候才会觉得轻松。

    之后的一段日子应该是一个人最黄金的日子了:没有家庭的负担,自己也有足够的能力和见识去建造自己的未来。好好享受吧。


  2. note on Justice

    December 4, 2015 by xudifsd

    Although I love programming and make it as my profession, I still found studying philosophy of right interesting.

    While idling at home, I watched <Justice> for the 3rd times, so this post is served as my notes on it.

    classification of principle of justice

    There are two kind of principle to reason what is right thing to do:

    • consequential
    • categorical

    consequential way of justice judges an action based on the result produced by it, whereas categorical way of justice judges an action not by result but by categories the action belongs to, so from this point of view, murder is murder, no matter whether it is for saving other lives or for money.

    Utilitarian is considered as consequential, because its underlying principle is “to maximize the utility of the whole society“, this principle is focusing on the result produced by an action, and say nothing about the categories that action belongs to. This principle, however, failed to respect individual rights. So libertarian do not like utilitarian for its dehumanization.

    Libertarian

    For libertarians, which I think I’m one of them, individual rights should not be violated either by other individuals or by government.

    John Lock is considered as a libertarian, but he is a little bit different from libertarian in a sense that he think “right of life, liberty and property are unalienable“, and since it’s unalienable, we can not freely trade them.

    There’s a tricky part of the Lock’s theory: since right of life, liberty and property are unalienable, how does government can tax people? Well, Lock says it is possible for the majority of people to agree on a general procedure not an arbitrary procedure to decide how to tax.

    Kant (freedom as autonomous)

    Kant reject utilitarian, he think that the individual person, all human beings have a certain dignity that commands our respect. The reason the individual is sacred or the bearer of rights doesn’t stem from the idea that we own ourselves but instead from the idea that we are all rational beings and autonomous beings. And he deny that pain and pleasure are our sovereign masters, he thinks that it’s our rational capacity that makes us distinctive, it makes us something more than just physical creatures with appetites.

    And Kant have a special and interesting idea about freedom:

    When we, like animals, seek after pleasure or the satisfaction of our desires or the avoidance of pain, when we do that, we aren’t really acting freely, because if we do so, we were acting as the slaves of those appetites and impulses, when I act to satisfy it, I’m just acting according to natural necessity. So for Kant, freedom is the opposite of necessity.

    So, to act freely is to act autonomously, and to act autonomously is to act according to a law I give myself, not according to the physical laws of nature or the laws of cause and effect. To act freely is not to choose the best means to a given end, it’s to choose the end itself for its own sake. Insofar as we act on inclination or pursue pleasure, we act as means to the realization of ends given outside us, we are instruments rather than authors of the purposes we pursue. Respecting human dignity means regarding persons not just as means but also as ends in themselves. And this is why it’s wrong to use people for the sake of other peoples’ well-being or happiness.

    I’m in love with above idea, actually I think I agreed with him many years ago even before I know Kant. Have you noticed that I used “Rational life” as my tagline of this blog? I used that subconsciously when I was building this blog site and never changed it. So there would be no double that I love Kant’s idea.

    What gives an action its moral worth? Consists not in the consequences or in the results that flow from it but in the motive, in the quality of the will.

    A good will isn’t good because of what it effects or accomplishes, it’s good in itself. Even if by utmost effort the good will accomplishes nothing it would still shine like jewel for its own sake as something which has its full value in itself. We should “act in such a way that you always treat humanity , whether in your own person or in the person of any other, never simply as a means, but always as the same time, as an end.

    Although Kant’s idea appeals me, I found it lacks something to make that idea a great guidance for making law, it didn’t provide a practical approach to do that.

    Rawls

    Rawls agree with Kant, and works “never happened contract” out with the device of what he calls the “veil of ignorance“. This, I think, provides the missing practical approach to serve as guidance to make law.

    A subproblem needs to be solved first is: what makes contract restrictive? The sign of the contract is not a sufficient condition of the agreement being fair. And an actual agreement is not a sufficient condition of there being an obligation.

    Actual contracts have their moral force in virtue of two distinguishable ideals: autonomy and reciprocity. But in real life every actual contract may fall short.

    One student says that we would choose a system based on merit behind the veil of ignorance, but professor rejected that idea, because the effort itself is also largely shaped by family and education environment. And here comes my favorite part of the lecture: professor did a very interesting poll on how many students in Harvard University are first in birth order, it turns out that the majority of people are first.

    The application of theory

    Theories of distributive justice:

    • Libertarian – free market system (against a background of formal equality) but this will in favor of those who happen to be born to affluent families.
    • Meritocratic – fair equality of opportunity (to bring everyone to the same start point of line)
    • Egalitarian – Rawls’ difference principle (people may gain from the lucky they have, but only on terms that work to the advantage of the least well off)

    Teleological reasoning

    If you look at a range of thinkers this lecture has been considering, there does seem to be a reason they want to detach justice from desert that goes well beyond any concern for equality. They all agree that justice is not a matter of rewarding or honoring virtue or moral desert.

    Somehow they think tying justice to moral merit or virtue is going to lead away from freedom, from respect for persons as free beings. But Aristotle is different.

    Aristotle think that justice means giving each person his or her due.

    For Kant and for Rawls, the point of politics is not to shape the moral character of citizens. It’s not to make us good. It’s to respect our freedom to choose our goods, our values, our ends consistent with a similar liberty for others. But Aristotle disagree: “Any polis which is truly so called, and is not merely one in name, must devote itself to the end of encouraging goodness. Otherwise, political association sinks into a mere alliance.

    Because in pluralist societies people would disagree about the nature of the good life, we shouldn’t try to base justice on any particular answer to that question. So Kant and Rawls reject teleology, they reject the idea of tying justice to some conception of the good. What’s at stake in the debate about teleology:

    If you tie justice to a particular conception of the good, if you see justice as a matter of fit between a person and his or her roles, you don’t leave room for freedom, and to be free is to be independent of any particular roles or traditions or conventions that may be handed down by my parents or my society.

    Well, I think this is why I do not like some parts of Chinese culture, it seems they are actually promoting some goods or ends and forcing us to do something towards those them, failed to provide freedom to chose. So I do agree that the best law is not the one that promotes certain goods, but provides a fair framework for people to achieve their own ends while respecting individual rights.

    Conclusion

    Kant and Rawls failed to provide ideas about obligations of membership, loyalty, and how to interact with other people. Kant is focusing on individual, and provides a set of way to live a rational life and respect people at the same time. And since I love his idea so much, his book <Groundwork of the Metaphysic of Morals> should be appended to the list of books I must read.

    And in a sense that lacking respect for freedom to choose, the idea that law should promote certain goodness should be rejected. The only functionality of the law is to provide a framework for people lived in to achieve their own ends.

    In later part of the lecture, the professor talked about how to respect fellow citizen which I found intriguing:

    For libertarians, how to respect our fellow citizens’ moral and religious is just to ignore them. But that isn’t the only way even the most plausible way to do so. There is a different conception of respect according to which we respect our fellow citizens’ moral and religious convictions, not by ignoring, but by engaging them. By attending to them, sometimes by challenging and contesting them, sometimes by listening and learning from them. There’s no guarantee that this will lead to agreement, there’s no guarantee it will lead to appreciation for the moral and religious convictions of others. But compare to ignoring others, the respect of deliberation and engagement is more adequate, more suitable ideal for pluralist society. This kind of moral engagement will better enable us to appreciate the distinctive goods our different lives expressed.


  3. 子序列中的滑动窗口

    September 28, 2015 by xudifsd

    很多子序列题都可以用滑动窗口解决,这个方案适用于需要找一个子序列的开始和结束位置。注意,这里指的子序列类似于子字符串,而非subsequence

    例如 leetcode 的 Substring with Concatenation of All Words 。最简单的方法就是找到一个开始位置,然后判断这个位置是否可行,代码如下:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    #!/usr/bin/env python
    
    class Solution:
        def findSubstring(self, s, words):
            if len(words) == 0 or len(s) == 0:
                return []
            dic = {}
            count = 0
            for w in words:
                dic[w] = dic.get(w, 0) + 1
                count += 1
            lw = len(words[0])
            lws = lw * len(words)
            result = []
            for i in xrange(len(s) - lws + 1):
                if self.possible(s, lw, i, dic, count):
                    result.append(i)
            return result
    
        def possible(self, s, lw, i, dic, count):
            dic2 = dic.copy()
            if count == 0:
                return True
            while count > 0:
                cur = s[i:i+lw]
                if dic2.get(cur) != None and dic2.get(cur) > 0:
                    dic2[cur] -= 1
                    if dic2[cur] < 0:
                        return False
                    count -= 1
                    i += lw
                else:
                    return False
            return True
    

    但是时间复杂度度也很高,有O(n2),这里面有很多冗余计算,比如对于输入

    “aabbccaadd”, [“aa”, “bb”, “cc”, “dd”]

    就会处理 “bbccaa” 两次。最好的办法是使用滑动窗口:维护开始和结束位置,一旦开始和结束位置一样就退出循环,否则在可行的情况下增加结束位置,如果不可行则增加开始位置。

    滑动窗口可以将复杂度降低到O(n)。在这一题中,由于字典中的字符串都是同样大小,所以可以以这个大小作为步长移动。但是需要添加另一个循环来覆盖以不同的偏移开始计算的情况。代码如下:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    #!/usr/bin/env python
    
    class Solution(object):
        def findSubstring(self, s, words):
            if len(s) == 0 or len(words) == 0:
                return []
            m = {}
            count = 0
            for w in words:
                m[w] = m.get(w, 0) + 1
                count += 1
            lw = len(words[0])
            result = []
            for delta in xrange(lw):
                i = delta
                while i < len(s) - lw + 1:
                    key = s[i:i + lw]
                    if m.get(key) > 0:
                        m[key] -= 1
                        count -= 1
                        j = i + lw
                        while i < j:
                            if count == 0:
                                result.append(i)
                            if j < len(s) - lw + 1:
                                key = s[j:j + lw]
                                if m.get(key) > 0:
                                    m[key] -= 1
                                    count -= 1
                                    j += lw
                                    continue
                            m[s[i:i + lw]] += 1
                            count += 1
                            i += lw
                    else:
                        i += lw
            return result
    

    其中,for循环用来处理不同偏移的情况,for循环中的循环用于实现滑动窗口。这个实现复杂度为O(lw * n),如果将lw看成常数则复杂度为O(n)。

    滑动窗口适用于很多子序列类的算法,如:Minimum Size Subarray SumMinimum Window SubstringLongest Substring Without Repeating Characters


  4. Make a predictable whole out of unpredictable parts

    August 8, 2015 by xudifsd

    Recently, I read many papers & articles about latency, especially tail latency, so this post is served as my notes on my reading.

    Latency is a very important metric for a system, because it affects user experience a lot. In distributed system, tail latency is much more important than in single node system. This is based on a simple observation:

    if you have components that 1% of requests exhibit high latency, and if request from client have to touch 100 components in a distributed system, then you’ll have 64% of client requests exhibit high latency. That’s awful user experience. This is a fundamental property of scaling systems: you need to worry not just about latency, but tail latency. High performance equals high tolerances. At scale you can’t ignore tail latency.

    From this article, we can know that latency can arise from many sources, from hardware to software. It also links to interesting discussion on TCP vs SPDY.

    Authors of another paper discussed their effort on tuning web server software, they found out that head-of-line blocking in kernel queue has great impact on latency. This kind of blocking is sometime helpful for throughput on disk write, but can severely degrades latency. Also, if you have blocking invocation in an event loop, it will also causes head-of-line blocking, and will makes tail latency higher, by removing blocking invocation out of event loop, we can solve this kind of problem.

    Another article from linkedin discussed how they solve high tail latency by installing another net card to serve specific network request.

    Two examples above show how difficult it is to identify the source of latency. As always, optimization requires a lot of insight on how system works.

    Another interesting paper proposed a runtime coordination system that coordinate GC in distributed system written in GC language. GC usually causes hiccup for system, especially major collection. This paper shows that minor GC have little impact on batch workload like Spark, but will affect real-time application significantly, GC is a key contributor to stragglers in many interactive system. By deploying coordinate system, PageRank computation on Spark completed 15% faster, 99.9% tail update latency on Cassandra is improved from 3.3ms to 1.6ms, the worst case from 83ms to 19ms, which is quite impressive.

    High tail latency is not only caused by GC, head-of-line blocking etc, it’s prevalent. This paper shows that 1% tail latency is unpredictable no matter what config, programming language or OS is. After removing 1% tail latency from statistic, the server’s behavior is predictable. So that, many systems could focusing on provide QoS guarantees for statistical measures such as 99th latency percentile.

    Ok, enough talk about background, let’s read some really awesome stuff.

    Jeff Dean gives a talk about tail latency in 2013. Here is the slide and summary article. Instead of trying to identify the source of latency & eliminate it, he tries to live with it. The talk gives some general techniques to cut the tail, like: hedged requests, tied requests, micro-partitions, selective replication and latency-induced probation. These techniques is general enough to applied to many scenarios.

    The talk compared faults tolerance and variability tolerance. To tolerate variability, one has to:

    make a predictable whole out of unpredictable parts

    and this is where the title of this post came from. It’s very similar to:

    make a reliable whole out of unreliable parts

    which is the key point of distributed system. And since latency is more and more important as a metric, predictable latency should also be more and more important. I think this is the direction that next age distributed system should heading.


  5. Cope with high latency caused by fork(2)

    July 8, 2015 by xudifsd

    In normal applications, we won’t care much about latency caused by non-IO system calls, because they’re usually very fast, and this is more true for fork(2), since fork(2) in Linux is much more lightweight than most other OSs.

    But we met this very problem. The application we’re talking about is master node server for distributed cluster, we need to fork to make checkpoint of our internal data structures, the checkpoint can speed up recovery phase if server is down and up again. Because the scale of cluster is very large, application in master node will consume a lot of memory, in our production cluster, we have 200+G memory machine, and the application consumed almost 200G of memory, this will causes the fork(2) exhibit high latency, almost 10s as we observe, this has also been documented by redis.

    Linux deployed copy-on-write in fork(2), but it still needs to copy page table, for 200G memory, it constituted by 200G/4K == 50M pages, and since 1 page can have 4K/8 == 512 PTEs, so PTEs alone occupied 50M/512 == 100K pages, that’s 400M of memory, and kernel is not just copying that 400M of memory, it also needs to do some bookkeeping.

    Alought Linux introduced lazy copy of page table in 2.6.14, but this laziness only works for pages backed by file, which is not applicable in our cases.

    Because fork(2) will also causes other threads pause during the call. This latency is unacceptable for some threads, especially for those are sensible to time, for example, threads that are responsible for sending/receiving heartbeat.

    If we don’t sending or responding heartbeat for a long time, other nodes will think we’re down, and if this server is primary master, quorum may run next election to elect a new primary. To make it worse, the number of ‘running’ server may be less than quorum, this will makes server completely not responding during that time.

    Luckily enough, we only do checkpoint in secondary master, instead of primary master, so we don’t need to deal with election problem. But since we usually have only 3 masters, we need to make sure there’re always 2 masters are responding, otherwise the server will not responding.

    Previously we workaround this by making heartbeat missing tolerant time longer than latency caused by these time consuming operations, but this will also makes failing node detecting much slower. Also we need to make sure 2 secondary masters are not doing fork(2) at the same time. Currently we notify the thread that we’re going to have a long time pause before we actually call fork(2), and notify it again after call, and in that thread, we pretend we received the heart beat just after the pause, but these’re all workaround, instead of real solution.

    IMHO, this fragile to latency is the design fault. I’ve test that forking with 30G memory takes 1s, so if your application is targeting at that scale, you should take fork(2) latency seriously. For our case, I think we may setup a process that is only responsible for reading the log outputted by master on the same machine, and generating checkpoint periodically, but this may consume twice of memory, we can treat this as trading space for time. Or we can doing checkpoint for one data structure only at the time, this strategy will not consume that much of memory, but will makes data structures more complicated.

    Well, I’m more and more convinced by

    it’s important to estimate performance of a system design without actually having to build it.

    To make this happend, programmer should be more experienced and more familiar with latency caused by normal operations. It’s still a long way for me to go.