CS224N Assignment A3: Dependency Parsing


1. Machine Learning & Neural Networks (8 points)

(a) Adam Optimizer

(i)

At each update, $m$ blends the accumulated update direction with the current gradient: components that point the same way reinforce each other and speed up progress, while components that disagree partially cancel and are damped. This momentum-style averaging reduces the oscillation (variance) of the updates, so gradient descent moves more steadily along the consistently useful direction, which leads to faster convergence.
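
For reference, the momentum update being discussed has roughly the form given in the handout, with $\beta_1$ controlling how much of the update history is kept:

$$\mathbf{m} \leftarrow \beta_1 \mathbf{m} + (1 - \beta_1)\nabla_{\boldsymbol{\theta}} J_{\text{minibatch}}(\boldsymbol{\theta})$$

$$\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \alpha\, \mathbf{m}$$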

(ii)

Parameters whose gradients have historically been small (i.e. whose accumulated squared gradient $v$ is small) receive relatively larger updates, while parameters with a large accumulated squared gradient receive smaller ones. Dividing by $\sqrt{v}$ normalizes the step size per parameter, which avoids overshooting on steep directions and keeps progress on flat ones, effectively giving each parameter a moderated, adaptive learning rate.
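
The adaptive part adds a running average of squared gradients (written here in the handout's elementwise notation, where $\odot$ and the division are applied per parameter):

$$\mathbf{v} \leftarrow \beta_2 \mathbf{v} + (1 - \beta_2)\left(\nabla_{\boldsymbol{\theta}} J_{\text{minibatch}}(\boldsymbol{\theta}) \odot \nabla_{\boldsymbol{\theta}} J_{\text{minibatch}}(\boldsymbol{\theta})\right)$$

$$\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \alpha\, \mathbf{m} \,/\, \sqrt{\mathbf{v}}$$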

(b) Dropout

(i)

Masking with $d\odot h$ keeps each unit only with probability $1-p_{drop}$, so in expectation the hidden vector is scaled down by a factor of $1-p_{drop}$. To restore the original expected scale we need $\gamma = \frac{1}{1-p_{drop}}$.
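
As a quick illustration (a minimal NumPy sketch, not the assignment's code; `apply_dropout` is a made-up helper), inverted dropout applies exactly this $\gamma$ so the expected activation is unchanged:

import numpy as np

def apply_dropout(h, p_drop, rng=np.random.default_rng(0)):
    """Zero each unit with probability p_drop, then rescale the survivors by
    gamma = 1 / (1 - p_drop) so that E[h_drop] matches h."""
    d = (rng.random(h.shape) >= p_drop).astype(h.dtype)  # keep-mask
    gamma = 1.0 / (1.0 - p_drop)
    return gamma * d * h

h = np.ones(100000)
print(apply_dropout(h, p_drop=0.3).mean())  # close to 1.0 in expectation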

(ii)

Dropout is applied during training to make the model more robust and to prevent overfitting (it acts as a regularizer). At evaluation time we want the full, deterministic capacity of the trained model, so there is no need for dropout and it is turned off.
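
In PyTorch this distinction is handled by the module mode: nn.Dropout is stochastic in training mode and a no-op in evaluation mode. A small standalone check (not part of the assignment code):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()     # training mode: units dropped, survivors scaled by 1/(1 - 0.5) = 2
print(drop(x))   # e.g. tensor([2., 0., 2., 2., 0., 2., 0., 2.])

drop.eval()      # evaluation mode: dropout does nothing
print(drop(x))   # tensor([1., 1., 1., 1., 1., 1., 1., 1.])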

2. Neural Transition-Based Dependency Parsing (46 points)

(a)

| Stack | Buffer | New dependency | Transition |
| --- | --- | --- | --- |
| [ROOT] | [I, attend, lectures, in, the, NLP, class] |  | Initial Configuration |
| [ROOT, I] | [attend, lectures, in, the, NLP, class] |  | SHIFT |
| [ROOT, I, attend] | [lectures, in, the, NLP, class] |  | SHIFT |
| [ROOT, attend] | [lectures, in, the, NLP, class] | attend->I | LEFT-ARC |
| [ROOT, attend, lectures] | [in, the, NLP, class] |  | SHIFT |
| [ROOT, attend] | [in, the, NLP, class] | attend->lectures | RIGHT-ARC |
| [ROOT, attend, in] | [the, NLP, class] |  | SHIFT |
| [ROOT, attend, in, the] | [NLP, class] |  | SHIFT |
| [ROOT, attend, in, the, NLP] | [class] |  | SHIFT |
| [ROOT, attend, in, the, NLP, class] | [] |  | SHIFT |
| [ROOT, attend, in, the, class] | [] | class->NLP | LEFT-ARC |
| [ROOT, attend, in, class] | [] | class->the | LEFT-ARC |
| [ROOT, attend, class] | [] | class->in | LEFT-ARC |
| [ROOT, attend] | [] | attend->class | RIGHT-ARC |
| [ROOT] | [] | ROOT->attend | RIGHT-ARC |
| [ROOT] | [] |  | (parse complete) |

(b)

$2n$ steps. Each word is SHIFTed onto the stack exactly once and is later consumed exactly once as the dependent of a LEFT-ARC or RIGHT-ARC, giving $n$ SHIFTs plus $n$ arc transitions. For example, the 7-word sentence in part (a) takes $7 + 7 = 14$ transitions, matching the table above.

(c)

class PartialParse(object):
    def __init__(self, sentence):
        """Initializes this partial parse.

        @param sentence (list of str): The sentence to be parsed as a list of words.
                                       Your code should not modify the sentence.
        """
        # The sentence being parsed is kept for bookkeeping purposes. Do NOT alter it in your code.
        self.sentence = sentence

        ### YOUR CODE HERE (3 Lines)
        ### Your code should initialize the following fields:
        ###     self.stack: The current stack represented as a list with the top of the stack as the
        ###                 last element of the list.
        ###     self.buffer: The current buffer represented as a list with the first item on the
        ###                  buffer as the first item of the list
        ###     self.dependencies: The list of dependencies produced so far. Represented as a list of
        ###                        tuples where each tuple is of the form (head, dependent).
        ###                        Order for this list doesn't matter.
        ###
        ### Note: The root token should be represented with the string "ROOT"
        ### Note: If you need to use the sentence object to initialize anything, make sure to not directly
        ###       reference the sentence object. That is, remember to NOT modify the sentence object.

        self.stack = ['ROOT']
        self.buffer = sentence[:]  # shallow copy so the original sentence is never referenced directly
        self.dependencies = []
        ### END YOUR CODE


    def parse_step(self, transition):
        """Performs a single parse step by applying the given transition to this partial parse

        @param transition (str): A string that equals "S", "LA", or "RA" representing the shift,
                                 left-arc, and right-arc transitions. You can assume the provided
                                 transition is a legal transition.
        """
        ### YOUR CODE HERE (~7-12 Lines)
        ### TODO:
        ###     Implement a single parsing step, i.e. the logic for the following as
        ###     described in the pdf handout:
        ###         1. Shift
        ###         2. Left Arc
        ###         3. Right Arc
        if transition == "S":
            # SHIFT: move the first word of the buffer onto the top of the stack
            self.stack.append(self.buffer[0])
            self.buffer = self.buffer[1:]
        else:
            r = self.stack[-1]   # top of the stack
            l = self.stack[-2]   # second item on the stack
            if transition == "LA":
                # LEFT-ARC: the second item becomes a dependent of the top item
                self.dependencies.append((r, l))
                self.stack = self.stack[:-2]
                self.stack.append(r)
            else:
                # RIGHT-ARC: the top item becomes a dependent of the second item
                self.dependencies.append((l, r))
                self.stack = self.stack[:-1]
        ### END YOUR CODE

    def parse(self, transitions):
        """Applies the provided transitions to this PartialParse

        @param transitions (list of str): The list of transitions in the order they should be applied

        @return dependencies (list of string tuples): The list of dependencies produced when
                                                      parsing the sentence. Represented as a list of
                                                      tuples where each tuple is of the form (head, dependent).
        """
        for transition in transitions:
            self.parse_step(transition)
        return self.dependencies
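
As a sanity check (my own example, not part of the test script), replaying the transition sequence from part (a) with this class reproduces the dependencies in the table:

pp = PartialParse(["I", "attend", "lectures", "in", "the", "NLP", "class"])
deps = pp.parse(["S", "S", "LA", "S", "RA", "S", "S", "S", "S",
                 "LA", "LA", "LA", "RA", "RA"])
print(deps)
# [('attend', 'I'), ('attend', 'lectures'), ('class', 'NLP'),
#  ('class', 'the'), ('class', 'in'), ('attend', 'class'), ('ROOT', 'attend')]
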
parser_transitions.py:203: SyntaxWarning: "is" with a literal. Did you mean "=="?
return [("RA" if pp.stack[1] is "right" else "LA") if len(pp.buffer) == 0 else "S"
SHIFT test passed!
LEFT-ARC test passed!
RIGHT-ARC test passed!
parse test passed!

(d)

def minibatch_parse(sentences, model, batch_size):
    """Parses a list of sentences in minibatches using a model.

    @param sentences (list of list of str): A list of sentences to be parsed
                                            (each sentence is a list of words and each word is of type string)
    @param model (ParserModel): The model that makes parsing decisions. It is assumed to have a function
                                model.predict(partial_parses) that takes in a list of PartialParses as input and
                                returns a list of transitions predicted for each parse. That is, after calling
                                    transitions = model.predict(partial_parses)
                                transitions[i] will be the next transition to apply to partial_parses[i].
    @param batch_size (int): The number of PartialParses to include in each minibatch

    @return dependencies (list of dependency lists): A list where each element is the dependencies
                                                     list for a parsed sentence. Ordering should be the
                                                     same as in sentences (i.e., dependencies[i] should
                                                     contain the parse for sentences[i]).
    """
    dependencies = []

    ### YOUR CODE HERE (~8-10 Lines)
    ### TODO:
    ###     Implement the minibatch parse algorithm. Note that the pseudocode for this algorithm is given in the pdf handout.
    ###
    ###     Note: A shallow copy (as denoted in the PDF) can be made with the "=" sign in python, e.g.
    ###           unfinished_parses = partial_parses[:].
    ###           Here `unfinished_parses` is a shallow copy of `partial_parses`.
    ###           In Python, a shallow copied list like `unfinished_parses` does not contain new instances
    ###           of the object stored in `partial_parses`. Rather both lists refer to the same objects.
    ###           In our case, `partial_parses` contains a list of partial parses. `unfinished_parses`
    ###           contains references to the same objects. Thus, you should NOT use the `del` operator
    ###           to remove objects from the `unfinished_parses` list. This will free the underlying memory that
    ###           is being accessed by `partial_parses` and may cause your code to crash.

    # One PartialParse per sentence, in the original order.
    partial_parses = [PartialParse(sentence) for sentence in sentences]
    # Shallow copy: both lists refer to the same PartialParse objects.
    unfinished_parses = partial_parses[:]
    while len(unfinished_parses) > 0:
        # Take the first `batch_size` unfinished parses and advance each by one transition.
        minibatch = unfinished_parses[:batch_size]
        transitions = model.predict(minibatch)
        for transition, unfinished_parse in zip(transitions, minibatch):
            unfinished_parse.parse_step(transition)
        # A parse is finished once its buffer is empty and only ROOT remains on its stack.
        unfinished_parses = [
            m for m in unfinished_parses if not (len(m.buffer) == 0 and len(m.stack) == 1)
        ]
    ### END YOUR CODE

    for p in partial_parses:
        dependencies.append(p.dependencies)
    return dependencies
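
For a quick end-to-end check, here is a toy stand-in for the model (similar in spirit to the dummy model used by the test script, but not identical to it):

class AllShiftThenRightArcs(object):
    """Toy `model`: SHIFT while the buffer is non-empty, then emit RIGHT-ARCs."""
    def predict(self, partial_parses):
        return ["S" if len(pp.buffer) > 0 else "RA" for pp in partial_parses]

sentences = [["right", "arcs", "only"], ["again"]]
print(minibatch_parse(sentences, AllShiftThenRightArcs(), batch_size=2))
# [[('arcs', 'only'), ('right', 'arcs'), ('ROOT', 'right')], [('ROOT', 'again')]]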

parser_transitions.py:203: SyntaxWarning: "is" with a literal. Did you mean "=="?
return [("RA" if pp.stack[1] is "right" else "LA") if len(pp.buffer) == 0 else "S"
minibatch_parse test passed!

(e)

class ParserModel(nn.Module):
    """ Feedforward neural network with an embedding layer and two hidden layers.
    The ParserModel will predict which transition should be applied to a
    given partial parse configuration.

    PyTorch Notes:
        - Note that "ParserModel" is a subclass of the "nn.Module" class. In PyTorch all neural networks
          are a subclass of this "nn.Module".
        - The "__init__" method is where you define all the layers and parameters
          (embedding layers, linear layers, dropout layers, etc.).
        - "__init__" gets automatically called when you create a new instance of your class, e.g.
          when you write "m = ParserModel()".
        - Other methods of ParserModel can access variables that have "self." prefix. Thus,
          you should add the "self." prefix layers, values, etc. that you want to utilize
          in other ParserModel methods.
        - For further documentation on "nn.Module" please see https://pytorch.org/docs/stable/nn.html.
    """
    def __init__(self, embeddings, n_features=36,
                 hidden_size=200, n_classes=3, dropout_prob=0.5):
        """ Initialize the parser model.

        @param embeddings (ndarray): word embeddings (num_words, embedding_size)
        @param n_features (int): number of input features
        @param hidden_size (int): number of hidden units
        @param n_classes (int): number of output classes
        @param dropout_prob (float): dropout probability
        """
        super(ParserModel, self).__init__()
        self.n_features = n_features
        self.n_classes = n_classes
        self.dropout_prob = dropout_prob
        self.embed_size = embeddings.shape[1]
        self.hidden_size = hidden_size
        self.embeddings = nn.Parameter(torch.tensor(embeddings))

        ### YOUR CODE HERE (~9-10 Lines)
        ### TODO:
        ###     1) Declare `self.embed_to_hidden_weight` and `self.embed_to_hidden_bias` as `nn.Parameter`.
        ###        Initialize weight with the `nn.init.xavier_uniform_` function and bias with `nn.init.uniform_`
        ###        with default parameters.
        ###     2) Construct `self.dropout` layer.
        ###     3) Declare `self.hidden_to_logits_weight` and `self.hidden_to_logits_bias` as `nn.Parameter`.
        ###        Initialize weight with the `nn.init.xavier_uniform_` function and bias with `nn.init.uniform_`
        ###        with default parameters.
        ###
        ### Note: Trainable variables are declared as `nn.Parameter` which is a commonly used API
        ###       to include a tensor into a computational graph to support updating w.r.t its gradient.
        ###       Here, we use Xavier Uniform Initialization for our Weight initialization.
        ###       It has been shown empirically, that this provides better initial weights
        ###       for training networks than random uniform initialization.
        ###       For more details checkout this great blogpost:
        ###       http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization
        ###
        ### Please see the following docs for support:
        ###     nn.Parameter: https://pytorch.org/docs/stable/nn.html#parameters
        ###     Initialization: https://pytorch.org/docs/stable/nn.init.html
        ###     Dropout: https://pytorch.org/docs/stable/nn.html#dropout-layers
        ###
        ### See the PDF for hints.

        # The input dimension is n_features * embed_size.
        # Weight: maps from the input dimension to the hidden dimension.
        self.embed_to_hidden_weight = nn.Parameter(torch.empty(self.embed_size * self.n_features, self.hidden_size))
        # Equivalently:
        # self.embed_to_hidden_weight = nn.Parameter(nn.init.xavier_uniform_(...))
        nn.init.xavier_uniform_(self.embed_to_hidden_weight)
        # The bias matches the hidden dimension.
        self.embed_to_hidden_bias = nn.Parameter(torch.empty(self.hidden_size))
        nn.init.uniform_(self.embed_to_hidden_bias)
        # Dropout layer.
        self.dropout = nn.Dropout(p=dropout_prob)
        self.hidden_to_logits_weight = nn.Parameter(torch.empty(self.hidden_size, self.n_classes))
        nn.init.xavier_uniform_(self.hidden_to_logits_weight)
        self.hidden_to_logits_bias = nn.Parameter(torch.empty(self.n_classes))
        nn.init.uniform_(self.hidden_to_logits_bias)
        ### END YOUR CODE

    def embedding_lookup(self, w):
        """ Utilize `w` to select embeddings from embedding matrix `self.embeddings`
        @param w (Tensor): input tensor of word indices (batch_size, n_features)

        @return x (Tensor): tensor of embeddings for words represented in w
                            (batch_size, n_features * embed_size)
        """
        ### YOUR CODE HERE (~1-4 Lines)
        ### TODO:
        ###     1) For each index `i` in `w`, select `i`th vector from self.embeddings
        ###     2) Reshape the tensor using `view` function if necessary
        ###
        ### Note: All embedding vectors are stacked and stored as a matrix. The model receives
        ###       a list of indices representing a sequence of words, then it calls this lookup
        ###       function to map indices to sequence of embeddings.
        ###
        ###       This problem aims to test your understanding of embedding lookup,
        ###       so DO NOT use any high level API like nn.Embedding
        ###       (we are asking you to implement that!). Pay attention to tensor shapes
        ###       and reshape if necessary. Make sure you know each tensor's shape before you run the code!
        ###
        ### Pytorch has some useful APIs for you, and you can use either one
        ### in this problem (except nn.Embedding). These docs might be helpful:
        ###     Index select: https://pytorch.org/docs/stable/torch.html#torch.index_select
        ###     Gather: https://pytorch.org/docs/stable/torch.html#torch.gather
        ###     View: https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view
        ###     Flatten: https://pytorch.org/docs/stable/generated/torch.flatten.html

        # torch.index_select picks out the embedding rows given by the indices.
        x = []
        for batch in w:
            # dim=0 selects rows; `batch` holds the feature indices for one example.
            em = torch.index_select(self.embeddings, dim=0, index=batch)
            # After selection the shape is (n_features, embed_size).
            x.append(em)
        # The list elements are tensors, so convert via numpy before stacking,
        # then flatten each example to n_features * embed_size.
        x = torch.tensor([item.detach().numpy() for item in x]).view(w.shape[0], -1)
        # One-liner alternative:
        # x = torch.tensor([torch.index_select(self.embeddings, dim=0, index=batch).detach().numpy() for batch in w])
        ### END YOUR CODE
        return x


    def forward(self, w):
        """ Run the model forward.

        Note that we will not apply the softmax function here because it is included in the loss function nn.CrossEntropyLoss

        PyTorch Notes:
            - Every nn.Module object (PyTorch model) has a `forward` function.
            - When you apply your nn.Module to an input tensor `w` this function is applied to the tensor.
              For example, if you created an instance of your ParserModel and applied it to some `w` as follows,
              the `forward` function would be called on `w` and the result would be stored in the `output` variable:
                  model = ParserModel()
                  output = model(w) # this calls the forward function
            - For more details checkout: https://pytorch.org/docs/stable/nn.html#torch.nn.Module.forward

        @param w (Tensor): input tensor of tokens (batch_size, n_features)

        @return logits (Tensor): tensor of predictions (output after applying the layers of the network)
                                 without applying softmax (batch_size, n_classes)
        """
        ### YOUR CODE HERE (~3-5 lines)
        ### TODO:
        ###     Complete the forward computation as described in write-up. In addition, include a dropout layer
        ###     as declared in `__init__` after ReLU function.
        ###
        ### Note: We do not apply the softmax to the logits here, because
        ### the loss function (torch.nn.CrossEntropyLoss) applies it more efficiently.
        ###
        ### Please see the following docs for support:
        ###     Matrix product: https://pytorch.org/docs/stable/torch.html#torch.matmul
        ###     ReLU: https://pytorch.org/docs/stable/nn.html?highlight=relu#torch.nn.functional.relu

        # Forward pass: look up embeddings to form the input.
        emb = self.embedding_lookup(w)
        emb = self.dropout(emb)
        # @ is matrix multiplication; no detach().numpy() is needed here.
        hidden_state = torch.nn.functional.relu(emb @ self.embed_to_hidden_weight + self.embed_to_hidden_bias)
        logits = hidden_state @ self.hidden_to_logits_weight + self.hidden_to_logits_bias
        ### END YOUR CODE
        return logits
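
A small shape check (hypothetical sizes: a 100-word vocabulary with 50-dimensional embeddings and a batch of 4 configurations):

import numpy as np
import torch

embeddings = np.random.randn(100, 50).astype(np.float32)
model = ParserModel(embeddings)        # n_features=36, n_classes=3 by default
w = torch.randint(0, 100, (4, 36))     # 4 configurations, 36 feature indices each
logits = model(w)
print(logits.shape)                    # torch.Size([4, 3]): one score per transition (S, LA, RA)
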
def train(parser, train_data, dev_data, output_path, batch_size=1024, n_epochs=10, lr=0.0005):
    """ Train the neural dependency parser.

    @param parser (Parser): Neural Dependency Parser
    @param train_data ():
    @param dev_data ():
    @param output_path (str): Path to which model weights and results are written.
    @param batch_size (int): Number of examples in a single batch
    @param n_epochs (int): Number of training epochs
    @param lr (float): Learning rate
    """
    best_dev_UAS = 0


    ### YOUR CODE HERE (~2-7 lines)
    ### TODO:
    ###      1) Construct Adam Optimizer in variable `optimizer`
    ###      2) Construct the Cross Entropy Loss Function in variable `loss_func` with `mean`
    ###         reduction (default)
    ###
    ### Hint: Use `parser.model.parameters()` to pass optimizer
    ###       necessary parameters to tune.
    ### Please see the following docs for support:
    ###     Adam Optimizer: https://pytorch.org/docs/stable/optim.html
    ###     Cross Entropy Loss: https://pytorch.org/docs/stable/nn.html#crossentropyloss
    optimizer = torch.optim.Adam(parser.model.parameters(), lr=lr)
    loss_func = nn.CrossEntropyLoss()

    ### END YOUR CODE

    for epoch in range(n_epochs):
        print("Epoch {:} out of {:}".format(epoch + 1, n_epochs))
        dev_UAS = train_for_epoch(parser, train_data, dev_data, optimizer, loss_func, batch_size)
        if dev_UAS > best_dev_UAS:
            best_dev_UAS = dev_UAS
            print("New best dev UAS! Saving model.")
            torch.save(parser.model.state_dict(), output_path)
        print("")


def train_for_epoch(parser, train_data, dev_data, optimizer, loss_func, batch_size):
    """ Train the neural dependency parser for single epoch.

    Note: In PyTorch we can signify train versus test and automatically have
    the Dropout Layer applied and removed, accordingly, by specifying
    whether we are training, `model.train()`, or evaluating, `model.eval()`

    @param parser (Parser): Neural Dependency Parser
    @param train_data ():
    @param dev_data ():
    @param optimizer (nn.Optimizer): Adam Optimizer
    @param loss_func (nn.CrossEntropyLoss): Cross Entropy Loss Function
    @param batch_size (int): batch size

    @return dev_UAS (float): Unlabeled Attachment Score (UAS) for dev data
    """
    parser.model.train()  # Places model in "train" mode, i.e. apply dropout layer
    n_minibatches = math.ceil(len(train_data) / batch_size)
    loss_meter = AverageMeter()

    with tqdm(total=(n_minibatches)) as prog:
        for i, (train_x, train_y) in enumerate(minibatches(train_data, batch_size)):
            optimizer.zero_grad()  # remove any baggage in the optimizer
            loss = 0.  # store loss for this batch here
            train_x = torch.from_numpy(train_x).long()
            train_y = torch.from_numpy(train_y.nonzero()[1]).long()

            ### YOUR CODE HERE (~4-10 lines)
            ### TODO:
            ###      1) Run train_x forward through model to produce `logits`
            ###      2) Use the `loss_func` parameter to apply the PyTorch CrossEntropyLoss function.
            ###         This will take `logits` and `train_y` as inputs. It will output the CrossEntropyLoss
            ###         between softmax(`logits`) and `train_y`. Remember that softmax(`logits`)
            ###         are the predictions (y^ from the PDF).
            ###      3) Backprop losses
            ###      4) Take step with the optimizer
            ### Please see the following docs for support:
            ###     Optimizer Step: https://pytorch.org/docs/stable/optim.html#optimizer-step
            logits = parser.model.forward(train_x)
            # nn.CrossEntropyLoss takes the raw logits and the integer class labels.
            loss = loss_func(logits, train_y)
            loss.backward()
            optimizer.step()

            ### END YOUR CODE
            prog.update(1)
            loss_meter.update(loss.item())

    print("Average Train Loss: {}".format(loss_meter.avg))

    print("Evaluating on dev set",)
    parser.model.eval()  # Places model in "eval" mode, i.e. don't apply dropout layer
    dev_UAS, _ = parser.parse(dev_data)
    print("- dev UAS: {:.2f}".format(dev_UAS * 100.0))
    return dev_UAS
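
The wiring above (Adam + CrossEntropyLoss, then one backward/step per minibatch) can be exercised on random tensors, independent of the parser. A standalone sketch, with nn.Linear standing in for parser.model:

import torch
import torch.nn as nn

model = nn.Linear(10, 3)                                   # stand-in for parser.model
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)
loss_func = nn.CrossEntropyLoss()                          # mean reduction by default

x = torch.randn(8, 10)                                     # fake minibatch of features
y = torch.randint(0, 3, (8,))                              # fake transition labels
optimizer.zero_grad()
loss = loss_func(model(x), y)                              # softmax + NLL in one call
loss.backward()
optimizer.step()
print(loss.item())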

================================================================================
INITIALIZING
================================================================================
Loading data...
took 2.34 seconds
Building parser...
took 1.25 seconds
Loading pretrained embeddings...
took 3.19 seconds
Vectorizing data...
took 1.98 seconds
Preprocessing training data...
took 51.43 seconds
took 0.03 seconds

================================================================================
TRAINING
================================================================================
Epoch 1 out of 10
0%| | 0/1848 [00:00<?, ?it/s]D
:\软件\安装目录\pycharm\projects\a3\parser_model.py:128: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at C:\acti
ons-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\utils\tensor_new.cpp:233.)
x = torch.tensor([item.detach().numpy() for item in x]).view(w.shape[0],-1)
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [08:49<00:00, 3.49it/s]
Average Train Loss: 0.26055007827069077
Evaluating on dev set
1445850it [00:00, 35055374.72it/s]
- dev UAS: 79.67
New best dev UAS! Saving model.

Epoch 2 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [09:00<00:00, 3.42it/s]
Average Train Loss: 0.16393936111378077
Evaluating on dev set
1445850it [00:00, 46227698.79it/s]
- dev UAS: 82.79
New best dev UAS! Saving model.

Epoch 3 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [09:05<00:00, 3.38it/s]
Average Train Loss: 0.14417635859897385
Evaluating on dev set
1445850it [00:00, 30844015.60it/s]
- dev UAS: 83.73
New best dev UAS! Saving model.

Epoch 4 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [08:45<00:00, 3.51it/s]
Average Train Loss: 0.13362910604967185
Evaluating on dev set
1445850it [00:00, 28795236.69it/s]
- dev UAS: 84.39
New best dev UAS! Saving model.

Epoch 5 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [09:00<00:00, 3.42it/s]
Average Train Loss: 0.12646352259434146
Evaluating on dev set
1445850it [00:00, 36683288.00it/s]
- dev UAS: 85.27
New best dev UAS! Saving model.

Epoch 6 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [08:46<00:00, 3.51it/s]
Average Train Loss: 0.12200444055825987
Evaluating on dev set
1445850it [00:00, 32126541.28it/s]
- dev UAS: 85.61
New best dev UAS! Saving model.

Epoch 7 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [08:46<00:00, 3.51it/s]
Average Train Loss: 0.11807776755729923
Evaluating on dev set
1445850it [00:00, 35243415.11it/s]
- dev UAS: 86.00
New best dev UAS! Saving model.

Epoch 8 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [08:50<00:00, 3.48it/s]
Average Train Loss: 0.11562635477841933
Evaluating on dev set
1445850it [00:00, 32848902.51it/s]
- dev UAS: 85.93

Epoch 9 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [08:41<00:00, 3.54it/s]
Average Train Loss: 0.1125802615318786
Evaluating on dev set
1445850it [00:00, 38039985.19it/s]
- dev UAS: 86.69
New best dev UAS! Saving model.

Epoch 10 out of 10
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1848/1848 [08:17<00:00, 3.72it/s]
Average Train Loss: 0.11088591745099077
Evaluating on dev set
1445850it [00:00, 38262233.51it/s]
- dev UAS: 86.71
New best dev UAS! Saving model.

================================================================================
TESTING
================================================================================
Restoring the best model weights found on the dev set
Final evaluation on test set
2919736it [00:00, 68380481.23it/s]
- test UAS: 86.79
Done!


dev UAS: 86.71
test UAS: 86.79

(f)

(i)

  • Error type: Prepositional Phrase Attachment Error
  • Incorrect dependency: concerns -> risks
  • Correct dependency: citing -> risks

(ii)

  • Error type: Modifier Attachment Error
  • Incorrect dependency: left -> early
  • Correct dependency: afternoon -> early

(iii)

  • Error type: Verb Phrase Attachment Error
  • Incorrect dependency: declined -> decision
  • Correct dependency: comment -> decision

(iv)

  • Error type: Coordination Attachment Error
  • Incorrect dependency: affects -> one
  • Correct dependency: plants -> one

(g)

They improve the model's ability to distinguish the parts of speech of words, and give it some capacity to learn basic phrase-level collocations.

