Date: 2024-03-03



CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then next to NLTK
(https://www.nltk.org/), you will also need to install the conllu package
(https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from
https://universaldependencies.org/.
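Since only the part-of-speech tags of the treebanks are used here, the data of interest is the word form and universal POS tag of each token. As a rough sketch, assuming the treebank files follow the standard CoNLL-U layout (ten tab-separated columns, with the word form in column 2 and the universal POS tag in column 4), the tagged sentences could be extracted as below. The function name read_tagged_sentences and the sample text are hypothetical; this is a stand-in for illustration, not the provided gettingstarted.py, which presumably relies on the conllu package.

```python
def read_tagged_sentences(text):
    """Extract (word, tag) pairs per sentence from CoNLL-U formatted text."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        columns = line.split("\t")
        if not columns[0].isdigit():
            continue  # skip multiword ranges such as 1-2 and empty nodes such as 1.1
        current.append((columns[1], columns[3]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical one-sentence sample, not taken from any treebank.
SAMPLE = "\n".join([
    "# text = The dog barks.",
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
    "",
])

print(read_tagged_sentences(SAMPLE))
# [[('The', 'DET'), ('dog', 'NOUN'), ('barks', 'VERB'), ('.', 'PUNCT')]]
```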
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from
a training corpus from Universal Dependencies. Do not forget to involve the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
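The counting and estimation steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of smoothing.py: the names count_events and witten_bell, the bins values, and the toy corpus are all hypothetical, and the Witten-Bell estimate is written out by hand in plain Python so that the example runs without NLTK. In a submission, NLTK's WittenBellProbDist would take the place of the hand-written function.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"


def count_events(tagged_sents):
    """Count tag-to-tag transitions and tag-to-word emissions.

    tagged_sents: list of sentences, each a list of (word, tag) pairs.
    """
    transitions = defaultdict(Counter)  # transitions[prev_tag][tag]
    emissions = defaultdict(Counter)    # emissions[tag][word]
    for sent in tagged_sents:
        tags = [START] + [tag for _, tag in sent] + [END]
        for prev_tag, tag in zip(tags, tags[1:]):
            transitions[prev_tag][tag] += 1
        for word, tag in sent:
            emissions[tag][word] += 1
    return transitions, emissions


def witten_bell(counts, bins):
    """Return a function sample -> probability (hand-written Witten-Bell sketch).

    counts: Counter of observed samples.
    bins: assumed number of possible outcomes; must exceed len(counts).
    """
    n = sum(counts.values())  # total number of observations
    t = len(counts)           # number of distinct observed samples
    z = bins - t              # number of unseen samples
    # Mass t / (n + t) is reserved for unseen samples and shared equally.
    unseen = 1.0 / z if n == 0 else t / (z * (n + t))

    def prob(sample):
        c = counts.get(sample, 0)
        return c / (n + t) if c > 0 else unseen

    return prob


# Toy corpus for illustration only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
transitions, emissions = count_events(corpus)
emit_noun = witten_bell(emissions["NOUN"], bins=1000)  # P(word | NOUN)
trans_det = witten_bell(transitions["DET"], bins=5)    # P(tag | DET)

print(emit_noun("dog"))    # 0.25
print(trans_det("VERB"))   # unseen after DET, but still non-zero
```

Note that the end-of-sentence marker appears only as a transition target and the start-of-sentence marker only as a transition source; neither emits a word.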
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, . . . , n, in this order:
\[
\hat{t}_i = \operatorname*{argmax}_{t_i} \; P(t_i \mid \hat{t}_{i-1}) \cdot P(w_i \mid t_i)
\]
assuming tˆ0 is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker
⟨/s⟩ is not even used here.
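A minimal sketch of this left-to-right procedure is given below. The function name eager_tag, the dictionary-based toy model, and its hand-picked probabilities are assumptions made for illustration; in the practical, trans_prob and emit_prob would be backed by the smoothed estimates from the training corpus.

```python
def eager_tag(words, tags, trans_prob, emit_prob, start="<s>"):
    """Choose each tag greedily from the previously chosen tag and the current word."""
    previous, result = start, []
    for word in words:
        best = max(tags, key=lambda t: trans_prob(previous, t) * emit_prob(t, word))
        result.append(best)
        previous = best  # the choice is final and never revisited
    return result


# Toy, hand-picked probabilities (not estimated from any treebank).
TRANS = {
    ("<s>", "NOUN"): 0.6, ("<s>", "VERB"): 0.4,
    ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.5, ("NOUN", "</s>"): 0.2,
    ("VERB", "NOUN"): 0.5, ("VERB", "VERB"): 0.1, ("VERB", "</s>"): 0.4,
}
EMIT = {
    ("NOUN", "fish"): 0.6, ("NOUN", "sleep"): 0.4,
    ("VERB", "fish"): 0.3, ("VERB", "sleep"): 0.7,
}


def trans_prob(prev_tag, tag):
    return TRANS.get((prev_tag, tag), 0.0)


def emit_prob(tag, word):
    return EMIT.get((tag, word), 0.0)


print(eager_tag(["fish", "sleep"], ["NOUN", "VERB"], trans_prob, emit_prob))
# ['NOUN', 'VERB']
```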
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
\[
\hat{t}_1 \cdots \hat{t}_n = \operatorname*{argmax}_{t_1 \cdots t_n} \left( \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i) \right) \cdot P(t_{n+1} \mid t_n)
\]
where the tokens of the input sentence are w1 · · ·wn, and t0 = ⟨s⟩ and tn+1 = ⟨/s⟩ are
the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
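One way to organise the computation with log probabilities is sketched below: products become sums of logs, and backpointers record which previous tag gave the best score. The function name viterbi, the helper functions, and the toy probabilities are hypothetical; zero probabilities are mapped to minus infinity here only because the toy tables are unsmoothed.

```python
import math


def viterbi(words, tags, log_trans, log_emit, start="<s>", end="</s>"):
    """Return the most probable tag sequence, computed in log space."""
    if not words:
        return []
    # best[i][t]: log probability of the best tag sequence for words[:i+1] ending in t
    best = [{t: log_trans(start, t) + log_emit(t, words[0]) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[i - 1][p] + log_trans(p, t))
            scores[t] = best[i - 1][prev] + log_trans(prev, t) + log_emit(t, words[i])
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # The final choice also accounts for the transition to the end-of-sentence marker.
    last = max(tags, key=lambda t: best[-1][t] + log_trans(t, end))
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy, hand-picked probabilities (not estimated from any treebank).
TRANS = {
    ("<s>", "NOUN"): 0.6, ("<s>", "VERB"): 0.4,
    ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.5, ("NOUN", "</s>"): 0.2,
    ("VERB", "NOUN"): 0.5, ("VERB", "VERB"): 0.1, ("VERB", "</s>"): 0.4,
}
EMIT = {
    ("NOUN", "fish"): 0.6, ("NOUN", "sleep"): 0.4,
    ("VERB", "fish"): 0.3, ("VERB", "sleep"): 0.7,
}


def safe_log(p):
    return math.log(p) if p > 0 else -math.inf


def log_trans(prev_tag, tag):
    return safe_log(TRANS.get((prev_tag, tag), 0.0))


def log_emit(tag, word):
    return safe_log(EMIT.get((tag, word), 0.0))


print(viterbi(["fish", "sleep"], ["NOUN", "VERB"], log_trans, log_emit))
# ['NOUN', 'VERB']
```

For this toy sentence the four candidate sequences have probabilities 0.00864, 0.0504, 0.0048 and 0.00336, so the sequence NOUN VERB is selected.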
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
individually. That is, for each i, we compute:
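\[
\hat{t}_i = \operatorname*{argmax}_{t_i} \sum_{t_1 \cdots t_{i-1} t_{i+1} \cdots t_n} \left( \prod_{k=1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n)
\]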
To compute this efficiently, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
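\[
\hat{t}_i = \operatorname*{argmax}_{t_i} \left( \sum_{t_1 \cdots t_{i-1}} \prod_{k=1}^{i} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot \left( \sum_{t_{i+1} \cdots t_n} \left( \prod_{k=i+1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n) \right)
\]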
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
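The forward and backward computations, with sums carried out in log space, might look like the sketch below. It is not the contents of logsumexptrick.py: the names logsumexp and posterior_tags, the helper functions, and the toy probabilities are assumptions for illustration. The helper factors out the largest log value before exponentiating, so that the largest term becomes exp(0) = 1 and the sum cannot underflow to zero.

```python
import math


def logsumexp(values):
    """Compute log(sum(exp(v) for v in values)) without underflow."""
    values = list(values)
    largest = max(values)
    if largest == -math.inf:
        return -math.inf  # every term has probability zero
    return largest + math.log(sum(math.exp(v - largest) for v in values))


def posterior_tags(words, tags, log_trans, log_emit, start="<s>", end="</s>"):
    """Choose, for each position, the tag with the highest forward-backward score."""
    n = len(words)
    if n == 0:
        return []
    # forward[i][t]: log of the summed probability of all tag prefixes ending in t at i
    forward = [{t: log_trans(start, t) + log_emit(t, words[0]) for t in tags}]
    for i in range(1, n):
        forward.append({
            t: logsumexp(forward[i - 1][p] + log_trans(p, t) for p in tags)
            + log_emit(t, words[i])
            for t in tags
        })
    # backward[i][t]: log of the summed probability of everything after i, given t at i
    backward = [None] * n
    backward[n - 1] = {t: log_trans(t, end) for t in tags}
    for i in range(n - 2, -1, -1):
        backward[i] = {
            t: logsumexp(
                log_trans(t, nxt) + log_emit(nxt, words[i + 1]) + backward[i + 1][nxt]
                for nxt in tags
            )
            for t in tags
        }
    return [max(tags, key=lambda t: forward[i][t] + backward[i][t]) for i in range(n)]


# Toy, hand-picked probabilities (not estimated from any treebank).
TRANS = {
    ("<s>", "NOUN"): 0.6, ("<s>", "VERB"): 0.4,
    ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.5, ("NOUN", "</s>"): 0.2,
    ("VERB", "NOUN"): 0.5, ("VERB", "VERB"): 0.1, ("VERB", "</s>"): 0.4,
}
EMIT = {
    ("NOUN", "fish"): 0.6, ("NOUN", "sleep"): 0.4,
    ("VERB", "fish"): 0.3, ("VERB", "sleep"): 0.7,
}


def safe_log(p):
    return math.log(p) if p > 0 else -math.inf


def log_trans(prev_tag, tag):
    return safe_log(TRANS.get((prev_tag, tag), 0.0))


def log_emit(tag, word):
    return safe_log(EMIT.get((tag, word), 0.0))


print(posterior_tags(["fish", "sleep"], ["NOUN", "VERB"], log_trans, log_emit))
# ['NOUN', 'VERB']
```

Compared with a Viterbi implementation, the only structural change in the forward pass is that the maximisation over the previous tag is replaced by a log-space sum.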
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
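The accuracy measure described above can be computed per token, as in the following sketch. The function name tagging_accuracy and the example tag sequences are hypothetical; the gold sequences would come from the test corpus and the predicted ones from each of the three algorithms.

```python
def tagging_accuracy(gold_sents, predicted_sents):
    """Return the percentage of tokens whose predicted tag equals the gold tag."""
    correct = total = 0
    for gold, predicted in zip(gold_sents, predicted_sents):
        if len(gold) != len(predicted):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold, predicted))
        total += len(gold)
    return 100.0 * correct / total if total else 0.0


# Hypothetical tag sequences for two sentences.
gold = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
predicted = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]

print(tagging_accuracy(gold, predicted))  # 80.0
```

Counting over tokens rather than averaging per sentence keeps long and short sentences from being weighted differently.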
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and bandwidth
for the marker.
Marking is in line with the General Mark Descriptors (see pointers below).
• Evidence of an acceptable attempt (up to 7 marks) could be code that is not
functional but nonetheless demonstrates some understanding of POS tagging.
• Evidence of a reasonable attempt (up to 10 marks) could be code that implements
Algorithm 1.
• Evidence of a competent attempt addressing most requirements (up to 13 marks)
could be fully correct code in good style, implementing Algorithms 1 and 2 and a
brief report.
• Evidence of a good attempt meeting nearly all requirements (up to 16 marks)
could be a good implementation of Algorithms 1 and 2, plus an informative report
discussing meaningful experiments.
• Evidence of an excellent attempt with no significant defects (up to 18 marks)
requires an excellent implementation of all three algorithms, and a report that
discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures.
• An exceptional achievement (up to 20 marks) in addition requires exceptional
understanding of the subject matter, evidenced by experiments, their analysis and
reflection in the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. In particular, this practical should be about
understanding language and how language works from the perspective of the HMM.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, and mistake this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.