Vol. 10, Issue 1, (ISSN: 2278-7720)



CRITIQUE, a system to detect grammar errors and style weaknesses in English texts, was proposed by 
Ravin (1998)[6] and developed at the IBM Thomas J. Watson Research Center. Style checking was performed only after 
grammar checking. A parser containing about 200 phrase structure rules was used for syntactic analysis of the input 
sentences. The style component of CRITIQUE had more than 300 phrase structure rules to detect style weaknesses. 
A grammar checker for Korean was developed by Young-Soog (1998)[7]. Partial parsing was used to detect 
grammatical errors. Because Korean is a partially free word order language, the parser used a dependency grammar. 
A correction rule table was used to suggest corrections. To prevent excessive creation of candidate words for error 
replacement, the system used a high-frequency word dictionary derived from a corpus and part-of-speech patterns. 
The system was reported to have achieved an average precision of 99.05% and an average recall of 95.98%. 
A grammar corrector for Danish was developed by Paggio (2000)[8]. A full parser was used to detect errors. The 
grammar of the parser was an augmented context-free grammar consisting of rewrite rules whose symbols were associated 
with features. It used error rules to detect structural errors. Each error rule carried an error message and an error 
weight, so that when a rule detected an error in the text, the system could show a useful message to help the user 
correct it. When tested against a test corpus (grammatical errors mixed into randomly chosen text), the system reached 
58.1% error coverage compared to 53.5% for Microsoft Word. The reported precision was 20.6% for this system and 15.9% 
for Microsoft Word on the same test corpus.
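As an illustration only, the two measures reported above can be computed as in the following minimal Python sketch. It assumes that error coverage corresponds to recall; the function name and the counts are hypothetical and are not taken from the cited evaluation.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for an error detector.

    precision: share of flagged errors that are real errors.
    recall (assumed here to equal error coverage): share of real
    errors that were flagged.
    """
    flagged = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, chosen only to illustrate the arithmetic.
p, r = precision_recall(true_positives=50, false_positives=150, false_negatives=50)
print(f"precision={p:.1%} recall={r:.1%}")
```

A checker with many false alarms can therefore cover a large share of the real errors and still have low precision.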


www.ijcait.com International Journal of Computer Applications & Information Technology 
A commercial grammar checking system for French, German, and Spanish, integrated into Microsoft Word, was 
developed by Helfrich and Music (2000)[9] at Microsoft Corporation. This system excluded some obvious errors that are 
simple to detect but that most users do not care about. The authors presented the design process used to determine 
which features or errors the grammar checker needed to cover, and in their evaluation pointed out how important it is 
to keep false alarms close to zero. They suggested testing on highly edited documents to bring the false alarm count 
close to zero. Because this system was commercial software, the inner details of its working were not presented.
 
Another grammar checking system was developed by Vandeventer (2001)[10] as part of a project on a Computer Assisted 
Language Learning (CALL) system for French as a foreign language. It was built on FIPS, a syntactic parsing system for 
French based on government and binding theory. The grammar checker worked by relaxing three agreement constraints: 
gender, number, and person. The parser was based on a chart parsing algorithm and returned partial parses, in the form 
of chunks, for sentences that failed to go through complete analysis.
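The idea of constraint relaxation can be illustrated with a minimal Python sketch. It does not reproduce FIPS or the cited system: the feature dictionaries, function names, and the example phrase are assumptions made for illustration. The phrase is first checked with all three agreement constraints enforced; if that fails, the constraints that have to be relaxed for the phrase to pass are reported as the likely errors.

```python
AGREEMENT_FEATURES = ("gender", "number", "person")


def agrees(words, features):
    """True if all words share one value for every feature in `features`.

    A word that does not carry a feature (value missing or None) is
    treated as compatible with any value.
    """
    for feature in features:
        values = {w.get(feature) for w in words if w.get(feature) is not None}
        if len(values) > 1:
            return False
    return True


def check_with_relaxation(words):
    """Return the agreement features that must be relaxed for `words` to pass.

    An empty list means the phrase passes with all constraints enforced.
    """
    if agrees(words, AGREEMENT_FEATURES):
        return []
    # The features are independent, so those that fail on their own are
    # exactly the constraints that need to be relaxed.
    return [f for f in AGREEMENT_FEATURES if not agrees(words, (f,))]


# Hypothetical toy analysis of "la petit maison" (gender clash on the adjective).
phrase = [
    {"form": "la", "gender": "f", "number": "sg"},
    {"form": "petit", "gender": "m", "number": "sg"},
    {"form": "maison", "gender": "f", "number": "sg", "person": 3},
]
print(check_with_relaxation(phrase))
```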
Another grammar checking system, for Swedish, was presented by Carlberger et al. (2002, 2004)[11]. It combined 
probabilistic and rule-based methods to achieve high efficiency and robustness. Error rules were used to detect various 
grammatical errors and to give suggestions. A part-of-speech tagger based on Hidden Markov Models (HMM) was used in 
this system. The system had 200 scrutinizing rules and 50 help rules.
A grammar checker to detect agreement errors in noun phrases of German texts was developed by Fliedner (2002)[12]. 
Shallow parsing based on finite state automata, along with constraint relaxation, was used to detect the agreement 
errors. All the words in a noun phrase needed to agree in number, gender, and case. The precision and recall of this 
system were both around 67%. 
A grammar checker for second language learners of Swedish was discussed by Kann (2002)[13] and Bigert et al. (2004)[14]. 
This system was an extension of the Granska system developed by Carlberger et al. (2002)[16] and used a hybrid approach 
that combined three methods: manually constructed error rules, a method based on POS trigram frequencies from a tagged 
corpus, and machine learning on automatically constructed errors. Because these methods focus on fairly different sets 
of errors, combining them gave better results.
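The POS trigram component of such a hybrid can be sketched roughly as follows. This is a simplified Python illustration, not the cited system: the tag names, the toy corpus, and the `min_count` threshold are assumptions. Trigrams of tags are counted over a tagged corpus, and any trigram in the input that was seen too rarely is flagged as a possible error position.

```python
from collections import Counter


def trigrams(tags):
    """Return an iterator over consecutive tag triples in a list of POS tags."""
    return zip(tags, tags[1:], tags[2:])


def train(tagged_sentences):
    """Count POS trigrams over a corpus given as lists of tags."""
    counts = Counter()
    for tags in tagged_sentences:
        counts.update(trigrams(tags))
    return counts


def suspicious_positions(tags, counts, min_count=1):
    """Return start indices of trigrams seen fewer than `min_count` times."""
    return [i for i, tri in enumerate(trigrams(tags)) if counts[tri] < min_count]


# Hypothetical toy corpus of tag sequences.
corpus = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "ADJ", "NOUN", "VERB"],
    ["PRON", "VERB", "DET", "NOUN"],
]
counts = train(corpus)
print(suspicious_positions(["DET", "DET", "NOUN", "VERB"], counts))
```

With a realistic corpus the threshold would need tuning, since rare but correct tag sequences would otherwise produce false alarms.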
 
A grammar checking system for Urdu based on a two-pass parsing approach was presented by Kabir et al. (2002)[18]. In the 
first pass, the sentence was parsed using basic phrase structure grammar rules. If it failed to parse completely, 
movement rules were applied to convert the sentence into its base form, and it was then reparsed to check for errors. 
If the input sentence failed to parse in the first pass and no movement rules could be applied, the structure of the 
sentence was probably incorrect, and a structural error was flagged. The phrase structure grammar rules were designed 
only for base structure forms, or kernel sentences. Only simple declarative sentences in subject-object-verb (SOV) order 
were considered for grammar checking. The grammatical errors covered by the system were disagreement in number, gender, 
and case, both internal to noun phrases and between noun and verb phrases in a sentence. Structural errors covered 
included missing noun, missing verb phrase, and misplaced adjective phrase. For any detected error, the system provided 
corrections and showed the final corrected output to the user. 
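The control flow of the two-pass approach can be sketched in Python as below. This is a heavily simplified illustration under stated assumptions: sentences are reduced to sequences of constituent roles, the single movement rule is hypothetical, and the agreement checks that would follow a successful parse are omitted.

```python
BASE_ORDER = ("SUBJ", "OBJ", "VERB")  # kernel sentence in SOV order

# Hypothetical movement rules: observed constituent order -> base order.
MOVEMENT_RULES = {
    ("OBJ", "SUBJ", "VERB"): BASE_ORDER,
}


def parses_as_base(constituents):
    """Stand-in for the phrase structure parser: accept only the base form."""
    return tuple(constituents) == BASE_ORDER


def check_structure(constituents):
    """Return 'ok', 'ok-after-movement', or 'structural-error'."""
    if parses_as_base(constituents):  # first pass
        return "ok"
    moved = MOVEMENT_RULES.get(tuple(constituents))
    if moved is not None and parses_as_base(moved):  # second pass
        return "ok-after-movement"
    # Neither pass succeeded and no movement rule applied.
    return "structural-error"


print(check_structure(["OBJ", "SUBJ", "VERB"]))
print(check_structure(["SUBJ", "OBJ"]))
```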
A purely rule-based open source grammar and style checker for English was discussed by Naber (2003). QTAG, a 
probabilistic part-of-speech tagger freely available for non-commercial use and described by Tufis and Mason (1998), was 
used for part-of-speech tagging, along with a rule-based module that helped the tagger by eliminating some of the 
ambiguous tags before the text was sent to it. The rule-based module was added because its manually developed rules 
could be blocked or edited, and new rules could be added. A further reason was that incorrect results from probabilistic 
taggers were difficult to interpret, as they depended completely on the training corpus used. The POS tagset used by 
this system was the BNC C5 tagset. Rule-based phrase chunking was used, i.e. a set of rules defined which POS tag 
sequences constitute a phrase. The system then applied manually developed grammar checking rules to the POS-tagged and 
phrase-chunked text. These were pattern-matching rules, with patterns designed to match a sequence of words, POS tags, 
or chunk tags. If such a pattern was found in the input text, the input was flagged as erroneous. An error message was 
displayed explaining what was wrong, along with suggestions (where possible) to correct the error and example sentences 
showing an incorrect and a correct sentence for that error. There were 54 grammar rules, 81 false friend pairs, 5 style 
rules, and 4 built-in Python rules in this style and grammar checker. 
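A pattern-matching rule of this kind might look like the following minimal Python sketch. It is not the rule format of the cited checker: the `Rule` class, the matching functions, and the example rule are hypothetical, and chunk tags are left out for brevity. The POS tags follow the BNC C5 tagset (VM0 for modal auxiliaries, PRF for the preposition "of").

```python
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: list      # list of (kind, value); kind is "word" or "pos"
    message: str
    suggestion: str


def element_matches(element, token):
    """Match one pattern element against one (word, pos) token."""
    kind, value = element
    word, pos = token
    return word.lower() == value if kind == "word" else pos == value


def find_matches(rule, tokens):
    """Return start indices where the rule's pattern matches the token sequence."""
    n = len(rule.pattern)
    return [
        i
        for i in range(len(tokens) - n + 1)
        if all(element_matches(e, t) for e, t in zip(rule.pattern, tokens[i:i + n]))
    ]


# Hypothetical rule, not taken from the checker's rule set:
# a modal verb followed by the word "of".
COULD_OF = Rule(
    pattern=[("pos", "VM0"), ("word", "of")],
    message="A modal verb is followed by 'of'; 'have' was probably intended.",
    suggestion="have",
)

tokens = [("I", "PNP"), ("could", "VM0"), ("of", "PRF"), ("gone", "VVN")]
for start in find_matches(COULD_OF, tokens):
    print(start, COULD_OF.message)
```

Keeping the message and suggestion on the rule itself mirrors the description above, where each match produces an explanation and, where possible, a correction.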
