Vol. 10, Issue 1, (issn: 2278-7720)

CRITIQUE) to detect grammar errors and style weaknesses in English texts was proposed by
Ravin (1998)[6] and was developed at the IBM Thomas J. Watson Research Center. Style checking was performed only after
grammar checking. A parser containing about 200 phrase structure rules was used for syntactic analysis of the input
sentences. The style component of CRITIQUE had more than 300 phrase structure rules to detect style weaknesses.
A grammar checker for the Korean language was developed by Young-Soog (1998)[7]. Partial parsing was used to detect
grammatical errors. Because Korean is a partially free word order language, the parser used a dependency grammar.
A correction rule table was used to suggest corrections. To prevent excessive generation of candidate words for error
replacement, the system used a high-frequency word dictionary derived from a corpus and part-of-speech patterns.
The system was reported to have achieved an average precision of 99.05% and an average recall of 95.98%.
A grammar corrector for Danish was developed by Paggio (2000)[8]. A full parser was used to detect errors. The
parser's grammar was an augmented context-free grammar consisting of rewrite rules whose symbols were associated
with features. Error rules were used to detect structural errors, and each error rule carried an error message and an
error weight. Thus, when a particular error rule detected an error in the text, the system could show a useful message to
help the user correct it. When tested against a test corpus (grammatical errors mixed into randomly chosen text), the
system reached 58.1% error coverage compared to 53.5% for Microsoft Word; the reported precision was 20.6% for this
system and 15.9% for Microsoft Word on the same test corpus.
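
For reference, the precision and recall (error coverage) figures quoted in these evaluations follow the usual definitions. The sketch below is only an illustration of that arithmetic; the function name and the counts in the usage note are invented and are not taken from any of the cited evaluations.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall for an error detector.

    true_positives:  real errors that the checker flagged
    false_positives: correct text that the checker flagged (false alarms)
    false_negatives: real errors that the checker missed
    """
    flagged = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against division by zero when nothing was flagged or present.
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall
```

With hypothetical counts of 50 correctly flagged errors, 50 false alarms, and 25 missed errors, `precision_recall(50, 50, 25)` gives a precision of 0.5 and a recall of about 0.67. A low precision paired with a moderate recall, as in the Danish evaluation above, means that most flags were false alarms.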

www.ijcait.com International Journal of Computer Applications & Information Technology
A commercial grammar checking system for French, German, and Spanish, integrated into Microsoft Word, was
developed by Helfrich and Music (2000)[9] at Microsoft Corporation. The system deliberately left out some errors that are
simple to detect but that most users do not care about. The authors presented the design process used to determine which
features or errors the grammar checker needed to cover and, in the evaluation, pointed out how important it is to keep
false alarms close to zero. They suggested testing on highly edited documents to bring the false alarm count close
to zero. Because the system was commercial software, the inner details of its working were not presented.

Another grammar checking system, part of a project to develop a Computer Assisted Language Learning
(CALL) system for French as a foreign language, was developed by Vandeventer (2001)[10]. It was built on FIPS, a
syntactic parsing system for French based on government and binding theory. The grammar checker
worked by relaxing three agreement constraints: gender, number, and person. The parser was based on a chart
parsing algorithm and returned partial parses in the form of chunks for sentences that could not be analyzed
completely.
A grammar checking system for Swedish was presented by Carlberger et al. (2002, 2004)[11]. It combined
probabilistic and rule-based methods to achieve high efficiency and robustness. Error rules were
used to detect various grammatical errors and to give suggestions. A part-of-speech tagger based on Hidden Markov
Models (HMM) was used. The system had 200 scrutinizing rules and 50 help rules.
A grammar checker to detect agreement errors in noun phrases of German texts was developed by Fliedner (2002)[12].
Shallow parsing based on finite state automata, along with constraint relaxation, was used to detect agreement errors in
noun phrases. All the words in a noun phrase needed to agree in number, gender, and case. The precision and recall of this
system were both around 67%.
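
The idea of constraint relaxation for agreement checking can be sketched as follows. This is a minimal illustration and not Fliedner's or Vandeventer's implementation: the function names, the tuple representation of readings, and the small lexicon are assumptions made for the example, and the lexicon is deliberately simplified. Each word has a set of possible (number, gender, case) readings; the phrase agrees if one reading per word can be chosen so that all features match. When strict agreement fails, each constraint is relaxed in turn to find which feature is responsible.

```python
from itertools import product

FEATURES = ("number", "gender", "case")

def agree(words, ignore=()):
    """True if one reading per word can be chosen so that all words
    share the same value for every feature not listed in `ignore`."""
    if not words:
        return True
    kept = [i for i, name in enumerate(FEATURES) if name not in ignore]
    for readings in product(*words):
        first = readings[0]
        if all(r[i] == first[i] for r in readings for i in kept):
            return True
    return False

def check_noun_phrase(words):
    """Return [] if the phrase agrees; otherwise the features whose
    relaxation (taken one at a time) makes the phrase agree."""
    if agree(words):
        return []
    return [name for name in FEATURES if agree(words, ignore=(name,))]

# Hypothetical, simplified readings: (number, gender, case).
DAS = [("sg", "neut", "nom"), ("sg", "neut", "acc")]
DER = [("sg", "masc", "nom"), ("sg", "fem", "dat"), ("sg", "fem", "gen")]
HAUS = [("sg", "neut", "nom"), ("sg", "neut", "acc"), ("sg", "neut", "dat")]

print(check_noun_phrase([DAS, HAUS]))  # agreeing phrase
print(check_noun_phrase([DER, HAUS]))  # gender mismatch
```

Under this toy lexicon, "das Haus" passes the strict check, while "der Haus" only passes once the gender constraint is relaxed, so a gender agreement error can be reported.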
A grammar checker for second language learners of Swedish was discussed by Kann (2002)[13] and Bigert et al. (2004)[14].
This system was an extension of the Granska system developed by Carlberger et al. (2002)[16] and used a hybrid approach
that combined three methods: manually constructed error rules, detection based on POS trigram frequencies
from a tagged corpus, and machine learning on automatically constructed errors. Because these methods
detect fairly different sets of errors, combining them gave better results.
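
One way to use POS trigram frequencies for error detection is to flag tag sequences that are rare or unseen in a tagged corpus. The sketch below is an assumption about how such a detector might look and is not the method of the cited system; the function names, the padding symbols, the threshold, and the tiny corpus in the usage note are all hypothetical.

```python
from collections import Counter

def _trigrams(tags):
    """Yield POS trigrams of a sentence, padded with boundary markers."""
    padded = ["<s>", "<s>"] + list(tags) + ["</s>"]
    return zip(padded, padded[1:], padded[2:])

def trigram_counts(tagged_sentences):
    """Count POS trigrams over a corpus given as lists of tags."""
    counts = Counter()
    for tags in tagged_sentences:
        counts.update(_trigrams(tags))
    return counts

def rare_trigrams(tags, counts, threshold=1):
    """Return (position, trigram) pairs whose corpus count is below
    `threshold`; these mark suspicious spots in the input."""
    return [
        (i, tri)
        for i, tri in enumerate(_trigrams(tags))
        if counts[tri] < threshold
    ]
```

For example, after counting trigrams from the two tag sequences `["DET", "NOUN", "VERB"]` and `["DET", "ADJ", "NOUN", "VERB"]`, the input `["DET", "NOUN", "VERB"]` yields no rare trigrams, while `["DET", "VERB", "NOUN"]` is flagged at three positions. A real system would need a large corpus and some treatment of sparse data, since an unseen trigram is not necessarily an error.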

A grammar checking system for Urdu based on a two-pass parsing approach was presented by Kabir et al. (2002)[18]. In the
first pass, the sentence was parsed using basic phrase structure grammar rules. If it could not be parsed completely,
movement rules were applied to convert the sentence into its base form, and it was then reparsed to check for errors.
If the input sentence failed to parse in the first pass and no movement rule could be applied, the structure of the
sentence was probably incorrect, and a structural error was flagged. The phrase structure grammar rules were designed
only for base structure forms, or kernel sentences. Only simple declarative sentences in subject, object, verb (SOV) order
were considered for grammar checking. The grammatical errors covered by the system were disagreement in
number, gender, and case, both within noun phrases and between noun and verb phrases in a sentence. The structural
errors covered were missing noun, missing verb phrase, misplaced adjective phrase, etc. For any detected error, the system
provided corrections and showed the final corrected output to the user.
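
The control flow of the two-pass approach described above can be sketched in a few lines. This is a heavily simplified illustration under stated assumptions: a sentence is reduced to a sequence of phrase labels, "parsing" is a lookup in a set of base forms, and a movement rule is a fixed reordering. The labels, rule tables, and function name are hypothetical and do not come from the cited system.

```python
# Hypothetical base (kernel) forms in subject-object-verb order.
BASE_FORMS = {
    ("SUBJ", "OBJ", "VERB"),
    ("SUBJ", "VERB"),
}

# Hypothetical movement rules: map a non-base order to its base form.
MOVEMENT_RULES = {
    ("OBJ", "SUBJ", "VERB"): ("SUBJ", "OBJ", "VERB"),
}

def two_pass_check(phrases):
    """Classify a sequence of phrase labels using two parsing passes."""
    sequence = tuple(phrases)
    # Pass 1: try to parse with the base phrase structure rules.
    if sequence in BASE_FORMS:
        return "parsed"
    # Pass 2: apply a movement rule, then reparse the base form.
    moved = MOVEMENT_RULES.get(sequence)
    if moved is not None and moved in BASE_FORMS:
        return "parsed after movement"
    # No parse and no applicable movement rule: flag the structure.
    return "structural error"
```

Here `two_pass_check(["OBJ", "SUBJ", "VERB"])` succeeds only in the second pass, whereas `two_pass_check(["VERB", "SUBJ"])` is reported as a structural error because no movement rule applies.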
A purely rule-based open source grammar and style checker for English was discussed by Naber (2003). QTAG (a
probabilistic part-of-speech tagger freely available for non-commercial use, described by Tufis and Mason (1998)) was used
for part-of-speech tagging, along with a rule-based module that helped the tagger by eliminating some of the ambiguous
tags before the text was sent to it. The rule-based module was added because its manually developed rules could be
blocked or edited, and new rules could be added. The other reason was that incorrect results from a probabilistic
tagger were difficult to interpret, as they depended completely on the training corpus used. The POS tagset used by this
system was the BNC C5 tagset. Rule-based phrase chunking was used, i.e. a set of rules defined which POS tag
sequences constituted a phrase. Manually developed grammar checking rules were then applied to the POS-tagged and
phrase-chunked text. These were pattern matching rules, with patterns designed to match a sequence of
words, POS tags, or chunk tags. If such a pattern was found in the input text, the input was marked as erroneous. An error
message was displayed explaining what was wrong in the input, together with suggestions (if possible) to correct the error
and example sentences showing an incorrect and a correct sentence for the particular error. The checker had 54 grammar
rules, 81 false friend pairs, 5 style rules, and 4 built-in Python rules.
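
A pattern matching rule of the kind described above can be sketched as a sequence of elements, each matching either a word or a POS tag. The code below is an illustrative assumption and not Naber's rule format or code: the rule structure, the function names, and the example rule are hypothetical, and chunk tags are omitted for brevity. The tags AT0 (article) and NN2 (plural common noun) are from the BNC C5 tagset.

```python
def _element_matches(token, element):
    """token is (word, tag); element is ('word', value) or ('tag', value)."""
    kind, value = element
    word, tag = token
    if kind == "word":
        return word.lower() == value
    if kind == "tag":
        return tag == value
    return False

def find_matches(tokens, pattern):
    """Return the start index of every place where `pattern` matches
    a contiguous run of tagged tokens."""
    if not pattern:
        return []
    size = len(pattern)
    return [
        start
        for start in range(len(tokens) - size + 1)
        if all(
            _element_matches(tok, el)
            for tok, el in zip(tokens[start:start + size], pattern)
        )
    ]

# Hypothetical rule: the article "a" directly followed by a plural noun.
RULE = {
    "pattern": [("word", "a"), ("tag", "NN2")],
    "message": "The article 'a' is followed by a plural noun.",
}

tagged = [("This", "DT0"), ("is", "VBZ"), ("a", "AT0"), ("books", "NN2")]
for start in find_matches(tagged, RULE["pattern"]):
    print(start, RULE["message"])
```

Keeping rules as data rather than code is what allows them to be blocked, edited, or extended without touching the matcher, which is the property the source attributes to manually developed rules.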
