Write rules in Japanese

Supported sentence structures

The Japanese parser supports two kinds of sentences:

  1. Verbless sentences
    An example of a verbless sentence is 彼の行動は法律的に正しかった(His action was legal).
  2. Subject – Object – Verb (SOV) sentences
    An example of an SOV sentence is 当人は子供が5人以上いる(The person has five or more children).

Supported verb forms

Japanese verbs are inflected for politeness level, tense, aspect, voice and sense.

The verb dictionary provides the plain (colloquial) and the polite forms of the verbs.

There are only two tenses in Japanese, past and non-past. The non-past covers both the present and the future tense.

The verb aspect denotes the conjugations for perfect, progressive and potential forms. The perfect aspect is the stative form of the verb.

The verb voice refers to whether the verb is in the active or the passive mode.

The verb sense indicates whether the verb inflects for a positive or a negative statement. For each of the above, a verb is inflected by adding a suffix that depends on the verb group it belongs to.

The verbs do not inflect for gender or person.

The copulas だ (da), which is the infinitive form of です (desu), and である (dearu), which is the infinitive form of であります (dearimasu), are included in the verbs list.

For compound verbs where only the second verb is inflected, e.g. benkyo + suru, suru is taken to be the active verb. For such noun + suru verbs, you do not need to enter the compound verbs separately as long as suru is in the verbs list.

 

The following are the verb forms present in the verb dictionary:

Automatic verb conjugation works for the majority of ichidan and godan verbs. Conjugations for irregular verbs, and for verbs where the use of a kanji character makes it ambiguous whether the verb is ichidan or godan, must be entered manually. See Configure list of recognized verbs for more information.
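As a rough illustration of what regular conjugation by verb group involves, the following Python sketch derives a few forms from a dictionary form. It is not the product's conjugation engine: the function name conjugate, the GODAN_ENDINGS table and the form labels are hypothetical, the verb group is passed in explicitly because it cannot always be inferred from the spelling, and irregular verbs are deliberately left out.

```python
# Illustrative sketch only; names and form labels are assumptions.
# Final kana of a godan dictionary form -> (plain past ending, a-row kana, i-row kana).
GODAN_ENDINGS = {
    "う": ("った", "わ", "い"),
    "つ": ("った", "た", "ち"),
    "る": ("った", "ら", "り"),
    "む": ("んだ", "ま", "み"),
    "ぶ": ("んだ", "ば", "び"),
    "ぬ": ("んだ", "な", "に"),
    "く": ("いた", "か", "き"),
    "ぐ": ("いだ", "が", "ぎ"),
    "す": ("した", "さ", "し"),
}


def conjugate(dictionary_form, group):
    """Return a few regular forms for an ichidan or godan verb."""
    stem, last = dictionary_form[:-1], dictionary_form[-1]
    if group == "ichidan":
        if last != "る":
            raise ValueError("ichidan verbs end in る")
        # Ichidan verbs drop る and attach every ending to the same stem.
        past, negative_base, polite_base = stem + "た", stem, stem
    elif group == "godan":
        if last not in GODAN_ENDINGS:
            raise ValueError("unexpected godan ending: " + last)
        # Godan verbs change the final kana depending on the ending attached.
        past_ending, a_row, i_row = GODAN_ENDINGS[last]
        past = stem + past_ending
        negative_base, polite_base = stem + a_row, stem + i_row
    else:
        raise ValueError("group must be 'ichidan' or 'godan'")
    return {
        "non-past": dictionary_form,
        "past": past,
        "negative": negative_base + "ない",
        "negative past": negative_base + "なかった",
        "polite non-past": polite_base + "ます",
        "polite past": polite_base + "ました",
    }


print(conjugate("買う", "godan")["negative past"])    # 買わなかった
print(conjugate("払う", "godan")["polite non-past"])  # 払います
print(conjugate("食べる", "ichidan")["past"])         # 食べた
```

A verb such as 行く, whose plain past is 行った rather than the form this table would produce, shows why irregular cases need manual entries.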

Verb recognition

The active verb in a sentence is recognized based on the dictionary. When a compound verb is present, the active verb is selected based on the longest match.

For example, suppose the verbs nakatta (なかった) and kawa nakatta (買わなかった) are both present in the dictionary. If a sentence has kawa nakatta as its active verb (i.e. the verb at the end), the parser will recognize the compound verb kawa nakatta instead of just nakatta.

If a sentence uses a compound verb that has not itself been entered in the dictionary, the parser will still use the longest match it can find. For example, if nakatta is in the dictionary and the verb shitagawa nakatta is not, then the parses generated for a sentence containing shitagawa nakatta will be based on the conjugations of the verb nakatta. To avoid this problem, add the missing verb to the dictionary.
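The longest-match behavior described above can be sketched in a few lines of Python. This is only a model of the idea, not the parser's implementation: the function name recognize_active_verb is hypothetical, the dictionary is reduced to a plain set of verb forms, and the sentence is assumed to end with the active verb followed by an optional 。.

```python
def recognize_active_verb(sentence, verb_forms):
    """Return the longest dictionary form that ends the sentence, or None."""
    text = sentence.rstrip("。")
    matches = [form for form in verb_forms if text.endswith(form)]
    return max(matches, key=len) if matches else None


# Hypothetical dictionary containing only the two forms from the example.
verb_forms = {"なかった", "買わなかった"}

# The compound form is in the dictionary, so it wins over the shorter なかった.
print(recognize_active_verb("当人は家を買わなかった。", verb_forms))

# 従わなかった (shitagawa nakatta) is missing, so the match falls back to なかった.
print(recognize_active_verb("当人は規則に従わなかった。", verb_forms))
```

The second call shows the fallback: until the missing verb is added, the sentence is analyzed as if its verb were nakatta.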

Adjectives

In an SOV sentence, the verb at the end is taken to be the active verb. If adjectives are present within the sentence, they are not inflected.

In a verbless sentence, the adjectives may be inflected. There are two forms of Japanese adjectives, the -na adjectives and the -i adjectives.

For both adjective forms in a verbless sentence, and also for an SOV sentence, the copula is omitted when the uncertain form is constructed.
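To make the difference between the two adjective forms concrete, here is a minimal Python sketch of their regular plain inflections. It is an illustration under stated assumptions, not a description of the parser: the function name inflect_adjective and the form labels are hypothetical, the adjective type is passed in explicitly, and irregular adjectives such as いい are not handled.

```python
def inflect_adjective(adjective, kind):
    """Return regular plain inflections for an -i or a -na adjective."""
    if kind == "i":
        if not adjective.endswith("い"):
            raise ValueError("-i adjectives end in い")
        # -i adjectives inflect themselves: the final い is replaced.
        stem = adjective[:-1]
        return {
            "non-past": adjective,
            "past": stem + "かった",
            "negative": stem + "くない",
            "negative past": stem + "くなかった",
        }
    if kind == "na":
        # -na adjectives stay unchanged; the copula carries the inflection.
        return {
            "non-past": adjective + "だ",
            "past": adjective + "だった",
            "negative": adjective + "ではない",
            "negative past": adjective + "ではなかった",
        }
    raise ValueError("kind must be 'i' or 'na'")


print(inflect_adjective("正しい", "i")["past"])         # 正しかった
print(inflect_adjective("静か", "na")["negative"])      # 静かではない
```

The first call reproduces the adjective ending used in the verbless example sentence, 彼の行動は法律的に正しかった.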

Limitations

The following verb inflections are not currently handled:

  1. Presumptive mood – expresses probability, belief or intention (~daro/~desho forms)
  2. Imperative mood – expresses commands
  3. Causative mood – conveys the idea of making or causing someone to do something
  4. Conditional mood – conveys 'if', 'unless', 'when' meaning (~eba/~tara/~nara/~to forms)
  5. Clauses – convey sequential, parallel or causal relationships (such as the ~te and ~de forms)
  6. Necessity – expresses 'must' or 'necessity' using the to-ikenai form (といけない)
  7. Counter words

The first three forms are unlikely to occur in the Oracle Policy Automation rulebase framework. For the fourth and fifth forms, Oracle Policy Automation already has a framework for expressing conditionals and clausal relationships when developing a rulebase, so these verb inflections are redundant. For the sixth form, expressing 'must', the sentence should be rephrased, for example using the verb 'obligated'. As for the seventh item, the parser supports only a limited number of counter words, such as those for age and number of people.

For example, look at the following sentences.

Example 1 - Conditional mood

The person is eligible if the person pays tax.

当人は税金を払ったら、適格である。

 

In Oracle Policy Modeling this should be written as two separate sentences where the first one is formatted as the conclusion and the second one as the level 1 condition.

 

The person is eligible.

当人は適格である。

The person pays tax.

当人は税金を払います。

 

Example 2 - Clauses

The person is retired and the person’s age is 65 or over.

当人は退職していて、(年齢が)65歳以上である。

 

The above sentence should be broken down into two separate sentences.

 

The person is retired and

当人は退職している。および

The person’s age is 65 or over

当人は(年齢が)65歳以上である。

 

Here the sentences represent two conditions that must both hold. This is expressed by the 'and' connective rather than by inflecting the verb to the -te form. Thus, if a sentence uses verb forms that are not covered by the verb editor, try to rewrite it as separate attributes, especially when the sentence is clausal in nature.
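The decomposition used in Examples 1 and 2 amounts to treating a rule as one conclusion sentence plus a list of simple condition sentences joined by a connective. The Python sketch below models that shape only; it is not the Oracle Policy Modeling document format. The Rule class, its fields, and the indentation used to suggest a level 1 condition are all hypothetical, and pairing the conclusion from Example 1 with the conditions from Example 2 is done purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """A conclusion with simple conditions joined by a connective (illustrative)."""
    conclusion: str
    conditions: list = field(default_factory=list)
    connective: str = "および"  # 'and'

    def render(self):
        lines = [self.conclusion]
        for index, condition in enumerate(self.conditions):
            is_last = index == len(self.conditions) - 1
            # Every condition except the last is followed by the connective.
            suffix = "" if is_last else self.connective
            lines.append("    " + condition + suffix)
        return "\n".join(lines)


rule = Rule(
    conclusion="当人は適格である。",
    conditions=["当人は退職している。", "当人は(年齢が)65歳以上である。"],
)
print(rule.render())
```

Each condition is a complete sentence in a supported form, so no -te or conditional inflection is needed to link them.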

Example 3 - Necessity

The parser provides the nakereba naranai (なければならない) form for expressing the notion of 'must' or necessity. This form conjugates only for past and present tense; no conjugations are required for politeness level. If this form does not suit the sentence being expressed in Oracle Policy Modeling, then the sentence can be restructured as follows.

One option is to rephrase the sentence to use a noun + copula form. Another is simply to reword the sentence. For example,

'A person must have a pension card' changes to

'A person owns a pension card'.